Long-term datasets have become a focus of climate research because they are essential for studying variability and extremes in weather and climate. The importance of climate data rescue and efficient climate data management is widely accepted. Computer technologies are essential tools for processing climatological data and storing the large volume of meteorological measurements (Ananta et al., 2013). The Climate Department of the National Center of Meteorology and Seismology in the UAE aims to gather all meteorological data recorded by its wide network of weather stations, such as air temperature, air pressure, relative humidity, and solar radiation, and to store them in a unified climatological database structure (Brandao, 2015). This report is prepared for a project that uses data mining techniques to forecast weather conditions, collects statistical data on climate conditions, and prepares a yearly report for a specified location. To store and manage climatological data on a database server, the project implements CLASYS (Climate Archive System).
CLASYS (Climate Archive System) is a desktop-based system that supports the storage and management of all meteorological data through a simple, user-friendly interface. The data are stored and managed in one unified structure based on Microsoft SQL Server, on which CLASYS performs data retrieval and analysis. Data that are already in digital (electronic) form can be ingested directly by the system (data import). Non-digital records from the original observation books are digitized during an entry process designed to minimize errors, since accurate climate data are the first prerequisite for climate-related decision making. The system is being designed to validate data and remove erroneous values as they are entered or imported. CLASYS automatically detects likely errors, including manual entry errors, by applying a set of quality control procedures (Chung, Theng & Seldon, 2013). Data in digital form can be exported from the system for further uses such as report preparation and weather forecasting. The literature review describes existing climate data storage and management systems: CLISYS (Meteo France - MFI), CLIWARE (Russian Federation), CLIMSOFT (Zimbabwe-Guinea-Kenya-Metoffice), JCDMS (Jordan), CLDB (Slovakia - MicroStep-MIS), and CLICOM (FREE WMO). These systems are compared to show their limitations and challenges, so that the importance of CLASYS can be established.
The project is developed using data mining techniques, autonomous error checking of meteorological data, and a decision tree algorithm. The Scrum project lifecycle is used for development, and a project schedule is created to plan the development activities. For storing weather condition data, the algorithm is designed to combine data mining with a decision tree. CLASYS manages climatological data through manual and automatic checks that resolve erroneous data, and it imports external data in many formats.
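The core step of a decision tree algorithm can be illustrated with a single split. The sketch below is not the project's algorithm. It only shows, under stated assumptions, how one threshold on one feature can be chosen by minimising weighted Gini impurity. The humidity values, the "dry" and "rain" labels, and the function names are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_split(values, labels):
    """Find the threshold on one feature that minimises weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Invented sample: relative humidity (%) and an observed weather label.
humidity = [30, 40, 50, 80, 90, 95]
weather = ["dry", "dry", "dry", "rain", "rain", "rain"]

threshold, score = best_split(humidity, weather)
print(threshold, score)
```

A full decision tree would apply the same search recursively to each side of the split and across several features. The single split is shown here only because it is the part that the rest of the algorithm repeats.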
This section reviews a selection of existing climate data management systems. The systems are compared in order to study the limitations and gaps that exist in the climate data handling process (Cinquini et al., 2014). The identified systems are CLISYS (Meteo France - MFI), CLIWARE (Russian Federation), CLIMSOFT (Zimbabwe-Guinea-Kenya-Metoffice), JCDMS (Jordan), CLDB (Slovakia - MicroStep-MIS), and CLICOM (FREE WMO). The proposed system is expected to overcome the limitations identified in this review, which establishes the importance of CLASYS in eliminating or mitigating them.
CLISYS is effective for understanding and tracking both historical and real-time climate data. It is widely used for anticipating weather and for recommendations concerning climate change. CLISYS can carry out analyses for weather prediction so that the residents of an affected region can be alerted. Moreover, CLISYS serves as a decision-making tool for residents and users (Desai, 2016; Elliff et al., 2017). It can securely store historical data and real-time climate information, stores precise and accurate data in its databases, applies quality control to collected data, and can generate substantial statistical reports.
CLISYS has a web-based architecture with a user-friendly interface that offers instant access to data management and system administration. Its unified storage structure helps to ensure centralized databases and unique information storage (CLISYS - CDMS., 2017). CLISYS operates with a reliable monitoring system and is capable of storing past, present, and real-time climate data. WMO evaluated CLISYS and declared it fully compatible with WMO practices and recommendations. However, Feris, Zwikael and Gregor (2017) noted that CLISYS cannot accept several data formats when taking raw data from users. This is identified as a limitation for climate data analysis.
CLIWARE was developed mainly for hydro-meteorological data management at different processing levels. It has a modular design, so that different climate characteristics can be assessed and the resulting primary and derived information can be forwarded to the client (CliWare System., 2017). CLIWARE can generate hydro-meteorological data, various climatic characteristics, and a database of hydro-meteorological data and metadata, and it sends relevant information to the customer. CLIWARE can process different kinds of operational data, such as synoptic data, ship synoptic data, upper-air data, climate data, and oceanographic data (BATHY, TESAC, and BUOY reports) (Hobday et al., 2016). The system generates climate characteristics and allows users to download data from external sources. The Russian Federation proposed the tool with a web-based architecture comprising a database server, an application server based on the J2EE standard, a dynamic web server, and client software.
However, Horta, Georgieff and Aschero (2015) identified the limitation that CLIWARE cannot detect errors in data. It relies completely on user input through the user interface, from which it automatically generates reports, charts, and graphs. In addition, CLIWARE's operation depends on archived data, that is, past climate datasets held on the database server. If the stored database contains redundant or erroneous climate data, CLIWARE cannot remove them (Hossny et al., 2013). These limitations therefore have to be managed by end users so that CLIWARE can operate efficiently and effectively.
CLIMSOFT is a software suite that stores climate data in a flexible manner so that users can easily extract useful information. It was developed for organizations that wish to generate reports and analyses of the climate in certain regions. CLIMSOFT helps to store historical climate data in computerized form (Climsoft Home., 2017). It follows the “WMO Climate Data Management System Specifications” and works properly with climate data. It can apply the e-SIAC statistical approach for detailed analysis of weather conditions at the request of end users. The suite is popular in Zimbabwe, Guinea, and Kenya for climate data storage and extraction.
However, Imoto, Carneiro and Avila-da-Silva (2016) claimed that CLIMSOFT cannot keep collected climate data secure and that data loss can sometimes occur. CLIMSOFT also cannot accept data in different formats from end users, and it cannot remove flaws from collected climate and weather information.
JCDMS can perform several types of work, such as key entry of historical paper records, entry of current climate observation data, import of different data formats, data validation, and quality control. JCDMS can manage and reorganize datasets and provides archiving, export, and analysis features (JORDAN METEOROLOGICAL DEPARTMENT AMMAN-JORDAN., 2017). Furthermore, JCDMS can put climate data into a usable format. It runs on Windows NT/2000 Server, WIN95/98/NT/2000 workstations, and a local area network (LAN), with Oracle 8 or higher (RDBMS) and Oracle Tools (Developer 2000). JCDMS requires personnel with information technology knowledge and experience in ORACLE RDBMS, ORACLE tools, and climate data management (Kaur & Sengupta, 2013). JCDMS offers a user-friendly GUI, multi-user access, high storage capacity, validation and quality control, and a direct connection to GIS.
The main limitation of JCDMS is that it can validate climate data but cannot remove errors from the collected data. JCDMS is therefore developing an error removal process for better and more convenient usage.
CLDB (Climatological Database) is a popular database system from MicroStep-MIS in Slovakia. Its user-friendly structure is built on an SQL database server (MicroStep-MIS - Climatological and integrated Environmental Database (IMS CLDB and EnviDB)., 2017). The data model uses standard SQL for the storage of climate data. Data storage quality is guaranteed by the industry-proven Oracle Database Server, which is described as a leader in database technology. CLDB is based on WMO-recommended practices for climatological data processing (Hobday et al., 2016). It follows the WMO recommendation of an RDBMS (Relational Database Management System) application with wide use in climatology (World Climate Program efforts concerning new Climate Data Management Systems - CDMSs). The great advantage of CLDB is its modular architecture, which gives the end user the possibility of detailed customization. The end user can specify additional nonstandard input and output modules. Modules can be easily implemented and added to any current or future installation (Sampaio Franco et al., 2014). One of the most interesting extensions offered is an upgrade to the environmental database (radiation and air pollution monitoring).
CLICOM is a tool for storing climate data that supports different data inputs and other additional features. It stores different types of data, such as observed data, meteorological phenomena, inventories of missing data, and rain gauge measurements (CLICOM., 2017). CLICOM accepts definitions of observed elements such as temperature, pressure, or wind direction. It accepts manually typed data, so that users can easily enter information for analysis, and it enables automatic import from text files. However, the main limitation identified in CLICOM is that it cannot accept several data formats for analysis.
CLASYS Development Importance and Approach
The cost of developing the project must be estimated before development starts. The cost can be calculated after a successful analysis of the project requirements and specification (Mansor et al., 2016). The prediction algorithm can be applied periodically to the data collected from the CLDB, and the results stored in the system database. The system should display the predicted values and highlight on the map the areas for which the prediction is made. The interface should use interactive animation and effects so that the user can obtain information about the present and future climate conditions of the selected location (Lagerberg et al., 2013). A language preference can also be added through a language translation tool, so that users can change the interface language. The information system should produce graphs and maps of the climate conditions of the selected region. The actors and user goals have also been analysed: the main actors are the users (registered and unregistered), the database, the graph plotter, the historical data provider, and the system administrator (Feris, Zwikael & Gregor, 2017). The administrator also has the authority to clear data fetched from the CLDB database, can log out other users, and can track the usage and search patterns of users.
The project should be prepared after detailed analysis and research. First, a high-level architecture design is created. Data mining techniques are then applied to the collected historical data, and the components needed for development are gathered. The main components of the project are as follows:
- Collection of data: The historical data for the specific region are fetched and stored in the local database.
- Cleaning of the data: The fetched data are cleaned, and records with missing components are removed (Sokmen & Çebi, 2017). Duplicate entries in the database are found and deleted to increase the efficiency of the system.
- Selection of the data: A requirement analysis is carried out to determine the data needed to run the query properly, and the relevant data are retrieved from the system.
- Transformation of data: The gathered data are transformed at this stage into a form that the data mining step accepts.
- Data mining: Algorithms are selected and used to analyse the meteorological datasets and to reveal interesting patterns for study (Kaur & Sengupta, 2013).
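The cleaning, selection, and transformation components listed above can be sketched as three small functions. This is a minimal illustration and not the CLASYS implementation. The `Observation` record, its field names, and the station identifiers are hypothetical, and a duplicate is assumed to mean a repeated station and timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One hypothetical station reading; field names are illustrative only."""
    station_id: str
    timestamp: str  # ISO 8601 text, e.g. "2016-01-01T00:00"
    air_temperature_c: Optional[float]
    relative_humidity_pct: Optional[float]


def clean(rows):
    """Drop rows with missing components, then drop duplicate entries."""
    seen = set()
    cleaned = []
    for row in rows:
        if row.air_temperature_c is None or row.relative_humidity_pct is None:
            continue  # missing component: remove the record
        key = (row.station_id, row.timestamp)
        if key in seen:
            continue  # duplicate entry: keep only the first occurrence
        seen.add(key)
        cleaned.append(row)
    return cleaned


def select(rows, station_id):
    """Keep only the records relevant to the query (here: one station)."""
    return [r for r in rows if r.station_id == station_id]


def transform(rows):
    """Turn records into numeric feature tuples that a mining step can use."""
    return [(r.air_temperature_c, r.relative_humidity_pct) for r in rows]


# Invented sample data standing in for the collected historical records.
raw = [
    Observation("AUH01", "2016-01-01T00:00", 21.5, 60.0),
    Observation("AUH01", "2016-01-01T00:00", 21.5, 60.0),  # duplicate
    Observation("AUH01", "2016-01-01T01:00", None, 58.0),  # missing value
    Observation("DXB02", "2016-01-01T00:00", 23.0, 55.0),
]

features = transform(select(clean(raw), "AUH01"))
print(features)
```

Keeping each stage as a separate function mirrors the list above and lets a stage be replaced, for example by a stricter cleaning rule, without touching the others.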
Problem Statement and Scope
In this project, the primary challenges are to enable the system to remove erroneous data, validate data, and import external files in many formats. The climate department already has a climate database management system that receives meteorological data from weather stations (Lagerberg et al., 2013). However, at present there is no autonomous data management or automatic error checking, so climate department personnel have to check and manage data manually. The problem with the existing climate databases is therefore that erroneous data are stored without effective validation and checking. As a result, the climate data sometimes cannot be managed as weather forecasting reports require (Mansor et al., 2016). Furthermore, the climate department needs to enter climate data into the databases manually while checking the data and maintaining a specific format. Importing, checking, and processing sensitive climate data manually is tiresome. The problems to be addressed in this project are as follows:
- To validate and check climate data automatically
- To remove the incorrect and invalid data from the databases automatically
- To import data from external sources such as documents, text files, and spreadsheets, in different data formats
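The three goals above (automatic validation, automatic removal of invalid data, and import from several formats) can be illustrated together. The sketch below assumes simple range checks and two input formats, CSV and JSON. The field names and the plausibility limits in `LIMITS` are assumptions made for this example and are not values taken from the project.

```python
import csv
import io
import json

# Assumed plausibility limits; real limits would come from the climate department.
LIMITS = {
    "air_temperature_c": (-20.0, 60.0),
    "relative_humidity_pct": (0.0, 100.0),
}


def read_csv(text):
    """Parse CSV text into a list of dicts with string values."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(text)


def validate(record):
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    for field, (low, high) in LIMITS.items():
        try:
            value = float(record[field])
        except (KeyError, TypeError, ValueError):
            errors.append(f"{field}: missing or not numeric")
            continue
        if not low <= value <= high:
            errors.append(f"{field}: {value} outside [{low}, {high}]")
    return errors


def split_valid(records):
    """Separate records that pass every check from those that fail any."""
    accepted, rejected = [], []
    for record in records:
        errors = validate(record)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


# Invented inputs: the second CSV row and the JSON row are out of range.
csv_text = "air_temperature_c,relative_humidity_pct\n31.2,45\n99.9,40\n"
json_text = '[{"air_temperature_c": 28.0, "relative_humidity_pct": 130}]'

accepted, rejected = split_valid(read_csv(csv_text) + read_json(json_text))
print(len(accepted), len(rejected))
```

Rejected records are returned with their error messages rather than silently dropped, so that personnel can review what was removed and why.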
The problem scope covers the risks and issues that can make the system vulnerable. To mitigate these risks and issues, the proposed CLASYS (Climate Archive System) will contain:
Table for weather stations: This table will contain weather station data such as wind speed, wind direction, air temperature, solar radiation, and relative humidity (Sampaio Franco et al., 2014). These data will be passed through CLASYS for error checking and validation before they are stored in the databases.
Table for variables: A table of variables is created to store the collected data. It temporarily holds the weather data received from the stations for management. The system administrators are responsible for reporting the historical weather data and for creating a status for each report (Sokmen & Çebi, 2017). The main functional requirement of the project is to allow the system administrator to add historical weather data.
Table for users: Registered users can search the system to get a prediction of future weather conditions at a selected location. An alert message can be sent to the user if the current weather changes suddenly or an emergency arises (Turner, 2015). Visitors are given limited functionality: they can check the weather status and can register with the system to receive updates at regular intervals.
Table for quality control: Quality checking of collected data is crucial, because climate data must be managed and stored without flaws. The system should remove erroneous data before the collected data are stored in the databases. It can also provide better visibility of the data, so that weather forecasting can be based on detailed analysis. The graph plotter is responsible for plotting the predicted weather on the map. The searched area is highlighted in different colors to make the system convenient to use.
Table for imported external data: This table helps to manage climate data in different electronic formats and to store them in the database without data flaws or loss. CLASYS supports many formats so that the data can be handled and stored appropriately.
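One possible layout for the five tables described above is sketched below. This is an illustration under assumptions and not the CLASYS schema: it uses Python's built-in `sqlite3` module as a stand-in for Microsoft SQL Server, every table and column name is hypothetical, and an `observation` table is added so that the other tables have something to refer to.

```python
import sqlite3

# Hypothetical schema; SQLite stands in for the SQL Server database.
SCHEMA = """
CREATE TABLE station (
    station_id TEXT PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE variable (
    variable_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    unit        TEXT NOT NULL
);
CREATE TABLE app_user (
    user_id  INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE,
    role     TEXT NOT NULL CHECK (role IN ('admin', 'registered'))
);
CREATE TABLE quality_control (
    variable_id INTEGER PRIMARY KEY REFERENCES variable(variable_id),
    min_value   REAL NOT NULL,
    max_value   REAL NOT NULL
);
CREATE TABLE import_log (
    import_id     INTEGER PRIMARY KEY,
    source_name   TEXT NOT NULL,
    source_format TEXT NOT NULL,
    rows_accepted INTEGER NOT NULL,
    rows_rejected INTEGER NOT NULL
);
CREATE TABLE observation (
    station_id  TEXT NOT NULL REFERENCES station(station_id),
    variable_id INTEGER NOT NULL REFERENCES variable(variable_id),
    observed_at TEXT NOT NULL,
    value       REAL NOT NULL,
    PRIMARY KEY (station_id, variable_id, observed_at)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO station VALUES ('AUH01', 'Example station')")
conn.execute("INSERT INTO variable (name, unit) VALUES ('air_temperature', 'degC')")
conn.execute("INSERT INTO quality_control VALUES (1, -20.0, 60.0)")

insert_reading = (
    "INSERT INTO observation (station_id, variable_id, observed_at, value) "
    "VALUES ('AUH01', 1, '2016-01-01T00:00', 21.5)"
)
conn.execute(insert_reading)

# The composite primary key rejects a second reading for the same
# station, variable, and time, so duplicates cannot be stored.
try:
    conn.execute(insert_reading)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables, duplicate_rejected)
```

Storing the quality control limits as rows rather than hard-coding them is one design option: limits could then be adjusted per variable without changing the application code.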
Ananta, I., Callaghan, V., Chin, J., Ball, M., & Gardner, M. (2013). Crowd Intelligence in Intelligent Environments: a Journey from Complexity to Collectivity. In Intelligent Environments (IE), 2013 9th International Conference on (pp. 65-70). IEEE.
Brandao, M. C. (2015). Biodiversidade e distribuição de larvas de invertebrados da plataforma Sudeste-Sul do Brasil (21-34 °S), com ênfase em larvas de Decapoda.
Chung, L. T., Theng, L. B., & Seldon, H. L. (2013). A GIS-based environmental health information source for Malaysian context.
Cinquini, L., Crichton, D., Mattmann, C., Harney, J., Shipman, G., Wang, F., ... & Pobre, Z. (2014). The Earth System Grid Federation: An open infrastructure for access to distributed geospatial data. Future Generation Computer Systems, 36, 400-417.
CLICOM. (2017). Clidata.cz. Retrieved 15 March 2017, from
Climsoft Home. (2017). Climsoft. Retrieved 15 March 2017, from
CLISYS - CDMS. (2017). Mfi.fr. Retrieved 15 March 2017, from
CliWare System. (2017). Cliware.meteo.ru. Retrieved 15 March 2017, from
Desai, M. A. (2016). Multiscale Drivers of Global Environmental Health.
Elliff, C. I., dos Santos Tutui, S. L., Souza, M. R., & Tomas, A. R. G. (2017). Estrutura populacional da carapeba (Diapterus rhombeus) em um sistema estuarino do sudeste do Brasil. Boletim do Instituto de Pesca, 39(4), 411-421.
Feris, M. A. A., Zwikael, O., & Gregor, S. (2017). QPLAN: Decision support for evaluating planning quality in software development projects. Decision Support Systems.
Hobday, A. J., Cochrane, K., Downey-Breedt, N., Howard, J., Aswani, S., Byfield, V., ... & Fulton, E. A. (2016). Planning adaptation to climate change in fast-warming marine regions with seafood-dependent coastal communities. Reviews in Fish Biology and Fisheries, 26(2), 249-264.
Horta, L. R., Georgieff, S. M., & Aschero, C. A. (2015). Chronology of bathymetric variations of the Pueyrredón-Posadas-Salitroso lacustrine system during the Late Pleistocene to Early Holocene. Quaternary International, 377, 91-101.
Hossny, E., Khattab, S., Omara, F., & Hassan, H. (2013). A case study for deploying applications on heterogeneous paas platforms. In Cloud Computing and Big Data (CloudCom-Asia), 2013 International Conference on (pp. 246-253). IEEE.
Imoto, R. D., Carneiro, M. H., & Avila-da-Silva, A. O. (2016). Spatial patterns of fishing fleets on the Southeastern Brazilian Bight/Patrones espaciales de las flotas pesqueras en Southeastern Brazilian Bight. Latin American Journal of Aquatic Research, 44(5), 1005.
JORDAN METEOROLOGICAL DEPARTMENT AMMAN-JORDAN. (2017). www.wmo.int. Retrieved 15 March 2017, from
Kaur, R., & Sengupta, J. (2013). Software process models and analysis on failure of software development projects. arXiv preprint arXiv:1306.1068.
Lagerberg, L., Skude, T., Emanuelsson, P., Sandahl, K., & Stahl, D. (2013, October). The impact of agile principles and practices on large-scale software development projects: A multiple-case study of two projects at ericsson. In Empirical Software Engineering and Measurement, 2013 ACM/IEEE International Symposium on (pp. 348-356). IEEE.
Mansor, Z., Arshad, N. H., Yahya, S., Razali, R., & Yahaya, J. (2016). Ruler for Effective Cost Management Practices in Agile Software Development Projects. Advanced Science Letters, 22(8), 1977-1980.
MicroStep-MIS - Climatological and integrated Environmental Database (IMS CLDB and EnviDB). (2017). Microstep-mis.com. Retrieved 15 March 2017, from
Sampaio Franco, A. C., Shimada Brotto, D., Zee, W., Man, D., & Neves dos Santos, L. (2014). Long-term (2002-2011) changes on Cetengraulis edentulus (Clupeiformes: Engraulidae) fisheries in Guanabara Bay, Brazil. Revista de biologia tropical, 62(3), 1019-1029.
Sokmen, N., & Çebi, F. (2017). Decision-Tree Models for Predicting Time Performance in Software-Intensive Projects. International Journal of Information Technology Project Management (IJITPM), 8(2), 64-86.
Turner, E. C. (2015). Evaluating spectral radiances simulated by the HadGEM2 global climate model using longwave satellite measurements.