The Applicability of Big Data in Climate Change Research: The Importance of System of Systems Thinking

The aim of this paper is to provide an overview of the interrelationship between data science and climate studies and to describe how sustainability-related climate issues can be managed using Big Data tools. Climate-related Big Data articles are analyzed and categorized, which reveals an increasing number of applications of data-driven solutions in specific areas, whereas broad integrative analyses receive less attention. Our major objective is to highlight the potential of the System of Systems (SoS) theorem, as the synergies between diverse disciplines and research ideas must be explored to gain a comprehensive overview of the issue. Data and systems science enable large amounts of heterogeneous data to be integrated and simulation models to be developed while considering socio-environmental interrelations in parallel. The improved knowledge integration offered by System of Systems thinking, or climate computing, is demonstrated by analysing the possible inter-linkages of the latest Big Data application papers. The analysis highlights how data and models focusing on specific areas of sustainability can be bridged to study the complex problems of climate change.


INTRODUCTION
Climate change is a pressing issue of today, and data-based models and decision support techniques offer a more comprehensive understanding of its complexity. The aim of this paper is to review data-based techniques and their applicability in climate research. More precisely, it asks how Big Data, through data science, can address sustainability-related climate issues and be applied in scientific research and decision sciences in an integrated manner.
The overview is guided by three closely related notions, namely (1) data science as a novel interdisciplinary field, connected to (2) machine learning, which is a tool for improving automatic prediction or decision processes, and (3) Big Data, which fosters the processing and connection of large amounts of heterogeneous data. The focal point of this research is the interconnectedness of complex climate-related systems, for the exploration of which Big Data provides an efficient toolbox.
The research questions cover three aspects, which are kept in focus throughout the paper:
• How and when does Big Data appear in climate-related studies?
• What research has been conducted on Big Data applications in climate studies, and how is it structured?
• How can the knowledge accumulated in diverse, specialized studies be integrated?
The year 2015 gave further impetus to research on climate change: the United Nations declared 17 Sustainable Development Goals, of which SDG13 is "Take urgent action to combat climate change and its impacts" (UN, 2016), and the Paris Agreement was adopted. The agreement addresses the mitigation of greenhouse gas emissions, adaptation, and finance, with the specific aim of keeping the rise in global average temperature well below 2 °C above pre-industrial levels and pursuing efforts to limit it to 1.5 °C above pre-industrial levels, recognizing that this will significantly reduce the risks and impacts of climate change (Rogelj et al., 2016). This kind of organizing principle complements the analyses of the classical disciplinary sciences with a holistic, interdisciplinary approach. New types of approaches require much more complex analyses and models and, therefore, several orders of magnitude more data, which brought Big Data to life as a stand-alone scientific discipline. Big Data-based tools are already widespread in this new complex science, for example, to monitor seasonal changes in the climate (Manogaran et al., 2018), understand climate change within a theory-guided data science paradigm (Faghmous et al., 2014), learn how to manage the risks of climate change (Ford et al., 2016), explore soft data sources, e.g., Twitter (Jang et al., 2015), or demonstrate the potential of Systems of Systems (SoS), for instance, by exploring the structure and relationships across institutions and disciplines of a global Big Earth Data cyber-infrastructure: the Global Earth Observation System of Systems (GEOSS) (Craglia et al., 2017).
Today, it is obvious that sustainability science is intertwined with data science. However, with the support of the business model of the circular economy (Jabbour et al., 2019), the complexity of the problem space has further increased, so there is an urgent need to include data and analysis methods in a framework in which research results from one field can be used in others. Furthermore, trends in climate and sustainability science are driving models toward higher resolution, greater complexity, and larger ensembles, which calls for multidisciplinary approaches in climate computational sciences (Balaji, 2015). This research provides a higher-level overview of the interconnectedness of disciplines, systems, data, and tools related to climate change and explores further focal points concerning the need for a deeper level of integration, because a disconnection between important industry initiatives and scientific research is still experienced (Nobre and Tavares, 2017). We propose to address these integration tasks and disconnections through System of Systems (SoS) thinking. Information sources (data, news, scientific databases) can be linked, which draws attention to the future importance of open linked data. SoS thinking is emphasized because understanding the drivers and effects of climate change, as well as achieving resilience and adaptation, requires the timely recognition and exploitation of synergies and trade-offs between the new research directions.
The research methodology first identifies the problems of sustainability science in section 2, which reveals the connected issues and tasks as well as the requirements for success. This step confirms that the sustainable operation of nature and society demands a System of Systems approach along with the integration of Big Data applications into climate-related scientific, societal, and political research. This is in line with the growing risk of uncertainty zones highlighted in the planetary boundary framework (Steffen et al., 2015). Then, the existing applications of the related data analysis in the field are explored. For a deeper and narrower insight, the literature review was based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) method, which supports the exploration and evaluation of related articles. The search has a clear and narrow focus on the multidisciplinary nature of the issue; a generic evaluation is therefore not intended. Fifty-seven review articles were individually analyzed to identify focus areas and research gaps in Big Data applications in climate change research. Systematic meta-analysis was used to identify how the data cluster into diverse focus areas and to extract valuable structural information. The co-occurrences of keywords were examined in 442 articles describing the relationship between climate change and Big Data.
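As an illustration of the keyword co-occurrence step, the following minimal sketch counts how often two keywords appear together in the same article. It is not the tooling used for the 442 articles; the function name `keyword_cooccurrence` and the three keyword lists are hypothetical and serve only to show the counting logic.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(articles):
    """Count how often each unordered keyword pair appears in the same article."""
    pair_counts = Counter()
    for keywords in articles:
        # Normalise case and drop duplicates so each pair counts once per article.
        unique = sorted({k.strip().lower() for k in keywords})
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical keyword lists, one per article (illustrative only).
articles = [
    ["climate change", "big data", "machine learning"],
    ["Climate Change", "Big Data", "remote sensing"],
    ["big data", "smart cities"],
]

counts = keyword_cooccurrence(articles)
for pair, n in counts.most_common(3):
    print(pair, n)
```

The resulting pair counts can be read as edge weights of a keyword network, which is one common way of identifying focus areas in a body of literature.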
In the following sections, the aforementioned research questions are elaborated and answered by revealing the increasing importance of the System of Systems theorem. Synergies between new research directions and disciplines must be explored to determine the drivers and effects of climate issues as well as to provide an efficient strategic adaptation and mitigation plan that also considers socio-environmental factors. Our proposed SoS framework is a response to this need for integrated knowledge management, as a first step toward climate computing.
In section 2, the sustainability science theorem questions are answered considering the essential need of data science applications. In section 3, heterogeneous data management as well as Big Data tools and techniques are emphasized.
The systematic review of climate change analyses can be found in section 4, which includes the connections between Big Data and climate in section 4.1 as well as a critical summary of different methods in section 4.2. The social aspects are highlighted in section 4.3. Based on the overview of the new climate-related research findings, a specific SoS framework is presented in section 4.4, and the intertwining of the SoS and the SDGs is discussed in section 5, where the suggestions for future research directions and applications are summarized.

PROBLEMS OF SUSTAINABILITY SCIENCE
The complexity of climate issues requires adaptive strategies for public policy (Di Gregorio et al., 2019), actions to incite social behavior (Xie B. et al., 2019), and the development of regulatory and market-simulating responses to economic life (Wright and Nyberg, 2017). To meet this complex societal need, research has focused on understanding the causes of climate change (Hegerl et al., 2019), the development of predictive models (Du et al., 2019), and mitigation solutions (Gomez-Zavaglia et al., 2020), as well as the exploration of opportunities to shape social attitudes (Iturriza et al., 2020).
An interdisciplinary approach is essential for identifying almost every climate-related problem and developing its solutions. This interdisciplinary perspective has shaped the sustainability science theorem, which seeks a comprehensive understanding of the interrelationship between environment and society (Kates et al., 2001). This theory focuses on transdisciplinary questions, which can only be answered by applying data science tools.
• How can the dynamic relationship between nature and society be described and analyzed?
Systems Dynamics Modeling is a commonly used tool for describing and analysing the dynamic interrelation of environment, economy, and society (Honti and Abonyi, 2019). This concept is clearly characterized by the World3 model, which describes the relationship between population, industrial growth, food production, and ecosystem constraints over time and was prepared for the Club of Rome in the book entitled "The Limits to Growth" (Meadows et al., 1972). The exploration of the relationship between the state variables of the model requires targeted interdisciplinary research. The tools of data science can render this research more efficient through the automated generation and validation of relationship hypotheses (Sebestyén et al., 2019), as data-based models can go beyond the exploration of probabilistic correlations and provide information on causation (Dörgő et al., 2018). One of the most significant tasks for a more in-depth analysis of climate effects is the integration and joint management of heterogeneous data and information. The potential of this approach is demonstrated by a case study that interlinks socioeconomic variables to explore the effect of the climate on global food production systems (Fischer et al., 2005).
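To make the stock-and-flow idea behind system dynamics modeling concrete, a deliberately small sketch is given here. It is not the World3 model: the two stocks (population and a finite resource), all rate parameters, and the function name `simulate` are illustrative assumptions, and the equations are integrated with a plain Euler scheme.

```python
def simulate(years=100, dt=0.25, population=1.0, resource=100.0,
             growth_rate=0.03, death_rate=0.01, use_per_capita=0.05):
    """Euler integration of a two-stock toy model (NOT the World3 model).

    Population grows in proportion to the remaining resource fraction,
    and the resource stock is drawn down by consumption.
    """
    initial_resource = resource
    history = [(0.0, population, resource)]
    steps = int(years / dt)
    for step in range(1, steps + 1):
        availability = resource / initial_resource        # fraction in [0, 1]
        births = growth_rate * population * availability  # inflow to population
        deaths = death_rate * population                  # outflow from population
        # Consumption cannot exceed what is left in the resource stock.
        consumption = min(use_per_capita * population, resource / dt)
        population += dt * (births - deaths)
        resource -= dt * consumption
        history.append((step * dt, population, resource))
    return history


history = simulate()
final_time, final_population, final_resource = history[-1]
print(f"t={final_time:.0f}  population={final_population:.2f}  "
      f"resource={final_resource:.2f}")
```

The feedback loop (less resource, slower growth) is the kind of relationship between state variables whose form and strength would have to be established by interdisciplinary research or data-based hypothesis testing.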
• How can delays, inertia, and uncertainty in models be handled?
To quantify the impact of uncertainties inherent in climate variables, Monte Carlo simulations can be a suitable means of evaluating the CMIP models developed to forecast climate change (Taylor et al., 2012; Eyring et al., 2016) under the Representative Concentration Pathways RCP 4.5 and RCP 8.5 (Mallick et al., 2018). The most important task ahead is the integrated development of targeted solutions for designing, evaluating, and integrating simulation studies to quantify uncertainty and risk in the light of environmental and social data (Climate Change, 2014). For this reason, DKRZ carried out extensive simulations with the Earth system model MPI-ESM for the CMIP5 project and the IPCC AR5, presenting a selection of visualizations for different key climate variables and for the different scenarios (Klimarechenzentrum, 2021).
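The principle of Monte Carlo uncertainty propagation can be sketched with a toy example. The sketch below is not the evaluation of CMIP models cited above: it assumes the common logarithmic approximation in which equilibrium warming equals a sensitivity parameter times the number of CO2 doublings, and the normal distribution of the sensitivity (mean 3.0, standard deviation 0.75), the reference concentration, and the function name `monte_carlo_warming` are all illustrative assumptions.

```python
import math
import random
import statistics


def monte_carlo_warming(co2_ppm, n_samples=10_000, seed=42,
                        sensitivity_mean=3.0, sensitivity_sd=0.75,
                        co2_reference_ppm=280.0):
    """Propagate uncertainty in a sensitivity parameter to a warming estimate.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    doublings = math.log2(co2_ppm / co2_reference_ppm)
    samples = []
    for _ in range(n_samples):
        # Draw one plausible sensitivity (degrees C per CO2 doubling).
        sensitivity = rng.gauss(sensitivity_mean, sensitivity_sd)
        samples.append(sensitivity * doublings)
    samples.sort()
    return {
        "mean": statistics.fmean(samples),
        "p05": samples[int(0.05 * n_samples)],   # empirical 5th percentile
        "p95": samples[int(0.95 * n_samples)],   # empirical 95th percentile
    }


result = monte_carlo_warming(560.0)  # hypothetical doubled-CO2 scenario
print(result)
```

Reporting an interval (here the 5th to 95th percentile) rather than a single value is what allows uncertainty and risk to be compared across scenarios.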
• How can the features concerning the vulnerability of socioenvironmental systems be explored?
The conceptual framework of vulnerability is grounded in the work of the Intergovernmental Panel on Climate Change (IPCC). The complex impact chains of vulnerability demand the identification and integration of non-climatic factors into climate models, in addition to the development of models describing adaptability and the estimation of expected damage (Füssel and Klein, 2006). It is believed that the toolbox of network science will play an increasing role in evaluating vulnerability, as the significance of state variables and their relationships can be directly assessed with regard to their role in dynamic models (Leitold et al., 2020).
• How can the increasing risk be measured? What scientifically based "boundaries" and "limits" can be defined?
The purpose of the planetary boundaries concept is to define safe operating conditions and to account for adverse or catastrophic abrupt environmental changes caused by the crossing of one or more planetary boundaries (Rockström et al., 2009). Quantifying the risks of climate-induced changes using climate models shows that the risks will increase over the next 200 years, even if the composition of the atmosphere remains constant (Scholze et al., 2006). The socio-cultural domain plays a crucial role in terms of risk perception (Van der Linden, 2015); therefore, the integration of variables describing socio-cultural factors into the models can be particularly important. Analyses are essential to explore how human-induced perturbations affect the delicate balance of the ecosystem and to determine where the limits and boundaries are, the crossing of which would pose an unacceptable level of risk (Steffen et al., 2015). The integrated application of simulation tools and the machine learning toolbox can efficiently explore these boundaries (Lenton, 2011).
• What support/motivation systems (rules, norms, scientific information) can be developed to increase the capacity and sustainability of society? What signs and guidelines are needed to put society on a sustainable path? How can today's isolated research, analyses, and decision support systems be integrated more efficiently?
The integration and targeted systematization of scientific knowledge is needed to address the long-term causes of climate change and reduce its effects (Pauliuk, 2020). Research concerning sustainability and socio-ecological systems has been partly interlinked to foster sustainability transformation in a transdisciplinary manner. To bridge the gap between science and society, the involvement of citizens in framing research and processes may be a solution as "through their relationship to a place, bounded often as a social-ecological construct, stakeholders, and people at large play an essential role in sustainability transformation research." Furthermore, the involvement of external parties can support research into socio-ecological systems and sustainability science (Horcea-Milcu et al., 2020). Methods of the co-production of knowledge, e.g., triangulation, the Multiple Evidence Based approach, and scenario building, by learning about cross-border engagement, help to ensure that transdisciplinarity is not only a precursor of integration (Klenk and Meehan, 2015).
To follow the aforementioned path toward sustainable dynamics of nature and society, the data science toolbox and models must be integrated into climate change-related scientific and societal research as well as political agenda. In the following, the Big Data tools and management are interpreted with a specific focus on their role in climate change and we build a System of Systems (climate computing) framework from the various applications.

DATA ANALYSIS TASKS OF CLIMATE CHANGE RESEARCH
The term Big Data has spread due to new technologies and innovations that have emerged over the past decade (Chen and Chiang, 2012), driven by the demand for the analysis of large amounts of rapidly generated, diverse data. Collection and processing therefore take place at a high speed, which is difficult to achieve with conventional analytical tools (Constantiou and Kallinikos, 2015). The explosive growth in the amount of data has also reached health, finance, and education (Benjelloun et al., 2015). With regard to the global economy, Big Data is key to understanding and increasing performance (Maria et al., 2015). Big Data is also gaining ground in the field of sustainability, where it can be used to improve social and environmental sustainability in supply chains (Dubey et al., 2019), augment the informational landscape of smart sustainable cities (Bibri, 2018), and improve the allocation and utilization of natural resources (Song et al., 2017) as well as supply chain sustainability (Hazen et al., 2016). Big and open data, from "smart" government to transformational government, can facilitate collaboration. It is possible to introduce real-time solutions into agriculture, health, transport, and other challenging areas (Bertot et al., 2014). The Big Data approach can be the most effective tool to improve mutual governmental and civic understanding, thus embodying the principles of digital governance as the most viable public management model (Clarke and Margetts, 2014). There is a need to collect large amounts of data that can be used to model and test different scenarios to sustainably transform energy production and consumption, improve food and water security, and eradicate poverty. Initiatives such as the Intergovernmental Panel on Climate Change and the Global Ocean Observing System can fill gaps in scientific, technical, and socio-economic data (Gijzen, 2013).
The analysis of sustainable business performance forecasts through the analysis of Big Data in the context of developing countries shows that "Management and leadership style" and "Government policy" are the most significant factors at present (Raut et al., 2019).
The process of data mining is shown in Figure 1. Big Data denotes a rapidly generated volume of information from a variety of sources and in different formats. Data analysis is the examination and transformation of raw data into interpretable information, while data science is a multidisciplinary field that combines various analyses, programming tools, algorithms, predictive analytics, statistics, and machine learning with the aim of recognizing and extracting patterns in raw data. Thus, Big Data primarily looks at ways to analyse, systematically extract information from, or otherwise handle datasets that are too large or complex for traditional data processing application software and that require significant scaling (multiple nodes) to be processed efficiently. In other words, Big Data can be defined by the 5V key characteristics, i.e., volume, velocity, variety, veracity, and value (Laney, 2001).
The storage, sustainability, and analysis of massive content are challenges that the current state of algorithms and systems cannot handle in an integrated manner (Trifu and Ivan, 2014); therefore, the synergies of the different sources are not sufficiently exploited. The purpose of using Big Data is to provide data management and analysis tools for the ever-increasing amount of data (Anuradha et al., 2015). As is shown in Figure 2, data analysis can be divided into four general categories (Erl et al., 2016). In Big Data environments, data analytics involves the use of highly scalable distributed frameworks and technologies to extract meaningful information from large amounts of raw data, which requires the use of different data analysis methods (Rajaraman, 2016).
Big Data is usually associated with two technologies, cloud computing and the Internet of Things (IoT) (Honti and Abonyi, 2019). Cloud computing accelerates unlimited data storage, parallel data processing, and analysis (Inukollu et al., 2014).
The key benefits of cloud computing are improved analysis, simplified infrastructure, and cost reduction. IoT offers the ability to connect computing devices, mechanical and digital machines as well as objects and people (Lavin et al., 2015). With the advent of the IoT, huge amounts of data can be collected using smart devices connected via the Internet (Suchetha et al., 2015).
The applicability of Big Data techniques is also significantly enhanced by the novel tools that support data collection and integration. The interoperability of the systems can be improved by data warehouses and the related ETL (extract, transform, load) functionalities, which can also be used to gather information from multiple models and data sources. The benefit of these structures is demonstrated by the EC4MACS (European Consortium for Modeling of Air Pollution and Climate Strategies) data warehouse, which establishes a suite of modeling tools for a comprehensive integrated assessment of the effectiveness of emission control strategies for air pollutants and greenhouse gases. In this system, the integrated data are loaded into the GAINS (Greenhouse gas-Air pollution Interactions and Synergies) Data Warehouse. This assessment brought together expert knowledge in the fields of energy, transport, agriculture, forestry, land use, atmospheric dispersion, and health and vegetation impacts, and it developed a coherent outlook on the future options to reduce atmospheric pollution in Europe (Nguyen et al., 2012).
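The extract, transform, load pattern can be illustrated with a minimal sketch that harmonizes two model outputs into a single table. It does not reproduce the EC4MACS or GAINS schema: the record layouts, field names, units, and values below are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical records from two models with different field names and units.
ENERGY_MODEL = [{"country": "AT", "year": 2020, "co2_kt": 1200.0}]
TRANSPORT_MODEL = [{"iso": "AT", "yr": 2020, "co2_tonnes": 350_000.0}]


def extract():
    """Yield (sector, record) pairs from every source."""
    yield from (("energy", r) for r in ENERGY_MODEL)
    yield from (("transport", r) for r in TRANSPORT_MODEL)


def transform(sector, record):
    """Map source-specific fields onto one schema, with CO2 in kilotonnes."""
    if sector == "energy":
        return (record["country"], record["year"], sector, record["co2_kt"])
    return (record["iso"], record["yr"], sector, record["co2_tonnes"] / 1000.0)


def load(rows):
    """Create the target table in an in-memory database and insert the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emissions "
        "(country TEXT, year INTEGER, sector TEXT, co2_kt REAL)"
    )
    conn.executemany("INSERT INTO emissions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


conn = load(transform(sector, rec) for sector, rec in extract())
total = conn.execute(
    "SELECT SUM(co2_kt) FROM emissions WHERE country = ? AND year = ?",
    ("AT", 2020),
).fetchone()[0]
print(total)
```

Once the sources share one schema and one unit, cross-sector queries such as the total above become a single statement, which is the interoperability benefit described in the text.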
The integration of different information can also be supported by ontology-based linked data. Web Ontology Language (OWL) models enable the semantic characterization of the different events that can describe the climate change story from multiple perspectives, including scientific, social, political, and technological ones (Pileggi et al., 2020). Artificial intelligence (AI) and machine learning (ML) are also key enabling technologies of Big Data analysis. This paper focuses on the applicability of ML-based models. AI is mainly used to support decision-making, but it can also skilfully fill observational gaps when combined with numerical climate model data. An example of this application can be found in the extension of historical temperature measurements used in global climate datasets like HadCRUT4 (Kadow et al., 2020).
Analysis of Big Data combines traditional methods of statistical analysis with computational approaches. Based on the complexity between the variables and the type of results required, data analysis can be a simple data set query or a combination of sophisticated analysis techniques (Al-Shiakhli, 2019). The analysis of Big Data is a synthesis of quantitative and qualitative analyses.
Climate computing combines multidisciplinary research in the climatic, data, and system sciences to efficiently capture and analyse climate-related Big Data as well as to support socio-environmental efforts. In support of this, a complex model of the earth system is continuously developed by DKRZ using supercomputers relying on Big Data, numerical computations, and simulation models to enable scientists to integrate chemical and biological processes, as well as to investigate the interaction of the climate and the socio-economic system (Klimarechenzentrum, 2021).
Exploratory Data Analysis (EDA) techniques are approaches for analysing large data sets. These techniques make the main features clearer by hiding other aspects. Most EDA techniques are graphical in nature, with some non-graphical additions. Some basic EDA tools are histograms, quantile-quantile plots (Q-Q plots), scatter plots, box plots, stratification, log transformation, and other summary statistics (Komorowski et al., 2016). Qualitative models can be classified into qualitative causal models and abstraction hierarchies. The causal models can be classified into Digraphs, Fault Trees, and Qualitative Physics. Abstraction hierarchies consist of two important components: structural and functional (Venkatasubramanian et al., 2003).
Data mining is a set of methods that extracts specific information from large and complex databases. Data discovery uses automated, software-based techniques to eliminate randomness and uncover hidden patterns and trends (Fayyad and Simoudis, 1997). The classification of data mining techniques is summarized in Table 1 (Zaki and Ho, 2000). Classification models define the similarity structure of the variables and partition the data into groups (classes) (Aggarwal, 2015). In Big Data-based climate studies, classification models and techniques are widely used. Two streams with different hydroclimatologies were studied in the United States using an artificial neural network (ANN). The analysis identified a large effect on a variety of factors such as average runoff, flow variability, flood frequency, and baseline flow stability (Poff et al., 1996). To overcome the great uncertainties inherent in climate models, an alternative neural network-based climate model has been developed that increases the efficiency of large climate model ensembles by at least one order of magnitude. Based on this, it was concluded that warming exceeds the surface warming range estimated by the IPCC for almost half of the members of the ensemble (Knutti et al., 2003). Neural networks are an effective tool for dealing with such difficult and challenging problems and have been widely used to explore the mechanisms of climate change and to predict trends in climate change. They take full advantage of the unknown information hidden in climate data, although they cannot decipher it.
General Circulation Models (GCMs), the most advanced tools for estimating future climate change scenarios, operate on a coarse scale. Their output can be downscaled by support vector machine (SVM) approaches that train a downscaling model (DM) for meteorological subdivisions (MSDs); this model has been shown to perform better than conventional downscaling using multilayered regenerative artificial neural networks (Tripathi et al., 2006). The utilization of solar energy is evolving dynamically in connection with SDG 7, but power plant performance may fluctuate due to the diversity of meteorological conditions, which can be compensated for by using satellite imagery and an SVM learning scheme to predict the motion vector of clouds (Jang et al., 2016). Object-based image analysis (OBIA) and SVM combined with a decision-tree classification are suitable for mapping mangrove areas, which was not possible with traditional remote sensing methods except at coarse spatial resolution (Heumann, 2011). Decision tree algorithms consistently outperform maximum likelihood and linear discriminant function classifiers in terms of classification accuracy in land cover mapping problems (Friedl and Brodley, 1997). Using a weather-generating model, which resamples historical data with a nearest-neighbor approach and perturbs them, it is possible to create an ensemble of plausible climatic scenarios and to produce meteorological data that can be used to assess the vulnerability of a river basin to extreme events (Sharif and Burn, 2006). The ability of the Bayesian Network (BN) to predict long-term changes in the shoreline associated with rises in sea level and to quantitatively estimate forecast uncertainty renders it suitable for research into the effects of climate change (Gutierrez et al., 2011).
Bayesian networks have also been used successfully to assess the effects of climate change disturbances on the structure of coral reefs (Franco et al., 2016) and to model belief updating concerning the reality of climate change in response to information about the scientific consensus on anthropogenic global warming (AGW) (Cook and Lewandowsky, 2016). Using a genetic algorithm and occurrence data from museum specimens, ecological niche models were developed for 1,870 species occurring in Mexico and projected onto two climatic surfaces modeled for 2055 (Peterson et al., 2002). A multi-objective genetic algorithm for optimizing water distribution systems (WDS) was used as a discovery tool to examine trade-offs between traditional economic goals and the minimization of greenhouse gas emissions (Wu et al., 2010). The European territory was subdivided into regions of similar predicted climate change based on simulations of total daily precipitation as well as recent (1986-2005) and long-term future (2081-2100) temperatures using k-means cluster analysis (Carvalho et al., 2016). An automated procedure based on a cluster initialization algorithm has been proposed and applied to changes in 27 climatic extremes. The proposed method requires, on average, 40% fewer scenarios to meet the 90% threshold than k-means clustering (Cannon, 2015).
Clustering-based analyses are widely accepted data mining techniques; however, improvements in terms of time and cost savings are constantly required due to the management of an increasing amount of data (Shirkhorshidi et al., 2014). Regarding its usage in climatic analyses, a clustering-based spatio-temporal analysis framework of atmospheric data was developed to support both governmental and industrial decision-making processes (Cuzzocrea et al., 2019). To assess erosivity risk, clustering and classification analyses were applied at the national level in Turkey, and an artificial neural network-based prediction was also made. The results identified an increasing risk of soil erosion in the southern and western regions of Turkey, which demands erosion control practices (Aslan et al., 2019). Research has been conducted to regionalize Europe according to similar surface temperatures based on data between 1986 and 2005. The differences between long-term predictive data (CMIP5) and historical data were analyzed with k-means clustering to group grid points (Carvalho et al., 2016). A fuzzy c-means regionalization approach was applied in western India to delineate homogeneous meteorological drought regions and thereby provide effective support for water resources planning and management during droughts (Goyal and Sharma, 2016). Clustering techniques can support simulation and predictive models by grouping large-scale data. "Wind energy production is expected to be affected by shifts in wind patterns that will accompany climate change." In California, wind patterns have been clustered using model simulations from the variable-resolution Community Earth System Model (VR-CESM) and analyzed according to the change in the frequency of clusters and changes in winds within clusters. The changes in capacity factor have a significant influence on energy generation (Wang M. et al., 2020).
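The k-means regionalization idea can be sketched with a small standard-library implementation of Lloyd's algorithm. The sketch does not reproduce any of the cited studies: the six "grid points", each described by a hypothetical temperature change and precipitation change, and the function name `kmeans` are illustrative assumptions.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
    return labels, centroids


# Hypothetical grid points: (temperature change in degrees C, precipitation change in %).
grid_points = [(1.0, -5.0), (1.2, -4.0), (0.9, -6.0),
               (3.1, 10.0), (3.0, 12.0), (2.8, 11.0)]
labels, centroids = kmeans(grid_points, k=2)
print(labels)
```

In practice the variables would be standardized before clustering, since k-means is sensitive to the scale of each dimension; this step is omitted here for brevity.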
Regression analysis seeks to reveal functional relationships between variables that can further support predictive and forecasting models. Urbanization tends to have a significant impact on climate change, as underlined by an Australian study which determined that changes in land use and vegetation resulting from urbanization affect the local climate and water cycle, and that these impacts are location-specific (Maheshwari et al., 2020). Multiple regression-based analysis has been used to determine flood risk in urban catchments by combining multiple linear regression, multiple nonlinear regression, and multiple binary logistic regression. This framework sought to support action plans concerning drainage management and to maximize the impact of strategies addressing flood susceptibility (Jato-Espino et al., 2018). Regarding water management, the influence of climate change on the hydrological cycle in the Yangtze River Basin has been analyzed using a regression analysis model and a geographic information system (Keliang, 2019). Soil plays a significant role in carbon sequestration and can therefore moderate undesired climatic effects. A model has been designed for the top 25 cm of topsoil of the Sierra Morena (Red Natura 2000) area to determine the relationship between independent variables and soil organic carbon (SOC), and multiple linear regression analysis was used to examine the effects of these variables on SOC content. The results indicated that "SOC in a future scenario of climate change depends on average temperature of coldest quarter (41.9%), average temperature of warmest quarter (34.5%), annual precipitation (22.2%), and annual average temperature (1.3%)." The comparison between the current (2016) and future situations reflects a 35.4% reduction in SOC content and a northward migration trend (Olaya-Abril et al., 2017).
Frequent itemset/pattern mining is a commonly used technique to extract knowledge from databases. The handling of an increasing amount of heterogeneous data is becoming ever more difficult; therefore, "an efficient algorithm is required to mine the hidden patterns of the frequent itemsets within a shorter run time and with less memory consumption while the volume of data increases over the time period" (Chee et al., 2019). Association rule mining (ARM) models have been built for atmospheric environment monitoring based on the Apriori algorithm and the D-S theory/ER algorithm. These techniques provide both technical and theoretical support for preventing as well as managing air pollution. Association rule mining has also been used to monitor weather behavioral data in order to develop a prediction model for climate variability (Rashid et al., 2017). Furthermore, climate variability affects agriculture, which demands a better understanding of the impact of the climate on crop production and food security. Accordingly, the impact of seasonal rainfall on rice crop yield was determined using ARM techniques (Gandhi and Armstrong, 2016). To understand wind conditions, multidimensional sequential pattern mining is used to define which patterns are suitable for wind energy (taking into consideration the factors of space, time, and height). According to a study on the Netherlands, 68.97% of the country is covered by a suitable wind pattern (at 128 m) and already has wind turbines installed (Yusof et al., 2017). A spatio-temporal pattern-based sequence classification framework was built to estimate the extent of deforestation. This approach was applied to a Tunisian case study that took into consideration 15 years of satellite images and historical wildfire GIS data (Toujani et al., 2020).
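The level-wise search behind Apriori-style frequent itemset mining can be sketched as follows. This is a simplified illustration rather than the implementation used in any cited work; the function name `frequent_itemsets`, the seasonal records, and the support threshold are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Apriori-style search: a k-itemset is counted only if all of its
    (k-1)-subsets were themselves frequent (downward-closure property)."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent sets into k-item candidates, then prune.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Hypothetical seasonal records (one set of observed conditions per season).
seasons = [
    {"low_rain", "low_yield"},
    {"low_rain", "high_temp", "low_yield"},
    {"high_rain", "high_yield"},
    {"low_rain", "low_yield"},
    {"high_rain", "high_temp", "high_yield"},
]
frequent = frequent_itemsets(seasons, min_support=0.4)

# Confidence of the rule low_rain -> low_yield = support(both) / support(low_rain).
confidence = frequent[frozenset({"low_rain", "low_yield"})] / frequent[frozenset({"low_rain"})]
print(len(frequent), confidence)
```

The support threshold controls how many candidate itemsets survive each level, which is where the run-time and memory concerns quoted above arise for large datasets.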
Visualization methods seek to explore the interconnections between data by simplifying multivariate data. The self-organizing map neural network (SOMN) method has been used to analyse anomalous atmospheric circulation patterns in China with regard to surface temperature anomalies between 1979 and 2017 (Gao et al., 2019). This method is widely used for mapping changes, e.g., regarding urban flood hazards (Rahmati et al., 2019). In a study of the city of Amol in Iran, this urban flood hazard mapping model indicated that 23% of the land area of the city is expected to face high or very high levels of flood risk, which demands efficient flood risk management. SOMN and the grid cells method were applied to determine spatio-temporal changes in land cover in Inner Mongolia between 2004 and 2014. The Principal Component Analysis (PCA) technique has been used to assess the vulnerability of the coastal region of Bangladesh while taking into consideration the IPCC framework. The study used 31 indicators (24 socio-economic, 7 natural). PCA identified seven principal components [Demographic Vulnerability (PC1), Economic Vulnerability (PC2), Agricultural Vulnerability (PC3), Water Vulnerability (PC4), Health Vulnerability (PC5), Climate Vulnerability (PC6), and Infrastructural Vulnerability (PC7)] that take into consideration climate change scenarios from 2013 to 2050 (Uddin et al., 2019). PCA has also been used to build a composite drought vulnerability index (Balaganesh et al., 2020).
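How PCA condenses several indicators into a single component can be sketched with a covariance matrix and power iteration. The indicator values below are invented, the function name `first_principal_component` is an assumption, and the sketch extracts only the first component of unstandardized data, so it is far simpler than the vulnerability assessments cited above.

```python
def first_principal_component(rows, iterations=200):
    """Return the loadings of the first principal component and the share
    of total variance it explains, using power iteration on the covariance matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]  # renormalize to unit length
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m)) for i in range(m))
    explained = eigenvalue / sum(cov[i][i] for i in range(m))
    return v, explained

# Hypothetical indicators for five districts:
# (poverty score, soil salinity score, cyclone exposure score)
districts = [
    (1.0, 2.0, 1.0),
    (2.0, 4.0, 1.5),
    (3.0, 6.0, 2.5),
    (4.0, 8.0, 3.0),
    (5.0, 10.0, 4.5),
]
loadings, explained = first_principal_component(districts)
print(loadings, explained)
```

Since the three invented indicators rise together, one component captures almost all of the variance; in practice indicators are usually standardized first and several components are retained.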

Overview of Big Data-Based Climate Change Analysis
The significance of Big Data in climate-related studies is widely recognized and its techniques are widely used to observe and monitor changes on a global scale. It facilitates understanding and forecasting to support adaptive decision-making as well as the optimization of models and structures (Hassani et al., 2019).
Review articles can provide a well-organized picture of previous studies, so the major focus areas were determined from previous review articles concerning the connection between climate change and Big Data. The major objective is to reveal how diverse disciplines appear in the related research, and thereby to narrow down when and how Big Data applications and their relation to data science have appeared in climate studies.
A comprehensive overview was conducted based on the Scopus database. Fifty-seven articles were retrieved from the following search: [TITLE-ABS-KEY("climate change") AND TITLE-ABS-KEY("Big Data")] AND [TITLE-ABS-KEY("overview") OR TITLE-ABS-KEY("review")].
Articles were reviewed and selected individually for the final sample. Table 2 shows the number of articles selected and excluded.
The 47 articles of the final sample are shown in Tables 3-5, where a straightforward description and the focus area of each study are indicated and categorized accordingly. It is notable that mostly specific climate issues are observed (e.g., decarbonization of energy or land ecosystems) and their potential with regard to Big Data is determined. The two most affected categories are agriculture and studies of sustainable cities and communities. This is a good illustration of how intertwined research on climate action is with the Sustainable Development Goals. The quality and safety of agricultural products can be assured through solutions provided by the Internet of Things (IoT) and cloud computing (Marcu et al., 2019). Remote sensing and Artificial Intelligence technologies enable Big Data to be integrated into predictive and prescriptive management tools, e.g., to improve the resilience of agricultural systems (Jung et al., 2020). Big Data virtualization in the field of agriculture enables physical objects to be virtualized, e.g., sensors and devices used for measuring soil moisture, water flows, or salinity; these objects can provide diverse meaningful information in each phase of a data chain to support decision-making and information handling (Mathivanan and Jayagopal, 2019). Furthermore, Big Data techniques are utilized in plant breeding (Taranto et al., 2018), in crop ideotypes for food security (Christensen et al., 2018), and in precision agriculture frameworks (Demestichas et al., 2020). The Climate Smart Agriculture framework aims to enhance the capacity of agricultural systems to support food security by incorporating adaptation and mitigation into sustainable agricultural development through the latest technologies such as IoT, AI, geo-informatics, and Big Data analytics (Gulzar et al., 2020). The interdisciplinary and systematic approach to soil use and management for achieving related sustainability goals has also been explored (Hou et al., 2020).
Reviews have also explored how the focus area of sustainable cities and communities aligns with the 11th Sustainable Development Goal (Sustainable cities and communities). Big Data management can enhance the opportunity for organizations to respond to the risk of climate change in time (Seles et al., 2018) and offers possibilities for considering sustainable production and lower emission rates. Furthermore, machine learning can be effectively utilized for low-carbon urban planning (Milojevic-Dupont et al., 2020). Outside the field of industry, co-operation, legislation, and environmental agreements are essential to realize a sustainable manufacturing environment (Hämäläinen and Inkinen, 2019). The concept of smart cities seeks to prevent and overcome issues concerning climate change and urbanization (Sharifi, 2019); moreover, smart transportation policies can utilize the advantages of Big Data (De Gennaro et al., 2016). In this smart environment, civil engineers are seen as future risk and uncertainty managers who improve community resilience through smart infrastructure programs (Berglund et al., 2020).
Climate resilience studies assess how to prepare for, recover from, and adapt to climate-related risks (Center for Climate and Energy Solutions, 2019). Big Data seeks to support these activities by providing data of large volume, variety, and quality to reveal patterns, and by enabling data democratization (Faghmous et al., 2014). Therefore, the Big Data approach can serve as a source of key information for decision-makers in terms of creating and adapting appropriate strategies, determining current and upcoming issues, as well as identifying stages of recovery for taking action in time (Sarker et al., 2020). News media can serve as a source of near-real-time geolocated information, which can support the understanding of social movements and early-warning systems. "Combining news media with social and biophysical data is important to verify results and limit biases in analysis" (Buckingham et al., 2020). One of the issues concerning urban environments is energy efficiency and carbon emissions, for which net zero energy movements, as well as the application of an ecological resilience framework to net zero energy research, seek to bring about a solution (Hu and Pavao-Zuckerman, 2019). Furthermore, machine learning-based Big Data techniques make it possible to determine people's attitudes toward, and recognition of, environmental changes (Park et al., 2020). Big Data and machine learning approaches are vital in comprehensively merging heterogeneous genomic and ecological datasets (Cortés et al., 2020).
Although review articles have explored the potential for utilizing Big Data techniques in diverse areas, comprehensive overviews of climate change are becoming less of a focus. Even though data-intensive research applications may seem to be unbalanced among disciplines (Hassani et al., 2019), the dynamism and complexity of climate issues must not be neglected. This complexity calls for an interdisciplinary approach and the intertwining of diverse disciplines, to which the System of Systems concept (climate computing) is an urgently needed answer.

Meta-Analysis With Regard to the Methods of Climate-Related Analyses
Co-word analysis examines the relationships between keywords to reveal the structure and development of methodologies or applications. The relationships between keywords in research papers "contains valuable information about knowledge structure of the field, its relevant concepts, and their connections" (Lozano et al., 2019). It is our aim to determine the diverse focus areas, methodologies, and techniques of Big Data-driven climate change analyses and to harmonize these to allow better utilization of the field-specific results achieved.
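The counting step at the heart of co-word analysis can be sketched in a few lines of Python. This is only an illustration of keyword co-occurrence counting, not a reproduction of the VOSviewer workflow; the function name `cooccurrence` and the keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(keyword_lists):
    """Count how many papers mention each unordered pair of keywords."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize case and whitespace so spelling variants are merged.
        unique = sorted({k.strip().lower() for k in keywords})
        counts.update(combinations(unique, 2))
    return counts

# Hypothetical author keywords of three papers.
papers = [
    ["Climate change", "Big Data", "machine learning"],
    ["big data", "climate change", "IoT"],
    ["climate change", "Big Data", "agriculture", "IoT"],
]
counts = cooccurrence(papers)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs with high counts become strong links in a co-occurrence network, and clustering that network yields keyword groups of the kind discussed in this section.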
The Scopus database was used to identify the corresponding papers using the following search: [TITLE-ABS-KEY("climate change") AND TITLE-ABS-KEY("Big Data")]. As a result, 442 articles were retrieved and the co-occurrence of their keywords was analyzed using VOSviewer.

Soil
The article provides a comprehensive overview of soil in connection with sustainability issues (several SDGs).
The overview highlights that interdisciplinary studies which incorporate such advances may lead to innovative sustainable soil use and management strategies that seek to optimize soil health and achieve the SDGs.

Hou et al., 2020
Land ecosystem The article analyses the developmental characteristics and trends of research into global land ecosystem services using the Bibliometrix software package.
The overview highlights the diverse facets of land ecosystem services and the practical application of land ecosystem services.

Xie et al., 2020
Virtualization, soil, water, crops, plants The article provides a comprehensive review of Big Data virtualization in the agricultural domain.
The overview highlights the potential of the information held by virtual objects, whose large volume of data supports data analysis and application services such as decision-making, problem notification, and information handling.

Mathivanan and Jayagopal, 2019
Crop production, food security The article examines modeling strategies for the development of crop ideotypes and scientific visualization technologies that have led to discoveries in "Big Data" analysis.
The overview highlights that integrative modeling and advanced scientific visualization may help overcome challenges in agricultural and nutritional data as large-scale and multidimensional data become available in these fields.

Christensen et al., 2018
Soil The article explores trends in the development of pedotransfer around the world and considers trends between data and methods to build pedotransfer relationships.
The overview highlights that the physics-based interpretation of pedotransfer functions (PTFs) is expected to be in demand.

Pachepsky et al., 2015
Plants, biotechnology The article describes technologies concerning plant breeding and provides examples of their application to breed climate-resilient cultivars.
The overview highlights that technological improvements in phenotypic and genotypic analysis, as well as the biotechnological and digital revolution, will reduce the breeding cycle in a cost-effective manner.

Taranto et al., 2018
IoT, cloud technology, Smart farming The article explores the potential of IoT technology in the agricultural sector: plants are sensitive to changes, and in the context of climate change and monitoring, IoT can bring about dramatic progress.
The overview can be used as a basic tool for choosing an IoT platform solution for future telemonitoring systems.

Marcu et al., 2019
Smart farming, crops The article presents a review of some areas involved in the definition of an alert system for diseases and pests in terms of Smart Farming, based on machine learning and graph similarity.
The article proposes an architecture for coffee disease and pest detection.

Lasso and Corrales, 2017
Food safety The article presents a review of the likely consequences of climate change for foodborne pathogens and associated human illnesses in higher-income countries.
The overview highlights that climate change may have important effects on foodborne illnesses.

Lake and Barker, 2018
Agricultural systems, AI, remote sensing This article focuses on the use of recent technological advances in remote sensing and AI to improve the resilience of agricultural systems.
The review presents a unique opportunity for the development of prescriptive tools needed to address the next decade's agricultural and human nutrition challenges.

Jung et al., 2020
Smart farming The article conducts a literature review of prominent ICT solutions, focusing on their role in supporting different phases of the lifecycle of precision agriculture-related data.
The article also introduces a data lifecycle model developed as part of a novel categorization approach for the analyzed solutions.

Demestichas et al., 2020
Food safety The article discusses some of the forefront issues in food value chains with a focus on using technology.
The article highlights that the cultural awareness and social innovation needed to prevent food waste, and thereby improve food security and sustainability, will introduce further complexities.

Chapman et al., 2020
Smart agriculture This article presents an analytical review of smart agriculture (SA) and climate smart agriculture (CSA) along with a thorough CSA architectural taxonomy.
The article surveys CSA and devises its architectural taxonomy in terms of the technological components of SA as well as climate change mitigation to ensure food security, environmental sustainability, and lower CO2 emissions.

Cleaner production
The article provides an overview of the scope and trends in venture capital-funded innovation in Cleantech.
The overview explores trends in venture capital-funded innovation in Cleantech, the broad scope of the basic science and technology, and the impacts of Cleantech that affect global climate change.

Huang, 2015
Energy The article provides a comprehensive review that assesses the current as well as the potential impact of digital technologies within cyber-physical systems (CPS) on the decarbonization of energy systems.
The overview highlights advances in CPS and Artificial Intelligence (AI) with regard to real-world adaptation in energy systems.

Focus area Description Usage References
Satellites, remote sensing The article explores the potential of Big Data with regard to implementing a proper strategy against the effects of climate change as well as enhancing the resilience of people in the light of the adverse effects of climate change.
The overview enables policymakers and related stakeholders to implement appropriate adaptation strategies for enhancing the resilience of the people from the affected areas.

Sarker et al., 2020
Machine learning The article explores the attitude of people toward climate change issues based on news analysis.
The article highlights the potential of this method for monitoring, recognition, and opinion detection.
The overview suggests that the management of data resources should be strengthened and that the construction of a global change Earth observation data-sharing platform should be accelerated to realize the effective sharing of data resources.

Guo et al., 2015
Energy, climate resilience The article provides an initial step in terms of understanding the research activities of the past five decades in these two areas (NZE and resilience) as well as their connection to their ecological roots.
The overview highlights the major difference between the net zero movement and resilience theory in terms of the urban environment and their respective relations to their ecological origins.

Hu and Pavao-Zuckerman, 2019
Water The article explores some important impacts on the development of hydrology and water resources in Australia.
The overview highlights that the value and distribution of water resources will change.

Fitzharris, 2016
Forestry The article discusses predictive genomic approaches that promise to increase the accuracy of adaptive selection and shorten generation intervals.
The article discusses how trees' phylogeographic history may affect the adaptively relevant genetic variation available for adaptation to environmental change. It encourages "Big Data" approaches (machine learning, ML) capable of comprehensively merging heterogeneous genomic and ecological datasets.

Cortés et al., 2020

The papers were written between 2012 and 2020. In Figure 3, seven clusters are indicated by different colors and span topics related to climate change and the application methods of Big Data. Each cluster refers to a focus area, including the interrelationships of its attributes as well as the methodologies and techniques applied in the field.
The "Red" cluster denotes the connections between Big Data technologies and the methods applied to optimize procedures, measure the impact of climate change and resilience, and make predictions. The technologies considered include, e.g., artificial intelligence, learning algorithms such as machine learning and deep learning, data analytics, neural networks, and cluster computing. Neural networks are used to analyse climate change, weather prediction, and visualization (Buszta and Mazurkiewicz, 2015), while machine learning techniques are used for intelligent recognition (Demertzis and Iliadis, 2016) and to define the impact of climate change and resilience (Rolnick et al., 2019). In addition, they are used to predict epidemics and diseases in both social (Rees et al., 2019) and environmental contexts, e.g., in the case of crops (Fenu and Malloci, 2019), coffee diseases and pests (Lasso and Corrales, 2017), or pedotransfer functions (Benke et al., 2020). Clustering techniques on cloud computing infrastructure have been applied, e.g., to map changes in glaciers (Ayma et al., 2019). A novel machine learning approach has been developed by the U.S. Department of Energy's National Renewable Energy Laboratory using adversarial training in climate forecasting; the model is a physics-informed variation of the super resolution generative adversarial network (SRGAN), which extends the proven performance of that network on the super resolution of natural images to scientific datasets (Stengel et al., 2019).

The overview highlights that technological improvements in phenotypic and genotypic analyses, as well as the biotechnological and digital revolution, will reduce the breeding cycle in a cost-effective manner.

Taranto et al., 2018
IoT, cloud technology, Smart farming The article explores the potential of IoT technology in the agricultural sector: plants are sensitive to changes, and in terms of climate change and monitoring, IoT can bring about dramatic progress.
The overview can be used as a basic tool for choosing an IoT platform solution for future telemonitoring systems.

Marcu et al., 2019
Smart farming, Crops The article presents a review of some areas involved in the definition of an alert system for diseases and pests in terms of Smart Farming, based on machine learning and graph similarity.
The article proposes an architecture for coffee disease and pest detection.

Lasso and Corrales, 2017
Water, IoT The article provides a review of the application of the Internet of Things in the field of marine environment monitoring.
The overview highlights that Big Data analytics can be used not only as a source of feedback for marine environmental agencies and control centers but also for autonomous vessels and remotely deployed devices in order to take real-time actions.

Xu et al., 2019
Agricultural systems, AI, remote sensing This article focuses on the use of recent technological advances in remote sensing and AI to improve the resilience of agricultural systems.
The review presents a unique opportunity for the development of prescriptive tools needed to address the next decade's agricultural and human nutrition challenges.

Jung et al., 2020
Remote sensing, urban development, ML The article shows that the emergence of Big Data and machine learning methods enables climate solution research to overcome generic recommendations and provide policy solutions at urban, street, building and household scale, adapted to specific contexts, but scalable to global mitigation potentials.
The article suggests a meta-algorithmic architecture and framework for using machine learning to optimize urban planning for accelerating, improving and transforming urban infrastructure provision.

Climate models
The article provides a critical overview and synthesis of issues related to climate models, data sets, and impact assessment methods pertaining to islands which can benefit decision-makers and other end users of climate data in island communities.
The overview explores challenges of islandness in terms of top-down, model-led climate impact assessment and bottom-up, vulnerability-led approaches.

Foley, 2018
Risk management, water, energy, food safety The article examines the challenge facing risk assessment posed by the transmission of climate risk.
The overview aims to support future national risk assessments, ensuring that they adequately account for the transmission mechanisms of climate risk.

Challinor et al., 2018
Water The article explores some important impacts on the development of hydrology and water resources in Australia.
The overview highlights that the value and distribution of water resources will change.

Fitzharris, 2016
Food safety The article presents a review of the likely impacts of climate change for foodborne pathogens and associated human illnesses in higher-income countries.
The overview highlights that climate change may have important effects on foodborne illnesses.

Lake and Barker, 2018

Frontiers in Environmental Science | www.frontiersin.org

The article highlights that combining news media data, such as GDELT, with other social and biophysical data sources is an important method for verifying results and limiting biases in data collection and analysis.

Focus area Description Usage References
Machine learning, crowdsourcing, data fusion, cluster analysis The article provides an overview of techniques and approaches with regard to climate studies.
The overview gives a brief account of a few strategies for supporting Big Data administration and investigation in the domain of geoscience for climate studies.

Radhika et al., 2016
Water The article presents the advances in machine learning and deep learning through novel classification methods.
The overview outlines present state-of-the-art machine-learning and deep-learning methods used to model and identify application areas.

Ardabili et al., 2019
Water, weather, air quality, Hazard management The article provides a review of crowdsourcing-related papers in seven domains: weather, precipitation, air pollution, geography, ecology, surface water and natural hazard management.
The overview outlines knowledge development in terms of crowdsourcing within the specific domain of geophysics as well as similarities and differences.

Zheng et al., 2018
Plants The article reviews phenology models as an important component of earth system modeling.
The overview highlights that the mechanistic development of phenological observation is essential.

Tang et al., 2016
Climate models The article explores space-time analytics dealing with spatial processes, examples of space-time concepts and tools to analyse data.
The overview suggests movement-based space-time analytics that address processes across multiple levels, with boundary conditions and initial conditions constraining the processes at the focal level.

Yuan and Bothwell, 2013
Remote sensing, urban development, ML The article shows that the emergence of Big Data and machine learning methods enables climate solution research to overcome generic recommendations and provide policy solutions at urban, street, building and household scale, adapted to specific contexts, but scalable to global mitigation potentials.
The article suggests a meta-algorithmic architecture and framework for using machine learning to optimize urban planning for accelerating, improving and transforming urban infrastructure provision.

Milojevic-Dupont et al., 2020
Land ecosystem The article provides an overview of the Integrated Climate Sensitive Restoration Framework, which recognizes local participation in mapping degraded lands and in identifying species to support species modeling, in order to better understand climate uncertainty.
The article highlights that the framework potentially helps in sustainable land restoration through transformative changes for achieving the UN Decade on Ecosystem Restoration (2021-2030) and SDG 15, and for addressing the post-2020 Global Biodiversity Framework.

Dhyani et al., 2020

The physics-informed SRGAN approach (Stengel et al., 2019) can save computational time and data storage; moreover, it can provide more accessible high-resolution climate data that can be utilized in a wide range of climate scenarios. These techniques seek to support risk management in terms of human and environmental health by providing vital information concerning present conditions and making predictions about the future.
The keywords included in the "orange" cluster mainly describe agriculture-related climate issues and adaptations. IoT technologies, information systems, and sensor networks tend to be applied in the field. Big Data increases the heterogeneity "across farms, farmers, climates, crops, soils, natural resources, models, management strategies and outcomes, post production value chain system, and other economic variables of interest", which can boost knowledge with regard to the concept of climate-smart agriculture (Rao, 2018). IoT technologies have proven to be beneficial in improving efficiency in the complex field of agriculture. Sensors are used to collect vital information about the soil, fertilizer, moisture, sunshine, temperature, and geographic information of farmland for monitoring, as well as to link to other databases for identifying attributes (Yan-e, 2011). The combination of automation and IoT technologies opens up broad perspectives in smart agriculture, such as remote-controlled robots that perform tasks, smart and intelligent decision-making based on real-time data, and warehouse management (Gondchawar and Kawitkar, 2016).
IoT, visualization, water, air, energy, crowdsourcing The article explores the role of civil engineers, as managers of risk and uncertainty, with regard to conventional and smart infrastructure programmes, and considers climate change mitigation.
The overview incites inventive thinking to develop research agendas and creatively integrate new technologies across infrastructures.

Berglund et al., 2020
Smart city The article provides a critical analysis of 34 selected smart city assessment tools to highlight their strengths and weaknesses as well as examine their potential contribution to the evolution of the smart city movement.
The study can be used by interested target groups such as smart city developers, planners, and policy makers to choose tools that best fit their needs.

Sharifi, 2019
Emissions tracking The article illustrates that Big Data is utilized in various industries, and explores a large variety of pollutants.
The overview addresses the need for using and combining data resources, particularly at the industrial level, in order to develop more efficient tools for environmental monitoring and decision-making.

Hämäläinen and Inkinen, 2019
Energy The article builds complex uncertainty models of power demand and the cost of renewable energy generation, as well as proposes an improved IRSP model based on complex uncertainty simulation.
The overview highlights the necessity of looking at the development of electricity from the perspective of energy; moreover, additional primary energy limitations will be introduced into the model in the future.

Zheng et al., 2019
Remote sensing, weather, climate model, air quality, machine learning The article reviews the current state of urban data science in the context of climate change and investigates the contribution of urban metabolism studies, remote sensing, Big Data approaches, urban economics, and urban climate and weather studies.
The overview highlights that data-based approaches have the potential to upscale urban climate solutions and bring about change on a global scale.

Creutzig et al., 2019
Air quality, energy The article develops a framework for reducing dust emissions and energy consumption on construction sites.
The article highlights that the proposed framework can be used on construction sites to conduct real-time monitoring, evaluation and the minimization of dust emissions and energy consumption.

Hong et al., 2019
Air quality, energy The article explores the application of Big Data in road transport policies in Europe, namely to minimize the environmental impact, handle climate change mitigation and sustainability challenges, and maximize system efficiency.
TEMA was designed to support EU transport policies via Big Data.

De Gennaro et al., 2016
Risk management The article analyses the challenges and opportunities that the climate crisis presents for organizations and how organizations respond to this scenario, while examining the implications of Big Data management.
The overview highlights that Big Data is a key component in understanding the opportunities and challenges of the climate crisis and organizational responses.

Seles et al., 2018
Energy, climate resilience The article provides an initial step in understanding the research activities over the past five decades in these two areas (NZE and resilience) and their connections to their ecological roots.
The overview highlights the major difference between the net zero movement and resilience theory in the urban environment and their respective relations to their ecological origins.

Hu and Pavao-Zuckerman, 2019
Remote sensing, Urban development, ML The article shows that the emergence of Big Data and machine learning methods enables climate solution research to overcome generic recommendations and provide policy solutions at urban, street, building and household scale, adapted to specific contexts, but scalable to global mitigation potentials.
The article suggests a meta-algorithmic architecture and framework for using machine learning to optimize urban planning for accelerating, improving and transforming urban infrastructure provision.

Milojevic-Dupont et al., 2020
Water
Focus area | Description | Usage | References
Water | The article provides a systematic review of the literature on ecological models and eutrophication. | The overview aims to improve the level of application of ecological models in the field of water eutrophication and to better serve environmental water science research. |
Water | The article explores and compares global wetland-related datasets and suggests a synthetic method for wetland mapping. | The overview suggests that this synthetic method of wetland mapping should be applied. | Hu et al., 2017
Water | The article explores the development of watershed management, potential uses of new technologies, current issues, as well as the future direction of watershed management and research. | The overview highlights the importance of employing integrated watershed management strategies and outlines methods for improving management strategies. | Wang et al., 2016
Water | The article explores some important impacts on the development of hydrology and water resources in Australia. | The overview highlights that the value and distribution of water resources will change. | Fitzharris, 2016
Water, IoT | The article provides a review of the application of the Internet of Things in the field of marine environment monitoring. | The overview highlights that Big Data analytics can be used not only as a source of feedback for marine environmental management agencies and control centers but also for autonomous vessels and remotely developed devices to take real-time actions. | Xu et al., 2019
Water | The article reviews the evolution of the Managed Aquifer Recharge (MAR) concept and then captures current research in terms of MAR technologies, the process of MAR implementation, applications of MAR, as well as common problems and challenges associated with MAR. | The article recommends that further studies on MAR should focus on systematic clogging mechanisms and prevention, the theory of seepage calculation, the theory of infiltration for MAR, purification mechanisms, and the application of Big Data and artificial intelligence in MAR. | Zhang et al., 2020
Water | The article uses the information visualization technology of CiteSpace to present a systematic review of published literature on the application of eco-models to eutrophication from 1968 to 2018. | The article highlights that eco-models range from dimension-models to time-dependent dynamic models and that the recent trend of close coupling between modeling and the acquisition of new types of experimental data (i.e., remote sensing, high-frequency field sensors) provides a higher prediction ability of ecological models. | Hu et al., 2019
Biodiversity
Focus area | Description | Usage | References
Biodiversity | The article reviews the current state of lichen conservation in Canada and the United States. | The review highlights the effective usage of Big Data in informing and monitoring species. | Allen et al., 2019

The "purple" cluster represents natural disasters caused by climate change, e.g., floods or deteriorating air quality, and the related risk management. Decision-making processes are supported by data mining techniques and by statistical as well as spatial analysis. The frequency of natural disasters in the Philippines increased by 147% from 1980 to 2012 and continues to rise (Garcia and Hernandez, 2017). Through data mining, Big Data plays a significant role in creating real-time feedback loops on natural disasters to support disaster management in prevention, protection, and mitigation processes as well as in response and recovery, and in increasing the resilience of citizens (Yang et al., 2017).
The "light blue" cluster covers climate models that define the interactions of the drivers of climate change. Topics such as ecology, biodiversity, vulnerability, and the issue of water resources are included. Big Data-based techniques are widely used, and the importance of open data must be recognized. Cloud computing and uncertainty analysis tend to support the modeling of life cycles and climatic effects. The open data science approach ensures a transparent and collaborative environment for multi-model climate change data analytics (Fiore et al., 2018). Information about the geographic distribution of greenhouse gas emissions can be useful in terms of high-resolution modeling (Charkovska et al., 2019).
The "green" cluster defines topics with regard to sustainable development, dealing with gas emissions, greenhouse gases, energy efficiency, and environmental policies. Information analytics and environmental technologies as well as green computing seek to minimize hazardous waste while maximizing energy efficiency and recyclability to foster the concept of a circular economy. Data mining, genetic algorithms, and neural networks are increasingly applied in sustainable consumption research, enabling more accurate and better visualized results. Managing efficient energy use is a commonly discussed issue that covers the climate change impact analysis of the energy use of campus buildings (Fathi and Srinivasan, 2019), the life-cycle assessment of energy-consuming products (Ross and Cheah, 2019), and the adaptation of green computing to reduce the carbon footprint of ICT (Airehrour et al., 2019).
The "blue" cluster reveals methodologies considered in climatology, urbanization, and adaptive management. Remote sensing and satellite imagery make it possible to collect a large amount of data that supports mapping and is used to make further predictions. Satellite remote sensing quantifies processes and spatio-temporal states of the atmosphere, land, and oceans; moreover, it enables, for example, climate change and the impact of human activities on cropland productivity to be detected (Yan et al., 2020) and changes in water resources to be mapped (Senay et al., 2017). The monitoring of carbon by satellite observation provides information about greenhouse gases and emissions that can be utilized in estimation processes regarding the investigation of CO2 (Zhao et al., 2019).
The "yellow" cluster consists of global climate change-related data analyses, visualization methods, regression analysis, and time series analysis. Open systems and open sources are gaining ever more attention in this field. A web-based visualization of complex climate data can enable scientists, resource managers, policymakers, and the public to explore climate-balance projections even at the local level (Alder and Hostetler, 2015). Assessing spatiotemporal data to gain knowledge from it is a complex challenge; however, a well-developed visual analytical system can support performance improvement methods and techniques (Li et al., 2013). A high-performance query analytical framework that proposes grid transformation can support complex climate data observation and model simulation (L et al., 2017). For climate environmental analyses, 3D visualization simulation of cloud data is gaining attention in the fields of computer graphics and meteorology.
The application of contemporary technologies such as Big Data analytics and IoT-based models seeks to build a knowledge base in any field by collecting and analysing large, complex, heterogeneous data sets. This encourages evidence-based policy making and serves as a decision support tool for risk assessment and resilience adaptation, while aiding the forecasting of future socio-economic and environmental conditions caused by climate-related change. Big Data studies are important in themselves and contribute to the understanding of climate change, but managing their results in an integrated way increases the level of problem extraction and provides new solutions for decision makers.

The Role of Social Sciences in Climate Change Studies
Most articles on climate change belong to the field of environmental science, closely followed by Earth and planetary sciences, then agricultural and biological sciences. Interestingly, the social sciences rank ahead of the fields of engineering and energy by number of published articles.
The growing amount of information and knowledge makes multidisciplinary analyses covering the whole field of science, and the development of the corresponding analytical tools, indispensable, as the accumulated knowledge cannot be directly utilized without systematization and targeted processing.
Climate change issues tend to connect different disciplines as well as research ideas, models, and solutions related to these issues. In the following, the significant connection between climate and social sciences is discussed. The Scopus database was used to extract relevant information for meta-analysis.
The search for a connection with social sciences yielded 1,203 documents: [TITLE-ABS-KEY("climate change") AND TITLE-ABS-KEY("social sciences")]. The network of co-occurring keywords referring to the interrelationship between climate change and social sciences is shown in Figure 4.
Based on the intersections presented in Figure 4, seven communities are detected. The red community includes emissions, energy and economic hubs. The yellow community includes habitat-related nodes. The light blue community covers regulators and issues concerning water management, while the purple community summarizes concepts related to "change," e.g., vulnerability, adaptation, etc. The green community includes interdisciplinary subject areas, while the dark blue one represents political keywords and the orange community describes sustainable mergers.
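The keyword co-occurrence network behind such a figure can be illustrated with a minimal sketch. The paper does not state which tool produced Figure 4, so the function name `cooccurrence_counts` and the sample records below are hypothetical and only show how edge weights of a co-occurrence network could be counted; community detection itself is not included.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears in the same record."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise and deduplicate so a pair is counted once per record.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical records: illustrative keyword lists, NOT data from the Scopus search.
records = [
    ["climate change", "adaptation", "vulnerability"],
    ["climate change", "water management", "adaptation"],
    ["Climate Change", "emissions", "energy"],
]

edges = cooccurrence_counts(records)
for (first, second), weight in edges.most_common(3):
    print(f"{first} -- {second}: {weight}")
```

Each counted pair corresponds to a weighted edge between two keyword nodes; a community detection algorithm would then be run on the resulting weighted graph.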
A complex relationship exists between human and natural processes involving social, political, geographic, and cultural contexts that demands a multidisciplinary concept (Fiske et al., 2018). Environmental changes call for socio-economic transformation to mitigate the effects caused by humans and increase resilience. Changes are observed in a diverse range of areas such as agriculture and food security, air quality, waters, energy consumption, land ecosystems, and global warming. These issues must be managed through strategic planning and management with a high degree of focus on long-term sustainable operation. Socio-ecological-economic models must integrate social and biophysical information in order to develop sufficient mitigation and adaptation strategies (Sullivan and Huntingford, 2009). The impact of climate change on water resources is critical as it is related to floods, droughts, tidal waves, and humidity. Big Data-based processes are used, for example, to determine soil conditions and humidity (Anton et al., 2019) and to estimate energy consumption (Seyedzadeh et al., 2018) or greenhouse gas emissions (Hamrani et al., 2020), which enables optimal processes and interventions to be predicted. Decision support algorithms, models, and databases are used to provide an evidence base for policymaking and legislation (Aragona and De Rosa, 2019) as well as disaster management (Akter and Wamba, 2019). These can be considered at organizational (Kouloukoui et al., 2019), local (Giest, 2017), sub-national (Hsu et al., 2019), national (Iacobuta et al., 2018), or even global levels (Flato et al., 2014).
Socio-environmental sciences seek to explore the systematic cause-effect relationships behind the environmental impact of human-induced climate change. By providing heterogeneous data and supportive models, positive changes can be achieved through interdisciplinary data-driven perceptions that contribute to a better understanding of the complex issue, monitor changes, support decision-making, and bring about timely interventions.

The Importance of the System of Systems Approach
Climate change is one of the most significant global challenges that need to be managed. To resolve any of the climate change-related challenges, "it is essential to elicit and integrate knowledge across a range of systems, informing the design of solutions that take into account the complex and uncertain nature of the individual systems and their interrelationships" (Little et al., 2019). The system of systems (SoS) framework enables the interdependencies between various systems (e.g., human, information, environmental, and physical systems) to be analysed and therefore provides a clear understanding of the complex nature of the issue (Fan and Mostafavi, 2019). The trends in data science and information technology (Tannahill and Jamshidi, 2014) support the integration of various disciplines and research outcomes to represent a socio-environmental system holistically and inform policy and decision-making processes (Iwanaga et al., 2020), which can be referred to as climate computing.
To highlight the importance of the application of the system of systems approach, the latest Big Data-based works in the field of climate change were reviewed, based on which we identified an SoS framework (Figure 5). In the network of applications, the nodes represent the individual studies, and the edges represent the relationships between their results. The Big Data applications have been grouped according to sustainable development goals, thus showing their possible scientific contributions to the other fields.
By processing satellite data, the system developed by Semlali and El Amrani (2021) can monitor changes in air quality, and the same approach can also be used to monitor agricultural areas (Majidi et al., 2021). Cloud tracking (He et al., 2020) further helps to assess the evolution of air pollution, the reliability of which can be further enhanced with statistical downscaling solutions (Wang Q. et al., 2020). The time-series data (Joshi et al., 2019) extracted from satellite images support long-term forecasts, but the description of cloud motion can also be used to refine shorter-term analyses. The use of satellite imagery as a data source in urban planning also helps identify climate-friendly solutions (Milojevic-Dupont et al., 2020). Web-based water management (Mourtzios et al., 2021) can be supported with trends identified from time-series data (Ise et al., 2020), but remotely sensed water flow data also complements the agricultural water management model (Ismail et al., 2020). If the resolution of the data is increased (Jimenez et al., 2019), the causal relationships related to consumption can also be understood. In terms of infrastructure load, patterns of population movement (Gurram et al., 2019) offer exciting opportunities and can also be integrated with the condition of buildings (Gouveia and Palma, 2019), which also supports urban planning tasks (Milojevic-Dupont et al., 2020) at a higher level.
Agricultural satellite imagery applications (Majidi et al., 2021) can be transferred to air quality satellite monitoring (Semlali and El Amrani, 2021), or time-series data (Ise et al., 2020) can be used to plan better agricultural interventions. By implication, satellite-based support plays an important role in modeling agricultural water management (Ismail et al., 2020), but disaster news (Park et al., 2020) also helps provide a deeper understanding of social involvement. In assessing disaster resilience in different areas (Sasaki et al., 2020), satellite imagery provides feedback on risks that can even be revealed over time (Joshi et al., 2019). Satellite-based results can be supported by on-site special (Lambrinos, 2019) and meteorological (Mabrouki et al., 2021) sensor data, and flood protection of valuable agricultural areas can also be planned with flood models (Avand et al., 2021).
Identifying patterns in time-series data (Ise et al., 2020) helps with research in many other areas, whether it is agricultural water management (Ismail et al., 2020) or marine habitat protection (Coro et al., 2020). It allows forecasting and a better understanding of coastal traffic (Kubo et al., 2020) and increases the reliability of disaster resilience estimation (Sasaki et al., 2020). By extracting time series data (Joshi et al., 2019) from satellite imagery, we can indirectly validate the models by comparing the time series or identify the factors of potato disease (Fenu and Malloci, 2019). In urban developments (Milojevic-Dupont et al., 2020) and in building condition surveys (Gouveia and Palma, 2019), the forecast shows the development of infrastructure expansion and maintenance, to which the probability of flood protection problems (Avand et al., 2021) can also be linked.
Statistical downscaling (Wang Q. et al., 2020) helps to find the external variables of the consumption patterns identified from remote sensing (Mourtzios et al., 2021) and is comparable with the results of satellite image-based analyses (Semlali and El Amrani, 2021) and with other approaches (Jimenez et al., 2019), which strengthens confidence in the models (Qin and Chi, 2020). Better resolution data supports marine habitat protection planning (Coro et al., 2020) and risk assessment input (Fenu and Malloci, 2019), but can also be used to analyze building consumption data (Gouveia and Palma, 2019). The efficiency of downscaling techniques can be increased with the Internet of Things toolbox (Lambrinos, 2019). The increase in the number of observations allows a more accurate description of local climatic conditions to estimate floods (Avand et al., 2021) and heat island effects, as well as other aspects of sustainable urban planning (Milojevic-Dupont et al., 2020).
Coastal tourism monitoring (Kubo et al., 2020) can be integrated with traffic data (Hu et al., 2020) to optimize traffic management and thereby reduce pollutant emissions. The effect of transport on plant damage (Meineke et al., 2020) can be included as a factor to be analyzed, or traffic data can be used to identify patterns in population movement (Gurram et al., 2019).
Population movements (Gurram et al., 2019) affect water consumption (Mourtzios et al., 2021), can damage plants (Meineke et al., 2020), show the popularity of coastal areas (Kubo et al., 2020), but are also suitable for improving transport planning (Hu et al., 2020). Because the movement of residents is closely related to the infrastructure (Milojevic-Dupont et al., 2020), it is a very valuable input in urban planning.
The data of Internet of Things sensors (Mabrouki et al., 2021) allow the conclusions drawn from satellite images to be verified (Majidi et al., 2021). As each measuring station (Jimenez et al., 2019) increases the number of observations, better downscaling solutions (Wang Q. et al., 2020) can be developed. Sensor data can be used for the causal exploration of plant morphological damage (Fenu and Malloci, 2019) and supports agricultural irrigation water demand planning (Ismail et al., 2020), but can also be imported into flood models (Avand et al., 2021).
In the Big Data application that supports the energy demand management of buildings (Gouveia and Palma, 2019), water consumption data (Mourtzios et al., 2021) can be used as an extension. Development alternatives can be ranked based on time series data (Ise et al., 2020) or on time series extracted from satellite images (Joshi et al., 2019). This ranking can be supported by a deeper understanding of energy demand gained from downscaled data (Wang Q. et al., 2020), because the resolution of the input data can be improved (Jimenez et al., 2019).
Based on the presented system of systems framework, it can be seen how the new results of Big Data applications related to climate change contribute to other areas. Remote sensing of water consumption (Mourtzios et al., 2021), analysis of cloud water content (He et al., 2020), and the agricultural water management model (Ismail et al., 2020) contribute to the goal of clean water and sanitation (SDG6). Planning based on the analysis of traffic data (Hu et al., 2020), studying population movements (Gurram et al., 2019), and flooding models (Avand et al., 2021) support the goal of industry, innovation and infrastructure (SDG9). Climate-friendly urban planning (Milojevic-Dupont et al., 2020), monitoring the energy demand of buildings (Gouveia and Palma, 2019), and defining disaster resilience (Sasaki et al., 2020) play an important role in achieving sustainable cities and communities (SDG11). The Climate Action goal (SDG13) tackles most data gaps, so research such as linking satellite images with air quality (Semlali and El Amrani, 2021), preprocessing them (Meraner et al., 2020; Qin and Chi, 2020; Semlali et al., 2020), the analysis of time series data (Ise et al., 2020) and its exploration (Joshi et al., 2019), downscaling techniques (Wang Q. et al., 2020), enrichment of precipitation and temperature data (Jimenez et al., 2019), tracking the movement of clouds, or just using IoT sensors (Mabrouki et al., 2021) are all key in creating a strategy to support the achievement of the climate goal. For the sustainability of life below water (SDG14), marine life prediction models (Coro et al., 2020) and human coastal activity (Kubo et al., 2020) can be integrated.
Of course, the goal of life on land (SDG15) also requires new research, where a satellite-based study of agriculture and forestry (Majidi et al., 2021), deployment of IoT sensors (Lambrinos, 2019), analysis of climatic factors of potato damage (Fenu and Malloci, 2019), studying the morphology of plants (Meineke et al., 2020), or social media-based illustration of palm oil consumption (Teng et al., 2020) are promising. Partnerships for the goals (SDG17) is critical in several ways: on the one hand, we recommend the grouping of climate services (Howard et al., 2020), which fits the SoS concept we propose; on the other hand, knowledge needs to be integrated and fed back to society. An exciting tool for measuring the effectiveness of climate and sustainability related measures is the analysis of news comments (Park et al., 2020).
It is essential to highlight that Big Data research on climate change can be used in other areas, as shown by the SDG grouping in Figure 5. Thus, based on the recommended SoS viewpoint, the specific results of sustainability-related research and development projects can be integrated, enhancing knowledge accumulation and utilization.
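The structure of such an application network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node labels are shorthand for the cited studies, the SDG assignment follows the grouping described above, and the edge list is a hand-picked subset of the links mentioned in the text, not the data behind Figure 5. The function names are hypothetical.

```python
from collections import Counter

# SDG assignment of a few studies, following the grouping described in the text.
sdg_of = {
    "Mourtzios2021": 6, "Ismail2020": 6,
    "Hu2020": 9, "Gurram2019": 9, "Avand2021": 9,
    "Semlali2021": 13, "Ise2020": 13, "Mabrouki2021": 13,
    "Coro2020": 14, "Kubo2020": 14,
    "Majidi2021": 15,
}

# Assumed subset of the relationships described in the text (undirected edges).
links = [
    ("Semlali2021", "Majidi2021"),    # satellite monitoring reused for agriculture
    ("Ise2020", "Mourtzios2021"),     # time-series trends support water management
    ("Ise2020", "Ismail2020"),        # time series and agricultural water management
    ("Ise2020", "Coro2020"),          # time series and marine habitat protection
    ("Gurram2019", "Mourtzios2021"),  # population movement affects water consumption
    ("Gurram2019", "Hu2020"),         # population movement and transport planning
    ("Kubo2020", "Hu2020"),           # coastal tourism and traffic data
    ("Mabrouki2021", "Majidi2021"),   # IoT sensors verify satellite-based conclusions
    ("Mabrouki2021", "Avand2021"),    # IoT sensor data imported into flood models
]


def cross_sdg_links(sdg_of, links):
    """Count edges that connect studies assigned to two different SDGs."""
    counts = Counter()
    for a, b in links:
        pair = tuple(sorted((sdg_of[a], sdg_of[b])))
        if pair[0] != pair[1]:
            counts[pair] += 1
    return counts


def degree(links):
    """Number of relationships each study takes part in."""
    deg = Counter()
    for a, b in links:
        deg[a] += 1
        deg[b] += 1
    return deg


cross = cross_sdg_links(sdg_of, links)
for (sdg_a, sdg_b), n in sorted(cross.items()):
    print(f"SDG{sdg_a} <-> SDG{sdg_b}: {n} link(s)")
print("Most connected study:", degree(links).most_common(1)[0][0])
```

Counting the edges that cross SDG groups is one simple way to make the integrative potential of the SoS viewpoint visible: a study with many cross-group links is a candidate bridge between otherwise separate application areas.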

DISCUSSION
This paper described the essential need for research and development objectives to realize and manage the complex issues of climate change through Big Data tools. Data-driven applications were reviewed through the co-occurrence analysis of keywords, which showed the widespread application of Big Data technologies and tools; however, comprehensive and integrative analyses are less prevalent.
This research aimed to highlight the system of systems (SoS) perspective, as the drivers and effects of climate change, as well as resilience and adaptation to them, cannot be determined without exploring the synergies between new research trends and disciplines. Based on the recommended SoS viewpoint, the specific results of sustainability-related research and development projects can be integrated, enhancing knowledge accumulation and utilization. The tools of data and systems sciences can play a crucial role in the recognition of climate challenges and mitigation opportunities thanks to the integration of heterogeneous data and models and the exploration of the relationship between environmental and social factors. This integrated thinking lays the groundwork for promising future trends in climate computing.
It can be claimed that the exclusive analysis of climatic factors cannot bring about sufficient strategic adaptation by itself; rather, socio-environmental factors must be integrated into the climate change models.
Mitigating the impacts of climate change and successful adaptation require effective strategic climate change planning by countries worldwide, whose decision-making requires complex models and sources of information. The Big Data toolkit enables the systematization, processing, and evaluation of heterogeneous data and information sources, which is unfeasible with traditional disciplinary analysis tools. The harmonization of the ever-expanding scientific knowledge and diversified data sources related to climate change may be one of the most urgent tasks for researchers in the future. This research presented Big Data analytics tools and their contribution toward exploring the characteristics of climate change as well as climate action-related counterparts such as sustainability and social sciences that are essential for the successful development and implementation of strategies.

AUTHOR CONTRIBUTIONS
VS: conceptualization, validation, investigation, writing-original draft, visualization. JA: conceptualization, validation, resources, writing-review and editing, supervision, and funding acquisition. TC: writing-original draft, investigation, visualization, and validation. All authors contributed to the article and approved the submitted version.