<?xml version="1.0" encoding="utf-8"?>
    <rss version="2.0">
      <channel xmlns:content="http://purl.org/rss/1.0/modules/content/">
        <title>Frontiers in Big Data | Data-driven Climate Sciences section | New and Recent Articles</title>
        <link>https://www.frontiersin.org/journals/big-data/sections/data-driven-climate-sciences</link>
        <description>RSS Feed for Data-driven Climate Sciences section in the Frontiers in Big Data journal | New and Recent Articles</description>
        <language>en-us</language>
        <generator>Frontiers Feed Generator,version:1</generator>
        <pubDate>2026-04-04T12:01:30.379+00:00</pubDate>
        <ttl>60</ttl>
        <item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1710462</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1710462</link>
        <title><![CDATA[Inferring causal interplay between air pollution and meteorology]]></title>
        <pubdate>2025-12-17T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Yves Philippe Rybarczyk</author><author>Niralkumar Hemantbhai Dave</author><author>Tobias Isaac Tapia-Flores</author><author>Rasa Zalakeviciute</author>
        <description><![CDATA[Introduction: This study investigates the bidirectional causal interplay between PM2.5 and relative humidity (RH) in Quito, Ecuador. Focusing on a high-altitude city with complex terrain, the objective is to understand pollution-climate feedbacks over a two-decade span. Methods: The study employs Convergent Cross Mapping (CCM), a nonlinear empirical dynamic modeling approach. Hourly data were analyzed for four districts in Quito over two distinct time periods: 2004–2005 versus 2022–2024. Robustness of causality was confirmed using surrogate testing techniques. Results: The analysis reveals statistically significant, nonlinear, and time-variant couplings. While RH influenced PM2.5 in the early 2000s, the relationship inverted, with PM2.5 increasingly driving RH by the early 2020s. Partial-derivative analyses indicate shifting interaction signs and strengths. Notably, pollution was found to increasingly suppress RH, particularly in northern districts. Discussion: The observed suppression of RH by pollution is consistent with urban heat island amplification and radiative effects. These findings underscore the necessity of nonlinear causality frameworks for understanding environmental feedbacks in complex terrains. The study highlights the need for integrated air quality and climate strategies. Future research should expand variables and monitoring sites to further generalize these findings.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1611364</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1611364</link>
        <title><![CDATA[Editorial: Air quality and biosphere-atmosphere interactions]]></title>
        <pubdate>2025-05-06T00:00:00Z</pubdate>
        <category>Editorial</category>
        <author>Yves Philippe Rybarczyk</author>
        <description></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1546223</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1546223</link>
        <title><![CDATA[Causal effect of PM2.5 on the urban heat island]]></title>
        <pubdate>2025-03-14T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Yves Rybarczyk</author><author>Rasa Zalakeviciute</author><author>Marija Ereminaite</author><author>Ivana Costa-Stolz</author>
        <description><![CDATA[The planet is experiencing global warming, with an increasing number of heat waves worldwide. Cities are particularly affected by the high temperatures because of the urban heat island (UHI) effect. This phenomenon is mostly explained by land cover changes, reduced green spaces, and the concentration of infrastructure in urban settings. However, the reasons for the UHI are complex and involve multiple factors that are still understudied. Air pollution is one of them. This work investigates the link between particulate matter ≤2.5 μm (PM2.5) and air temperature by convergent cross-mapping (CCM), a statistical method to infer causation in dynamic non-linear systems. A positive correlation between the concentration of fine particulate matter and urban temperature is observed. The causal relationship between PM2.5 and temperature is confirmed in the most urbanized areas of the study site (Quito, Ecuador). The results show that (i) the UHI is present even in the most elevated capital city of the world, and (ii) air quality is an important contributor to the higher temperatures in urban areas than in outlying areas. This study supports the hypothesis of a non-linear threshold effect of pollution concentration on urban temperature.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2025.1507036</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2025.1507036</link>
        <title><![CDATA[Deep learning for accurate classification of conifer pollen grains: enhancing species identification in palynology]]></title>
        <pubdate>2025-02-14T00:00:00Z</pubdate>
        <category>Methods</category>
        <author>Masoud A. Rostami</author><author>LeMaur Kydd</author><author>Behnaz Balmaki</author><author>Lee A. Dyer</author><author>Julie M. Allen</author>
        <description><![CDATA[Accurate identification of pollen grains from Abies (fir), Picea (spruce), and Pinus (pine) is an important method for reconstructing historical environments, past landscapes and understanding human-environment interactions. However, distinguishing between pollen grains of conifer genera poses challenges in palynology due to their morphological similarities. To address this identification challenge, this study leverages advanced deep learning techniques, specifically transfer learning models, which are effective in identifying similarities among detailed features. We evaluated nine different transfer learning architectures: DenseNet201, EfficientNetV2S, InceptionV3, MobileNetV2, ResNet101, ResNet50, VGG16, VGG19, and Xception. Each model was trained and validated on a dataset of images of pollen grains collected from museum specimens, mounted and imaged for training purposes. The models were assessed on various performance metrics, including accuracy, precision, recall, and F1-score across training, validation, and testing phases. Our results indicate that ResNet101 relatively outperformed other models, achieving a test accuracy of 99%, with equally high precision, recall, and F1-score. This study underscores the efficacy of transfer learning to produce models that can aid in identifications of difficult species. These models may aid conifer species classification and enhance pollen grain analysis, critical for ecological research and monitoring environmental changes.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2024.1375455</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2024.1375455</link>
        <title><![CDATA[Real driving cycles and emissions for urban freight transport]]></title>
        <pubdate>2024-07-08T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Julie Anne Holanda Azevedo</author><author>Demostenis Ramos Cassiano</author><author>Bruno Vieira Bertoncini</author>
        <description><![CDATA[This paper evaluates the effects of driving style on pollutant gas emissions in the context of urban freight transportation, through the construction of driving cycles. The cycles were constructed using the Vehicle Specific Power (VSP) parameter, which considers instantaneous vehicle and road parameters to better represent driving patterns and the environmental impacts of freight transportation. The study was conducted in the city of Fortaleza, Ceará, Brazil, with a group of professional drivers. Road types, land use, and traffic light locations were considered in analyzing and discussing the results. The results show that collector roads presented higher speeds than arterial roads, and that land use around the road also directly affected vehicle driving patterns. Regarding CO2 emissions, the higher measured concentrations were observed on the arterial roads.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2024.1412837</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2024.1412837</link>
        <title><![CDATA[Particulate matter forecast and prediction in Curitiba using machine learning]]></title>
        <pubdate>2024-05-30T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Marianna Gonçalves Dias Chaves</author><author>Adriel Bilharva da Silva</author><author>Emílio Graciliano Ferreira Mercuri</author><author>Steffen Manfred Noe</author>
        <description><![CDATA[Introduction: Air quality is directly affected by pollutant emission from vehicles, especially in large cities and metropolitan areas or when there is no compliance check for vehicle emission standards. Particulate Matter (PM) is one of the pollutants emitted from fuel burning in internal combustion engines and remains suspended in the atmosphere, causing respiratory and cardiovascular health problems in the population. In this study, we analyzed the interaction between vehicular emissions, meteorological variables, and particulate matter concentrations in the lower atmosphere, presenting methods for predicting and forecasting PM2.5. Methods: Meteorological and vehicle flow data from the city of Curitiba, Brazil, and particulate matter concentration data from optical sensors installed in the city between 2020 and 2022 were organized in hourly and daily averages. Prediction and forecasting were based on two machine learning models: Random Forest (RF) and Long Short-Term Memory (LSTM) neural network. The baseline model for prediction was chosen as the Multiple Linear Regression (MLR) model, and for forecast, we used the naive estimation as baseline. Results: RF showed that on hourly and daily prediction scales, the planetary boundary layer height was the most important variable, followed by wind gust and wind velocity in hourly or daily cases, respectively. The highest PM prediction accuracy (99.37%) was found using the RF model on a daily scale. For forecasting, the highest accuracy was 99.71% using the LSTM model for 1-h forecast horizon with 5 h of previous data used as input variables. Discussion: The RF and LSTM models were able to improve prediction and forecasting compared with MLR and the naive estimation, respectively.
The LSTM was trained with data corresponding to the period of the COVID-19 pandemic (2020 and 2021) and was able to forecast the concentration of PM2.5 in 2022, in which the data show that there was greater circulation of vehicles and higher peaks in the concentration of PM2.5. Our results can help the physical understanding of factors influencing pollutant dispersion from vehicle emissions in the lower atmosphere in urban environments. This study supports the formulation of new government policies to mitigate the impact of vehicle emissions in large cities.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2024.1384240</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2024.1384240</link>
        <title><![CDATA[Tradescantia response to air and soil pollution, stamen hair cells dataset and ANN color classification]]></title>
        <pubdate>2024-05-15T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Leatrice Talita Rodrigues</author><author>Barbara Sanches Antunes Goeldner</author><author>Emílio Graciliano Ferreira Mercuri</author><author>Steffen Manfred Noe</author>
        <description><![CDATA[The Tradescantia plant is a complex system that is sensitive to environmental factors such as water supply, pH, temperature, light, radiation, impurities, and nutrient availability. It can be used as a biomonitor for environmental changes; however, the bioassays are time-consuming and have a strong human interference factor that might change the result depending on who is performing the analysis. We have developed computer vision models to study color variations from Tradescantia clone 4430 plant stamen hair cells, which can be stressed due to air pollution and soil contamination. The study introduces a novel dataset, Trad-204, comprising single-cell images from Tradescantia clone 4430, captured during the Tradescantia stamen-hair mutation bioassay (Trad-SHM). The dataset contains images from two experiments, one focusing on air pollution by particulate matter and another based on soil contaminated by diesel oil. Both experiments were carried out in Curitiba, Brazil, between 2020 and 2023. The images represent single cells with different shapes, sizes, and colors, reflecting the plant's responses to environmental stressors. An automatic classification task was developed to distinguish between blue and pink cells, and the study explores both a baseline model and three artificial neural network (ANN) architectures, namely, TinyVGG, VGG-16, and ResNet34. Tradescantia revealed sensitivity to both air particulate matter concentration and diesel oil in soil. The results indicate that the Residual Network architecture outperforms the other models in terms of accuracy on both training and testing sets. The dataset and findings contribute to the understanding of plant cell responses to environmental stress and provide valuable resources for further research in automated image analysis of plant cells. The discussion highlights the impact of turgor pressure on cell shape and the potential implications for plant physiology.
The comparison between ANN architectures aligns with previous research, emphasizing the superior performance of ResNet models in image classification tasks. Artificial intelligence identification of pink cells improves the counting accuracy, thus avoiding human errors due to different color perceptions, fatigue, or inattention, in addition to facilitating and speeding up the analysis process. Overall, the study offers insights into plant cell dynamics and provides a foundation for future investigations such as changes in cell morphology. This research corroborates that biomonitoring should be considered an important tool for political action, being a relevant issue in risk assessment and the development of new public policies relating to the environment.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2023.1243559</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2023.1243559</link>
        <title><![CDATA[Deep learning estimation of northern hemisphere soil freeze-thaw dynamics using satellite multi-frequency microwave brightness temperature observations]]></title>
        <pubdate>2023-11-17T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Kellen Donahue</author><author>John S. Kimball</author><author>Jinyang Du</author><author>Fredrick Bunt</author><author>Andreas Colliander</author><author>Mahta Moghaddam</author><author>Jesse Johnson</author><author>Youngwook Kim</author><author>Michael A. Rawlins</author>
        <description><![CDATA[Satellite microwave sensors are well suited for monitoring landscape freeze-thaw (FT) transitions owing to the strong brightness temperature (TB) or backscatter response to changes in liquid water abundance between predominantly frozen and thawed conditions. The FT retrieval is also a sensitive climate indicator with strong biophysical importance. However, retrieval algorithms can have difficulty distinguishing the FT status of soils from that of overlying features such as snow and vegetation, while variable land conditions can also degrade performance. Here, we applied a deep learning model using a multilayer convolutional neural network driven by AMSR2 and SMAP TB records, and trained on surface (~0–5 cm depth) soil temperature FT observations. Soil FT states were classified for the local morning (6 a.m.) and evening (6 p.m.) conditions corresponding to SMAP descending and ascending orbital overpasses, mapped to a 9 km polar grid spanning a five-year (2016–2020) record and Northern Hemisphere domain. Continuous variable estimates of the probability of frozen or thawed conditions were derived using a model cost function optimized against FT observational training data. Model results derived using combined multi-frequency (1.4, 18.7, 36.5 GHz) TBs produced the highest soil FT accuracy over other models derived using only single sensor or single frequency TB inputs. Moreover, SMAP L-band (1.4 GHz) TBs provided enhanced soil FT information and performance gain over model results derived using only AMSR2 TB inputs. The resulting soil FT classification showed favorable and consistent performance against soil FT observations from ERA5 reanalysis (mean percent accuracy, MPA: 92.7%) and in situ weather stations (MPA: 91.0%). The soil FT accuracy was generally consistent between morning and afternoon predictions and across different land covers and seasons. 
The model also showed better FT accuracy than ERA5 against regional weather station measurements (91.0% vs. 86.1% MPA). However, model confidence was lower in complex terrain where FT spatial heterogeneity was likely beneath the effective model grain size. Our results provide a high level of precision in mapping soil FT dynamics to improve understanding of complex seasonal transitions and their influence on ecological processes and climate feedbacks, with the potential to inform Earth system model predictions.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2023.1139918</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2023.1139918</link>
        <title><![CDATA[Assessment of geothermal resource potential in Changbaishan utilizing high-precision gravity-based man-machine interactive inversion technology]]></title>
        <pubdate>2023-07-17T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Zhi-He Xu</author><author>Ji-Yi Jiang</author><author>Guan-Wen Gu</author><author>Zhen-Jun Sun</author><author>Xuan-Kai Jiao</author><author>Xing-Guo Niu</author><author>Qin Yu</author>
        <description><![CDATA[As one of the clean energy sources, geothermal resources have no negative impact on the climate. However, the accurate assessment and precise identification of potential geothermal resources remain complex and dynamic. In this paper, ~2,000 large-scale high-precision gravity survey points were measured in the north of the Tianchi caldera, Changbaishan. Advanced data processing technologies can provide straightforward information on deep geothermal resources (heat source, caprock, geothermal reservoir, and geothermal migration pathway). Upward continuation and these technologies reveal two dome-shaped low and gentle anomalies (−48 × 10⁻⁵ m/s²−65 m/s²) and a positive gravity gradient anomaly (0.4 × 10⁻⁷ m/s² to 1.6 × 10⁻⁵ m/s²) in the large-scale high-precision gravity planar map. According to 2.5-dimensional man-machine interactive inversion technology and research on petrophysical parameters, the density of the shield-forming basalts in the two orthogonal gravity sections is 2.58 g/cm³. The relatively intermediate to high density (2.60–2.75 g/cm³) represents the geothermal reservoir, and low density (low to 2.58 g/cm³) is the geothermal migration pathway. In addition, the large-scale high-precision gravity planar map, at a scale of about 1/50,000, indicates that a sedimentary basin and uplift mountain geothermal system exists in the north of the Tianchi caldera.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2023.1124148</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2023.1124148</link>
        <title><![CDATA[Machine learning-based ozone and PM2.5 forecasting: Application to multiple AQS sites in the Pacific Northwest]]></title>
        <pubdate>2023-02-24T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Kai Fan</author><author>Ranil Dhammapala</author><author>Kyle Harrington</author><author>Brian Lamb</author><author>Yunha Lee</author>
        <description><![CDATA[Air quality in the Pacific Northwest (PNW) of the U.S. has generally been good in recent years, but unhealthy events were observed due to wildfires in summer or wood burning in winter. The current air quality forecasting system, which uses chemical transport models (CTMs), has had difficulty forecasting these unhealthy air quality events in the PNW. We developed a machine learning (ML) based forecasting system, which consists of two components, ML1 (random forest classifiers and multiple linear regression models) and ML2 (two-phase random forest regression model). Our previous study showed that the ML system provides reliable forecasts of O3 at a single monitoring site in Kennewick, WA. In this paper, we expand the ML forecasting system to predict both O3 in the wildfire season and PM2.5 in wildfire and cold seasons at all available monitoring sites in the PNW during 2017–2020, and evaluate our ML forecasts against the existing operational CTM-based forecasts. For O3, both ML1 and ML2 are used to achieve the best forecasts, as was the case in our previous study: ML2 performs better overall (R2 = 0.79), especially for low-O3 events, while ML1 correctly captures more high-O3 events. Compared to the CTM-based forecast, our O3 ML forecasts reduce the normalized mean bias (NMB) from 7.6 to 2.6% and normalized mean error (NME) from 18 to 12% when evaluated against observations. For PM2.5, ML2 performs the best and thus is used for the final forecasts. Compared to the CTM-based PM2.5, ML2 clearly improves PM2.5 forecasts for both wildfire season (May to September) and cold season (November to February): ML2 reduces NMB (−27 to 7.9% for wildfire season; 3.4 to 2.2% for cold season) and NME (59 to 41% for wildfire season; 67 to 28% for cold season) significantly and captures more high-PM2.5 events correctly.
Our ML air quality forecast system requires fewer computing resources and fewer input datasets, yet it provides forecasts that are more reliable than, or at least comparable to, the CTM-based forecast. This demonstrates that our ML system is a low-cost, reliable air quality forecasting system that can support regional/local air quality management.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.997447</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.997447</link>
        <title><![CDATA[Flood risk assessment for residences at the neighborhood scale by owner/occupant type and first-floor height]]></title>
        <pubdate>2023-01-09T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Ayat Al Assi</author><author>Rubayet Bin Mostafiz</author><author>Carol J. Friedland</author><author>Md Adilur Rahim</author><author>Robert V. Rohli</author>
        <description><![CDATA[Evaluating flood risk is an essential component of understanding and increasing community resilience. A robust approach for quantifying flood risk in terms of average annual loss (AAL) in dollars across multiple homes is needed to provide valuable information for stakeholder decision-making. This research develops a computational framework to evaluate AAL at the neighborhood level by owner/occupant type (i.e., homeowner, landlord, and tenant) for increasing first-floor height (FFH). The AAL values were calculated here by numerically integrating loss-exceedance probability distributions to represent economic annual flood risk to the building, contents, and use. A simple case study for a census block in Jefferson Parish, Louisiana, revealed that homeowners bear a mean AAL of $4,390 at the 100-year flood elevation (E100), compared with $2,960 and $1,590 for landlords and tenants, respectively, because the homeowner incurs losses to building, contents, and use, rather than only two of the three, as for the landlord and tenant. The results of this case study showed that increasing FFH reduces AAL proportionately for each owner/occupant type, and that two feet of additional elevation above E100 may provide the most economically advantageous benefit. The modeled results suggested that Hazus Multi-Hazard (Hazus-MH) output underestimates the AAL by 11% for building and 15% for contents. Application of this technique while partitioning the owner/occupant types will improve planning for resilience and assessment of impacts attributable to the costly flood hazard.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.1009158</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.1009158</link>
        <title><![CDATA[Changing features of the Northern Hemisphere 500-hPa circumpolar vortex]]></title>
        <pubdate>2023-01-09T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Nazla Bushra</author><author>Robert V. Rohli</author><author>Chunyan Li</author><author>Paul W. Miller</author><author>Rubayet Bin Mostafiz</author>
        <description><![CDATA[The tropospheric circumpolar vortex (CPV), an important signature of processes steering the general atmospheric circulation, surrounds each pole and is linked to the surface weather conditions. The CPV can be characterized by its area and circularity ratio (Rc), which both vary temporally. This research advances previous work identifying the daily 500-hPa Northern Hemispheric CPV (NHCPV) area, Rc, and temporal trends in its centroid by examining linear trends and periodic cycles in NHCPV area and Rc (1979–2017). Results suggest that NHCPV area has increased linearly over time. However, a more representative signal of the planetary warming may be the temporally weakening gradient which has blurred NHCPV distinctiveness—perhaps a new indicator of Arctic amplification. Rc displays opposing trends in subperiods and an insignificant overall trend. Distinct annual and semiannual cycles exist for area and Rc over all subperiods. These features of NHCPV change over time may impact surface weather/climate.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.1022900</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.1022900</link>
        <title><![CDATA[A data-driven spatial approach to characterize the flood hazard]]></title>
        <pubdate>2022-12-12T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Rubayet Bin Mostafiz</author><author>Md Adilur Rahim</author><author>Carol J. Friedland</author><author>Robert V. Rohli</author><author>Nazla Bushra</author><author>Fatemeh Orooji</author>
        <description><![CDATA[Model outputs of localized flood grids are useful in characterizing flood hazards for properties located in the Special Flood Hazard Area (SFHA—areas expected to experience a 1% or greater annual chance of flooding). However, due to the unavailability of higher return-period [i.e., recurrence interval, or the reciprocal of the annual exceedance probability (AEP)] flood grids, the flood risk of properties located outside the SFHA cannot be quantified. Here, we present a method to estimate flood hazards that are located both inside and outside the SFHA using existing AEP surfaces. Flood hazards are characterized by the Gumbel extreme value distribution to project extreme flood event elevations for which an entire area is assumed to be submerged. Spatial interpolation techniques impute flood elevation values and are used to estimate flood hazards for areas outside the SFHA. The proposed method has the potential to improve the assessment of flood risk for properties located both inside and outside the SFHA and therefore to improve the decision-making process regarding flood insurance purchases, mitigation strategies, and long-term planning for enhanced resilience to one of the world's most ubiquitous natural hazards.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.967477</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.967477</link>
        <title><![CDATA[Observation-based assessment of secondary water effects on seasonal vegetation decay across Africa]]></title>
        <pubdate>2022-09-09T00:00:00Z</pubdate>
        <category>Brief Research Report</category>
        <author>Çağlar Küçük</author><author>Sujan Koirala</author><author>Nuno Carvalhais</author><author>Diego G. Miralles</author><author>Markus Reichstein</author><author>Martin Jung</author>
        <description><![CDATA[Local studies and modeling experiments suggest that shallow groundwater and lateral redistribution of soil moisture, together with soil properties, can be highly important secondary water sources for vegetation in water-limited ecosystems. However, there is a lack of observation-based studies of these terrain-associated secondary water effects on vegetation over large spatial domains. Here, we quantify the role of terrain properties on the spatial variations of dry season vegetation decay rate across Africa obtained from geostationary satellite acquisitions to assess the large-scale relevance of secondary water effects. We use machine learning based attribution to identify where and under which conditions terrain properties related to topography, water table depth, and soil hydraulic properties influence the rate of vegetation decay. Over the study domain, the machine learning model attributes about one-third of the spatial variations of vegetation decay rates to terrain properties, which is roughly equally split between direct terrain effects and interaction effects with climate and vegetation variables. The importance of secondary water effects increases with increasing topographic variability, shallower groundwater levels, and the propensity to capillary rise given by soil properties. In regions with favorable terrain properties, more than 60% of the variations in the decay rate of vegetation are attributed to terrain properties, highlighting the importance of secondary water effects on vegetation in Africa. Our findings provide an empirical assessment of the importance of local-scale secondary water effects on vegetation over Africa and help to improve hydrological and vegetation models for the challenge of bridging processes across spatial scales.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.768676</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.768676</link>
        <title><![CDATA[Application of a Machine Learning Algorithm in Generating an Evapotranspiration Data Product From Coupled Thermal Infrared and Microwave Satellite Observations]]></title>
        <pubdate>2022-05-20T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Li Fang</author><author>Xiwu Zhan</author><author>Satya Kalluri</author><author>Peng Yu</author><author>Chris Hain</author><author>Martha Anderson</author><author>Istvan Laszlo</author>
        <description><![CDATA[Land surface evapotranspiration (ET) is one of the main energy sources for atmospheric dynamics and a critical component of the local, regional, and global water cycles. Consequently, accurate measurement or estimation of ET is one of the most active topics in hydro-climatology research. With massive and spatially distributed observational data sets of land surface properties and environmental conditions being collected daily from ground, airborne, or space-borne platforms over the past few decades, many research teams have started to use big data science to advance ET estimation methods. The Geostationary satellite Evapotranspiration and Drought (GET-D) product system was developed at the National Oceanic and Atmospheric Administration (NOAA) in 2016 to generate daily ET and drought maps operationally. The primary inputs of the current GET-D system are the thermal infrared (TIR) observations from the NOAA GOES satellite series. Because cloud contamination affects the TIR observations, the spatial coverage of the daily GET-D ET product has been severely limited. Based on the most recent advances, we have tested a machine learning algorithm to estimate all-weather land surface temperature (LST) from combined TIR and microwave (MW) satellite observations. With the regression tree machine learning approach, we can combine the high accuracy and high spatial resolution of GOES TIR data with the better spatial coverage of passive microwave observations and LST simulations from a land surface model (LSM). The regression tree model combines the three LST data sources for both clear and cloudy days, which enables the GET-D system to derive an all-weather ET product. This paper reports how the all-weather LST and ET are generated in the upgraded GET-D system and evaluates these LST and ET estimates against ground measurements. The results demonstrate that the regression tree machine learning method is feasible and effective for generating daily ET under all weather conditions with satisfactory accuracy from the large volume of satellite observations.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.898643</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.898643</link>
        <title><![CDATA[Editorial: Statistical Learning for Predicting Air Quality]]></title>
        <pubdate>2022-05-05T00:00:00Z</pubdate>
        <category>Editorial</category>
        <author>Yves Philippe Rybarczyk</author><author>Rasa Zalakeviciute</author>
        <description></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.842455</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.842455</link>
        <title><![CDATA[Deep Learning Approach for Assessing Air Quality During COVID-19 Lockdown in Quito]]></title>
        <pubdate>2022-04-04T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Phuong N. Chau</author><author>Rasa Zalakeviciute</author><author>Ilias Thomas</author><author>Yves Rybarczyk</author>
        <description><![CDATA[Weather Normalized Models (WNMs) are modeling methods used for assessing air contaminants under a business-as-usual (BAU) assumption. WNMs are therefore used to assess the impact of many events on urban pollution. Recently, different approaches have been implemented to develop WNMs and quantify the lockdown effects of COVID-19 on air quality, including Machine Learning (ML). However, more advanced methods, such as Deep Learning (DL), have never been applied for developing WNMs. In this study, we propose WNMs based on DL algorithms, testing five DL architectures and comparing their performance to a recent ML approach, namely Gradient Boosting Machine (GBM). The concentrations of five air pollutants (CO, NO2, PM2.5, SO2, and O3) are studied in the city of Quito, Ecuador. The results show that Long Short-Term Memory (LSTM) and Bidirectional Recurrent Neural Network (BiRNN) outperform the other algorithms and, consequently, are recommended as appropriate WNMs to quantify the effects of the lockdowns on air pollution. Furthermore, examining the variable importance in the LSTM and BiRNN models, we find that the most relevant temporal and meteorological features for predicting air quality are Hours (time of day), Index (1 for the first collected data point, increasing by one with each instance), Julian Day (day of the year), Relative Humidity, Wind Speed, and Solar Radiation. During the full lockdown, the concentration of most pollutants decreased drastically: −48.75% for CO, −45.76% for SO2, −42.17% for PM2.5, and −63.98% for NO2. The reduction of NO2 induced an increase of O3 by +26.54%.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.822573</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.822573</link>
        <title><![CDATA[Data-Driven Framework for Understanding and Predicting Air Quality in Urban Areas]]></title>
        <pubdate>2022-03-25T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Lakshmi Babu Saheer</author><author>Ajay Bhasy</author><author>Mahdi Maktabdar</author><author>Javad Zarrin</author>
        <description><![CDATA[Monitoring, predicting, and controlling air quality in urban areas is one effective way of tackling climate change. Leveraging the availability of big data in different domains, such as pollutant concentration, urban traffic, aerial imagery of terrain and vegetation, and weather conditions, can aid in understanding the interactions between these factors and in building a reliable air quality prediction model. This research proposes a novel, cost-effective, and efficient air quality modeling framework that includes all these factors and employs state-of-the-art artificial intelligence techniques. The framework also includes a novel deep learning-based vegetation detection system using aerial images. The pilot study, conducted in the UK city of Cambridge using the proposed framework, investigates various predictive models ranging from statistical to machine learning and deep recurrent neural network models. This framework opens up possibilities for broadening air quality modeling and prediction to other domains, such as vegetation or green space planning and green traffic routing for sustainable cities. The research is mainly focused on extracting strong evidence that could be useful in proposing better policies around climate change.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2022.826517</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2022.826517</link>
        <title><![CDATA[Air Quality Forecast by Statistical Methods: Application to Portugal and Macao]]></title>
        <pubdate>2022-03-10T00:00:00Z</pubdate>
        <category>Brief Research Report</category>
        <author>Luísa Mendes</author><author>Joana Monjardino</author><author>Francisco Ferreira</author>
        <description><![CDATA[Air pollution is a major concern for most countries in the world. In Portugal and Macao, the values of nitrogen dioxide (NO2), particulate matter (PM) and ozone (O3) are frequently above the concentration thresholds accepted as “good air quality.” Portugal follows the European Union (EU) legislation (Directive 2008/50/EC) on air quality and Macao the air quality guidelines (AQG) from the WHO. Air quality forecasts are very important mitigation tools because they can anticipate pollution events and issue early warnings, allowing preventive measures to be taken and impacts to be reduced by avoiding exposure. The work presented here refers to the statistical forecast of air pollutants for three regions: Greater Lisbon Area, Madeira Autonomous Region (both located in Portugal), and Macao Special Administrative Region (in Southern China). The presented statistical approach combines Classification and Regression Tree (CART) and multiple regression (MR) analysis to obtain optimized regression models. This consolidated methodology has now been in operation for more than a decade in Portugal, and is subject to regular updates that reflect the ongoing research and the changes in the air quality monitoring network. Recently, the same methodology was applied to Macao in collaboration with the Macao Meteorological and Geophysical Bureau (SMG). Here, a statistical approach for air quality forecasting is described that has proven successful, forecasting PM10, PM2.5, NO2, and O3 concentrations for the next day with good performance. In general, all the models have shown good agreement between the observed and forecasted concentrations (with R² from 0.50 to 0.89), and were able to follow the concentration evolution trend. In some cases, there is a slight delay in the predicted trend. Moreover, the results obtained for pollution episodes have shown that statistical forecasting can be an effective way of protecting public health.]]></description>
      </item><item>
        <guid isPermaLink="true">https://www.frontiersin.org/articles/10.3389/fdata.2021.777336</guid>
        <link>https://www.frontiersin.org/articles/10.3389/fdata.2021.777336</link>
        <title><![CDATA[Daily Spatial Complete Soil Moisture Mapping Over Southeast China Using CYGNSS and MODIS Data]]></title>
        <pubdate>2022-02-15T00:00:00Z</pubdate>
        <category>Original Research</category>
        <author>Ting Yang</author><author>Zhigang Sun</author><author>Jundong Wang</author><author>Sen Li</author>
        <description><![CDATA[Daily spatially complete soil moisture (SM) mapping is important for climatic, hydrological, and agricultural applications. The Cyclone Global Navigation Satellite System (CYGNSS) is the first constellation that utilizes the L band signal transmitted by the Global Navigation Satellite System (GNSS) satellites to measure SM. Because the CYGNSS points are discontinuously distributed with a relatively low density, their ability to map continuous SM distributions with high accuracy is limited. The Moderate-Resolution Imaging Spectroradiometer (MODIS) product (i.e., vegetation index [VI] and land surface temperature [LST]) provides more surface SM information than other optical remote sensing data with a relatively high spatial resolution. This study proposes a point-surface fusion method to fuse the CYGNSS and MODIS data for daily spatially complete SM retrieval. First, for CYGNSS data, the surface reflectivity (SR) is proposed as a proxy to evaluate its ability to estimate daily SM. Second, the LST output from the China Meteorological Administration Land Data Assimilation System (CLDAS, 0.0625° × 0.0625°) and MODIS LST (1 × 1 km) are fused to generate spatially complete and temporally continuous LST maps. An Enhanced Normalized Vegetation Supply Water Index (E-NVSWI) model is proposed to estimate SM derived from MODIS data at high spatial resolution. Finally, the final SM estimation model is constructed from the back-propagation artificial neural network (BP-ANN) fusing the CYGNSS point, E-NVSWI data, and ancillary data, and applied to get the daily continuous SM result over southeast China. The results show that the SM estimates are comparable and promising (R = 0.723, root mean squared error [RMSE] = 0.062 m³ m⁻³, and MAE = 0.040 m³ m⁻³ vs. in situ; R = 0.714, RMSE = 0.057 m³ m⁻³, and MAE = 0.039 m³ m⁻³ vs. CLDAS). The proposed algorithm contributes in two respects: (1) it validates the CYGNSS-derived SM by taking advantage of the dense in situ networks over Southeast China; (2) it provides a point-surface fusion model that combines CYGNSS and MODIS to generate temporally and spatially complete SM. The proposed approach reveals significant potential to map daily spatially complete SM using CYGNSS and MODIS data at a regional scale.]]></description>
      </item>
      </channel>
    </rss>