Statistical Learning for Predicting Air Quality

29.6K views, 18 authors, 5 articles

Editorial

05 May 2022

Editorial: Statistical Learning for Predicting Air Quality

Yves Philippe Rybarczyk and Rasa Zalakeviciute

2,494 views

1 citation

Editors

2

Yves Philippe Rybarczyk

Dalarna University

Rasa Zalakeviciute

University of the Americas

About

The concentration of air pollutants is traditionally explained by complex physical and chemical processes of dispersion and advection. This is why air quality prediction is usually addressed through deterministic models, such as Chemical Transport Models (CTMs).

However, CTMs have several limitations. Their performance depends on an up-to-date emission inventory of the urban area, which is often compromised in developing countries. They also struggle to produce accurate air pollution forecasts in regions of complex terrain. Finally, they require high computational power to run time-consuming simulations.

More recently, statistical models based on Machine Learning (ML) algorithms have emerged as a valuable alternative that addresses many of the disadvantages of CTMs. They seem particularly well suited to providing fine resolution at the urban scale, where estimating air contamination matters most for health.

In that sense, ML could become the new paradigm for pollution forecasting. Nevertheless, ML has its own drawbacks, so it may not render CTMs obsolete. For instance, the accuracy of ML-based models relies strongly on the volume of data: the bigger the dataset, the more reliable the prediction. Also, the downscaling procedure of CTMs provides an additional level of explanation of the physical phenomena across multiple scales and resolutions, which is not yet available with ML.

GOAL
An important goal of this Research Topic is to understand whether ML can become the new standard for air quality prediction or whether hybrid modelling is the better approach, since it can take advantage of the complementarity between ML and CTMs.

Among the several ML methods, we would like to identify which algorithm, if any, is the most suitable for atmospheric pollution forecasting. Such an assessment must consider all dimensions of prediction performance, including both the accuracy and the interpretability of the models. For example, non-linear models (e.g., ensemble learning or artificial neural networks) tend to be more accurate but less interpretable than linear regression.
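The trade-off between accuracy and interpretability can be illustrated with a minimal sketch. It assumes a synthetic, noise-free, non-linear signal in place of real pollutant measurements and uses a k-nearest-neighbour regressor as a stand-in for the non-linear family. All function names and data are hypothetical and are not taken from any contribution to this Research Topic.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def knn_predict(train_x, train_y, x, k=2):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def rmse(pred, true):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

# Synthetic data (assumption): a concentration that varies non-linearly
# with a single made-up feature.
xs = [i / 10 for i in range(100)]
ys = [20 + 10 * math.sin(x) for x in xs]
train_x, train_y = xs[0::2], ys[0::2]   # even-indexed points for fitting
test_x, test_y = xs[1::2], ys[1::2]     # odd-indexed points for evaluation

slope, intercept = fit_linear(train_x, train_y)
linear_rmse = rmse([slope * x + intercept for x in test_x], test_y)
knn_rmse = rmse([knn_predict(train_x, train_y, x) for x in test_x], test_y)

print(f"linear: slope={slope:.2f}, intercept={intercept:.2f}, RMSE={linear_rmse:.2f}")
print(f"k-NN:   RMSE={knn_rmse:.2f}")
```

On this toy signal the linear model is summarised by two readable coefficients but fits poorly, whereas the k-NN model reaches a lower error without offering any compact explanation of its predictions.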

Advanced algorithms are extremely relevant, but they are only part of the solution. Another key aspect is the quality of the selected features. This Research Topic intends to address the possible role of new technologies (e.g., smartphones, the Internet of Things) in providing unexplored features that could significantly improve the quality of the prediction, for instance through better spatial coverage of pollution sources. A typical example is consuming road traffic data from web services to account for emissions from the motorized fleet.

Data processing and fusion are also within the scope of this call. We welcome contributions that provide insight into the complementarity between widely deployed low-cost sensors and/or proxy indicators and sparsely distributed high-resolution measurements from monitoring stations. This theme encompasses data fusion between ground-level measurements and satellite image analysis (e.g., Aerosol Optical Depth).
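One simple form of this complementarity is to calibrate a low-cost sensor against a co-located reference station and then apply the correction to readings taken elsewhere. The sketch below assumes a purely linear sensor bias and uses made-up PM2.5 readings; the function names are hypothetical, and real calibrations often need additional covariates such as humidity or temperature.

```python
def fit_calibration(raw, reference):
    """Least-squares line mapping raw low-cost readings to reference values."""
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_ref = sum(reference) / n
    sxx = sum((r - mean_raw) ** 2 for r in raw)
    sxy = sum((r - mean_raw) * (v - mean_ref) for r, v in zip(raw, reference))
    gain = sxy / sxx
    offset = mean_ref - gain * mean_raw
    return gain, offset

def calibrate(raw, gain, offset):
    """Apply the fitted correction to new low-cost sensor readings."""
    return [gain * r + offset for r in raw]

# Hypothetical co-located PM2.5 readings in µg/m³ (made-up values).
colocated_raw = [10.0, 20.0, 30.0, 40.0]   # low-cost sensor
colocated_ref = [10.0, 18.0, 26.0, 34.0]   # reference monitoring station

gain, offset = fit_calibration(colocated_raw, colocated_ref)

# Correct readings from a low-cost sensor deployed away from the station.
corrected = calibrate([25.0, 50.0], gain, offset)
print(gain, offset, corrected)
```

The design choice here is to let the sparse reference station supply accuracy while the low-cost network supplies spatial coverage; the fitted gain and offset are only valid under the stated linear-bias assumption.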

Finally, this endeavour aims to define the upcoming priorities for air quality forecasting by considering regional differences between developed and developing countries in economic resources and principal pollution sources. We will try to assess whether the research carried out in the developed world is transferable to the less wealthy and most polluted countries.

SCOPE
Based on the goals above, the specific topics covered by this Research Topic are listed below.

ML and CTM:

- Limitations of the ML approach.

- Complementarity between ML and CTM.

- Hybrid models based on an integration of CTM and ML.

Boosting the modelling:

- Advanced ML algorithms to predict pollution (e.g., deep learning).

- Data fusion.

- Original features.

Regional priorities:

- Affordable solutions for developing countries.

- Predicting the concentration of very fine particles (e.g., PM1) in developed countries.

New challenges:

- Discussing the generalization of the current models. Proposing models that are applicable worldwide and not limited to a local region.

- Accuracy in predicting high levels of pollution. Focus on models that are able to forecast pollution peaks.
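Skill on pollution peaks can be judged separately from average error, since a model may have a low overall error while still missing rare exceedances. A minimal sketch, assuming a hypothetical exceedance threshold and made-up hourly values, counts hits, misses, and false alarms:

```python
def peak_scores(observed, predicted, threshold):
    """Count how well predicted exceedances match observed exceedances."""
    hits = misses = false_alarms = 0
    for obs, pred in zip(observed, predicted):
        obs_peak = obs >= threshold
        pred_peak = pred >= threshold
        if obs_peak and pred_peak:
            hits += 1
        elif obs_peak:
            misses += 1
        elif pred_peak:
            false_alarms += 1
    # Recall: share of observed peaks that were forecast.
    recall = hits / (hits + misses) if hits + misses else None
    # Precision: share of forecast peaks that actually occurred.
    precision = hits / (hits + false_alarms) if hits + false_alarms else None
    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "recall": recall,
        "precision": precision,
    }

# Made-up hourly concentrations and a hypothetical threshold of 50.
observed = [12, 40, 55, 30, 60, 20]
predicted = [15, 38, 52, 51, 45, 18]
scores = peak_scores(observed, predicted, threshold=50)
print(scores)
```

Reporting recall and precision on exceedances alongside an overall error measure makes it visible when a model smooths over the episodes that matter most for health warnings.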

We are interested in a broad spectrum of manuscripts, including original research papers, applied research case studies, and literature reviews. Our intention is to foster debate and encourage experts in environmental engineering and artificial intelligence to share their views and defend their respective approaches. We are also eager to find out whether the big data breakthrough can initiate a paradigm shift in scientific research on air quality.

Photo credits: Yves Rybarczyk.

Impact

29,595 Total views

22,158 Article views

6,005 Article downloads

1,432 Topic views

Published In

Frontiers in Big Data

Data-driven Climate Sciences

Frontiers in Artificial Intelligence

AI in Food, Agriculture and Water
