Research Topic

Statistical Learning for Predicting Air Quality

About this Research Topic

The concentration of air pollutants is traditionally explained by complex physical and chemical processes of dispersion and advection. This is why air quality prediction is usually addressed through deterministic models, such as Chemical Transport Models (CTMs).

However, CTMs have several limitations. Their performance depends on an up-to-date emission inventory of the urban area, which is often compromised in developing countries. They also struggle to forecast air pollution accurately in regions of complex terrain. Finally, they require high computational power to run time-consuming simulations.

More recently, statistical models based on Machine Learning (ML) algorithms have emerged as a valuable alternative that addresses many of the disadvantages of CTMs. They seem particularly well suited to providing fine resolution at the urban scale, where the estimation of air contamination is of the utmost importance for health.

In that sense, ML could become the new paradigm for pollution forecasting. Nevertheless, ML has its own drawbacks, so it may not make CTMs obsolete. For instance, the accuracy of ML-based models relies strongly on the volume of data: the larger the dataset, the more reliable the prediction. Also, the downscaling procedure of CTMs provides an additional level of explanation of the physical phenomena across multiple scales and resolutions, which is still not available with ML.

GOAL
An important goal of this Research Topic is to understand whether ML can become the new standard for air quality prediction or whether hybrid modelling is the better approach, since it can take advantage of the complementarity between ML and CTMs.

Among the many ML methods, we would like to identify which algorithm, if any, is the most suitable for atmospheric pollution forecasting. Such an assessment must consider all the dimensions of prediction performance, including both the accuracy and the interpretability of the models. For example, non-linear models (e.g., ensemble learning or artificial neural networks) tend to be more accurate but less interpretable than linear regression.

Advanced algorithms are extremely relevant, but they are only part of the solution. Another key aspect is the quality of the selected features. This Research Topic intends to address the possible role of new technologies (e.g., smartphones, the Internet of Things, …) in providing unexplored features that could significantly improve the quality of the prediction, for instance through better spatial coverage of the pollution sources. A typical example is the use of road traffic data from web services to account for emissions from the motorized fleet.

Data processing and fusion are also within the scope of this call. We welcome contributions that provide insight into the complementarity between widespread low-cost sensors and/or proxy indicators and sparsely distributed high-resolution measurements from monitoring stations. This theme encompasses data fusion between ground-level measurements and satellite image analysis (e.g., Aerosol Optical Depth).

Finally, this endeavour aims to define the upcoming priorities for air quality forecasting by considering regional differences between developed and developing countries in economic resources and principal pollution sources. We will try to assess whether the research carried out in the developed world is transferable to less wealthy and more heavily polluted countries.

SCOPE
Based on the main goals mentioned above, the specific topics covered by this Research Topic are listed below.


ML and CTM:

- Limitations of the ML approach.

- Complementarity between ML and CTM.

- Hybrid models based on an integration of CTM and ML.

Boosting the modelling:

- Advanced ML algorithms to predict pollution (e.g., deep learning).

- Data fusion.

- Original features.

Regional priorities:

- Affordable solutions for developing countries.

- Predicting the concentration of nanoparticles (e.g., PM1) in developed countries.

New challenges:

- Discussing the generalization of current models. Proposing models that are applicable worldwide and not limited to a local region.

- Accuracy in predicting high levels of pollution. Focus on models that are able to forecast pollution peaks.

We are interested in a broad spectrum of manuscripts, including original research papers, applied research case studies, and literature reviews. Our intention is to foster debate and encourage experts in environmental engineering and artificial intelligence to share their views and defend their respective approaches. We are also eager to find out whether the big data breakthrough can initiate a paradigm shift in scientific research on air quality.

Photo credits: Yves Rybarczyk.


Keywords: machine learning, artificial neural networks, data-driven methods, hybrid models, atmospheric pollution


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

About Frontiers Research Topics

With their unique mixes of varied contributions from Original Research to Review Articles, Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author.

Submission Deadlines

Manuscript: 19 July 2021

