Editorial: Machine learning and statistical methods for solar flare prediction

COPYRIGHT © 2023 Chen, Maloney, Camporeale, Huang and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


Editorial on the Research Topic Machine learning and statistical methods for solar flare predictions
In recent years, rapid growth in computing power and in the amount of accessible data has driven wider application of machine learning and statistical methods across many disciplines. The use of these methods in astronomy and space sciences has advanced both physical process modeling and data analysis. See Camporeale (2019) for a brief review of the challenges and opportunities of applying machine learning to space weather.
Among various space weather-relevant phenomena, solar flares, which are intense localized eruptions of electromagnetic radiation in the Sun's lower atmosphere, are a fundamental manifestation of solar explosive activity that researchers are interested in forecasting. Solar flare predictions are generally issued as occurrence probabilities of flares above M- or X-class within 24 or 48 h. The National Oceanic and Atmospheric Administration (NOAA) provides near real-time solar flare data and resources. Flares are often, though not always, accompanied by coronal mass ejections (CMEs), which are large expulsions of plasma and magnetic field from the Sun's atmosphere. CMEs affect power grids, telecommunication networks, and orbiting satellites. Solar energetic particles (SEPs) are high-energy, charged particles that originate in the solar atmosphere and solar wind. SEPs can originate either from a solar flare site or from shock waves associated with CMEs. See Whitman et al. (2022) and references therein for a comprehensive review of the literature on SEP forecasting.
In particular, data analytics approaches using modern machine learning and statistical models are now being adopted in solar flare forecasting, aiming to enable early warning of strong solar flare events. Many articles have been published on this topic over the past decade or so; see, for example, Qahwaji and Colak, 2007; Colak and Qahwaji, 2009; Huang et al., 2012; Ahmed et al., 2013; Bobra and Couvidat, 2015; Barnes et al., 2016; Huang et al., 2018; Florios et al., 2018; Leka et al., 2019a; Leka et al., 2019b; Liu et al., 2019; Chen et al., 2019; Campi et al., 2019; Wang et al., 2020; Jiao et al., 2020; Cinto et al., 2020; Park et al., 2020; Sun et al., 2021; Nishizuka et al., 2021; Georgoulis et al., 2021; Sun et al., 2022; Liu et al., 2022 and references therein. Despite the demonstrated potential and success of machine learning methods for solar flare forecasting, many issues remain to be solved. The ultimate goal for the research community is to close the gap between scientific research, using either physics-driven or data analytics approaches, and real-time forecasting of strong space weather events. For solar flare prediction in particular, we recognize three limitations in how machine learning approaches have been adopted over the years: (i) complete black box models with no physics result in less interpretability, (ii) limited data from past and relatively quiet solar cycles restrict how well trained models generalize to the future, and (iii) limited physics knowledge of the flaring mechanism leads to a less informative and only partial list of important precursors.
The articles published in this Research Topic address a wide range of problems in solar flare forecasting, covering flare catalogs, feature extraction, and CME arrival prediction. The methodologies range from regression models, deep neural networks, anomaly detection, and spatial Fourier transforms to finite mixture models. See below for a more detailed description of each article.
We, the editors, hope that this collection of articles presents readers with a wealth of modern methodologies and points out important and promising directions to delve into further. As a result of this Research Topic, we hope to see more innovative processing of various data products, novel methodologies, and new findings on data-driven approaches to monitoring and forecasting solar flares and related events such as CMEs.
In Predicting CME arrival time through data integration and ensemble learning, Alobaid et al. collect 363 geoeffective CMEs from solar cycles 23 and 24, spanning 1996 to 2021. The authors use CME features, solar wind parameters, and CME images obtained from the SOHO/LASCO C2 coronagraph to predict the arrival time of these CMEs using an ensemble learning approach named CMETNet.
In Solar flare catalog based on SDO/AIA EUV images: Composition and correlation with GOES/XRS X-ray flare magnitudes, Sande et al. present a flare catalog based on the Solar Dynamics Observatory (SDO) Atmospheric Imaging Assembly (AIA), covering flares of GOES X-ray magnitudes C, M, and X from 2010 to 2017. An extremely randomized trees (ERT) regression model is used to map SDO/AIA flare magnitudes to GOES X-ray magnitudes. The resulting catalog overlaps with 85% of the M/X flares in the GOES flare catalog. A number of unrecorded or mislabeled large flares in the GOES catalog are also discovered.
In Precursor identification for strong flares based on anomaly detection algorithm, Wang et al. treat strong flares as anomalies. The "normal" state is learned by an unsupervised autoencoder network, and departures from the "normal" state are quantified by the differences between the observed images and the images reconstructed by the network. The results show promise for a long warning period of up to 2 days prior to strong flare events.
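The scoring step of the anomaly-detection idea summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the autoencoder itself is omitted and its output is assumed to be available as a reconstructed image, and the function names, the mean squared error measure, and the quantile-based threshold are all assumptions made for the example.

```python
def reconstruction_error(observed, reconstructed):
    """Mean squared pixel difference between two equally sized 2D images."""
    total, count = 0.0, 0
    for row_obs, row_rec in zip(observed, reconstructed):
        for obs, rec in zip(row_obs, row_rec):
            total += (obs - rec) ** 2
            count += 1
    return total / count


def threshold_from_normal(errors, quantile=0.95):
    """Pick a threshold from errors measured on 'normal' (non-flaring) samples.

    Assumption: an empirical quantile is used; other rules are equally possible.
    """
    ordered = sorted(errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def flag_anomalies(errors, threshold):
    """Mark every sample whose reconstruction error exceeds the threshold."""
    return [error > threshold for error in errors]
```

In use, one would compute `reconstruction_error` for each image in a quiet-time validation set, derive a threshold from those values, and then flag later images whose error exceeds it as candidate precursors.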
In Operational solar flare forecasting via video-based deep learning, Guastavino et al. show that video-based deep learning, a combination of a convolutional neural network and a Long Short-Term Memory network, can be used for operational purposes. The authors present an algorithm that builds sets of active regions balanced according to the flare class rates associated with a specific cycle phase; the resulting data set is used for training and validating the video-based deep learning model.
In Efficient identification of pre-flare features in SDO/AIA images through use of spatial Fourier transforms, Massa and Emslie present a method for feature extraction, or data compression, of pre-flare SDO/AIA data. This work is motivated by the potential of training neural networks on AIA data to identify features that lead to a solar flare, given the extremely large data volume. Numerical experiments show not only that Fourier maps retain more information on the original AIA images than straightforward binning of spatial pixels, but also that certain types of changes in source structure (e.g., thinning or thickening of an elongated filamentary structure) are equally recognizable in the spatial frequency domain.
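The two compression strategies compared in that article can be illustrated on a toy image. The sketch below is an assumption-laden illustration rather than the authors' pipeline: it uses a naive pure-Python discrete Fourier transform, a synthetic 8 x 8 "filament" instead of AIA data, and hypothetical function names. It only shows how each compression is computed and how its reconstruction error could be measured; it does not reproduce the paper's findings.

```python
import cmath
import math


def dft2(image, inverse=False):
    """Naive 2D discrete Fourier transform of a list-of-lists array."""
    n_rows, n_cols = len(image), len(image[0])
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / (n_rows * n_cols) if inverse else 1.0
    out = [[0j] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0j
            for x in range(n_rows):
                for y in range(n_cols):
                    phase = sign * 2.0 * math.pi * (u * x / n_rows + v * y / n_cols)
                    total += image[x][y] * cmath.exp(1j * phase)
            out[u][v] = total * scale
    return out


def keep_low_frequencies(spectrum, k):
    """Zero every coefficient whose wrapped frequency index exceeds k."""
    n_rows, n_cols = len(spectrum), len(spectrum[0])
    return [
        [
            spectrum[u][v]
            if min(u, n_rows - u) <= k and min(v, n_cols - v) <= k
            else 0j
            for v in range(n_cols)
        ]
        for u in range(n_rows)
    ]


def fourier_compress(image, k):
    """Round trip: transform, keep low frequencies only, transform back."""
    truncated = keep_low_frequencies(dft2(image), k)
    return [[value.real for value in row] for row in dft2(truncated, inverse=True)]


def bin_compress(image, factor):
    """Average factor x factor pixel blocks, then expand to the original size."""
    n_rows, n_cols = len(image), len(image[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r0 in range(0, n_rows, factor):
        for c0 in range(0, n_cols, factor):
            cells = [
                (r, c)
                for r in range(r0, min(r0 + factor, n_rows))
                for c in range(c0, min(c0 + factor, n_cols))
            ]
            mean = sum(image[r][c] for r, c in cells) / len(cells)
            for r, c in cells:
                out[r][c] = mean
    return out


def mse(a, b):
    """Mean squared difference between two equally sized 2D arrays."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


# Toy 8 x 8 "image": a thin diagonal filament on a dark background.
image = [[1.0 if abs(r - c) <= 1 else 0.0 for c in range(8)] for r in range(8)]
fourier_error = mse(image, fourier_compress(image, k=2))   # keeps 5 x 5 coefficients
binning_error = mse(image, bin_compress(image, factor=2))  # keeps 4 x 4 block means
```

The two settings here are not matched in storage size, so the two error values should not be read as a fair comparison. For real images a fast transform from a numerical library would replace the naive `dft2`.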
In Incorporating Polar Field Data for Improved Solar Flare Prediction, Aktukmak et al. employ data on the Sun's north and south polar field strengths to improve the performance of machine learning models for solar flare prediction. As global information, the polar field data, when combined with local data from active regions on the photospheric magnetic field of the Sun, can help classify individual solar flares. This is reflected in a 10.1% improvement in the Heidke Skill Score. A novel probabilistic mixture of experts model is proposed, which can simply and effectively incorporate polar field data and provide prediction performance on par with state-of-the-art solar flare prediction algorithms such as the Recurrent Neural Network (RNN).
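For readers unfamiliar with the Heidke Skill Score, the sketch below computes it for a binary flare/no-flare forecast from the four cells of a 2 x 2 contingency table. It uses the standard two-class form of the score; the function and argument names are chosen for this example, and whether the article uses exactly this variant is an assumption.

```python
def heidke_skill_score(tp, fp, fn, tn):
    """Heidke Skill Score for a 2 x 2 contingency table.

    tp: flares correctly forecast      fp: false alarms
    fn: missed flares                  tn: correct "no flare" forecasts
    The score is 1 for a perfect forecast and 0 for no skill over chance.
    """
    denominator = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    if denominator == 0:
        raise ValueError("HSS is undefined for this contingency table")
    return 2.0 * (tp * tn - fp * fn) / denominator
```

Because the score subtracts the agreement expected by chance, it is less sensitive than raw accuracy to the strong class imbalance typical of flare data, where non-flaring samples dominate.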

Author contributions
YC drafted the manuscript, and the other authors helped improve it.

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.