ORIGINAL RESEARCH article
Testing and Validating Two Morphological Flare Predictors by Logistic Regression Machine Learning
- 1Department of Physics, Aberystwyth University, Ceredigion, United Kingdom
- 2Department of Astronomy, Eötvös Loránd University, Budapest, Hungary
- 3Gyula Bay Zoltán Solar Observatory (GSO), Hungarian Solar Physics Foundation (HSPF), Gyula, Hungary
- 4Solar Physics and Space Plasma Research Center (SP2RC), School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom
- 5Astrophysics Research Centre (ARC), School of Mathematics and Physics, Queen’s University, Belfast, United Kingdom
Whilst the most dynamic solar active regions (ARs) are known to flare frequently, predicting the occurrence and magnitude of individual flares is very much a developing field with strong potential for machine learning applications. The present work is based on a method developed to define numerical measures of the mixed states of ARs with opposite polarities. The method yields compelling evidence for the assumed connection between the level of mixed states of a given AR and the level of the solar eruptive probability of this AR by employing two morphological parameters: 1) the separation parameter
A solar flare is a sudden flash observed in the solar atmosphere which is able to rapidly heat the plasma to megakelvin temperatures, while electrons, protons and heavier ions are accelerated to very large speeds (Benz, 2008). The associated accelerated particle clouds may reach the Earth, typically within a few hours or a day following a solar flare eruption. Flares produce radiation across the entire electromagnetic spectrum, but most of the released energy is spread over frequencies outside the visible range. For this reason, the majority of flares must be observed with instruments which take measurements in these wavelength ranges, e.g., those of the Geostationary Operational Environmental Satellite (GOES). Therefore, the most generally known flare classification scheme is the GOES flare class. Since 1975, flares have been classed as A, B, C, M, or X type according to the maximum X-ray flux measured near Earth at wavelengths from 0.1 to 0.8 nm. These five GOES flare intensity categories form a logarithmic scale, and each is further subdivided on a scale labeled from 1 to 9. The A-, B- and C-classes are the lowest energy release classes of solar flares and they also occur frequently in the solar atmosphere. The A to C-class range has no or hardly any detectable effect on Earth based on current instrumentation and understanding. The M-class medium flare category may cause smaller or occasionally more serious disruptions, e.g., radio blackouts. However, the X-intensity flares may cause strong to extreme hazardous events and facility break-downs (e.g., radio blackouts) on the daylight side of the Earth (Hayes et al., 2017). The major solar flares (M- and X-class) are often accompanied by accelerated solar energetic particles and coronal mass ejections (CMEs) (see, e.g., Tziotziou et al., 2010).
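The classification described above can be illustrated with a short sketch that maps a peak X-ray flux to a GOES label. The decade boundaries used here (1e-4 W/m² for X, 1e-5 for M, and so on) follow the standard GOES convention and are not quoted in the text; the function names are hypothetical and chosen only for illustration.

```python
# Lower flux bounds (W/m^2) of each GOES class in the 0.1-0.8 nm channel.
# Assumption: standard GOES convention, one class per decade of flux.
GOES_CLASS_FLOORS = [("X", 1e-4), ("M", 1e-5), ("C", 1e-6), ("B", 1e-7), ("A", 1e-8)]


def goes_class(peak_flux):
    """Return a GOES label such as 'M2.5' for a peak flux in W/m^2."""
    if peak_flux <= 0:
        raise ValueError("peak flux must be positive")
    for letter, floor in GOES_CLASS_FLOORS:
        if peak_flux >= floor:
            # The number after the letter is the flux in units of the class floor.
            return f"{letter}{peak_flux / floor:.1f}"
    # Below the nominal A-class floor: still report relative to A.
    return f"A{peak_flux / 1e-8:.1f}"


def is_major_flare(peak_flux):
    """Major flares are M- or X-class, i.e. at or above 1e-5 W/m^2."""
    return peak_flux >= 1e-5


for flux in (5e-7, 2.5e-5, 3e-4):
    print(flux, goes_class(flux), is_major_flare(flux))
```

Note that the number following the letter is a linear multiplier within a class, while the classes themselves are spaced by factors of ten.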
For solar activity modeling, a key ingredient is to determine the role of the associated observable magnetic field. Waldmeier (1938) proposed the first classification scheme to examine the connection between the size and morphology of active regions (ARs) and the capacity of their flare-productivity. This classification scheme is known today as the Zürich classification (see also Kiepenheuer, 1953). This scheme contains eight types thought to be representative of consecutive states in the evolution of a sunspot group. The classification system was further developed by McIntosh (1990). McIntosh introduced three more components based on characteristics including the Zürich class, the largest sunspot, and the sunspot distribution in an AR. Although the classification uses white-light observations only, it is still widely used.
The first magnetic classification scheme, known as the Mount Wilson classification, was introduced by Hale et al. (1919). It is simpler than the Zürich-McIntosh system, as it only distinguishes unipolar, bipolar, mixed configurations and very close and mixed configurations within a common penumbral feature, denoted by the letters α, β, γ and δ-class, respectively. Künzel (1960) added the δ-class configurations to the Mount Wilson system; these refer to the most productive sources of energetic flares (see, e.g., Schrijver, 2016, and references therein). All these classification schemes are useful in revealing potential connections between the morphological properties of sunspot groups and their flare-productivity. However, they are somewhat ambiguous, because besides some more objective measures they rely on a number of rather subjective elements to be identified by visual inspection.
The McIntosh and Mount Wilson classifications have been shown to be useful for grouping ARs by their expected flare productivity (Gallagher et al., 2002; Ireland et al., 2008; Bloomfield et al., 2012). However, further quantities derived from AR observations allow a physical comparison and deeper understanding of the actual causes of the solar eruptions. In this sense, different morphological parameters have been introduced to characterise the magnetic field configuration or highlight the existence of polarity-inversion-lines (PILs) in ARs, with varying sophistication (see, e.g., Barnes et al., 2016; Leka et al., 2018; Campi et al., 2019; Leka et al., 2019a; Leka et al., 2019b; Park et al., 2020, and references therein). Furthermore, Kontogiannis et al. (2018) investigated and tested some of those parameters, which were identified as efficient flare predictors. These parameters include, e.g., a quantity denoted as
The observed magnetic properties of an AR can be processed for the purpose of prediction by machine learning (ML) computational methods for data analysis (Camporeale, 2019), such as neural networks (Ahmed et al., 2013), support vector machines (Bobra and Couvidat, 2015; Boucheron et al., 2015), relevance vector machines (Al-Ghraibah et al., 2015), ordinal logistic regression (Song et al., 2009), decision trees (Yu et al., 2009), random forests (Liu et al., 2017; Domijan et al., 2019), and deep learning (Nishizuka et al., 2018). Notably, parameters
The content of the paper is as follows: Section 2 overviews in detail the two morphological parameters used for flare prediction in this work. Section 3 describes the data preparation process and key aspects of the adopted ML method. Section 4 shows the results of the analysis focusing on two morphological parameters in particular, while our conclusions are in Section 5.
2 Two Morphological Parameters
Korsós and Erdélyi (2016) introduced and tested, as a trial, an advantageous scheme that may be used as new prediction indicators besides the Zürich, McIntosh and Mount Wilson classification systems. This scheme includes two morphological parameters, namely:
• The separation parameter
where l and f refer to the leading and following polarities. The numerator denotes the distance between the area-weighted centers (therefore the index c) of the spots of leading and following polarities. Figure 1A gives a visual representation. The denominator is the diameter of a hypothetical circle (2 times the radius
The second introduced morphological parameter is the sum of the horizontal magnetic gradient
FIGURE 1. Figures illustrating the determination of the
Korsós and Erdélyi (2016) found that if
3 Data and Data Preparation
In this study, we further explore, test and validate the joint prediction capabilities of the
Four training sets were constructed to enforce consistency in time and to test robustness, corresponding to the 6-, 12-, 18- and 24-hr forecast issuing time intervals, because within a day the forecast reliability becomes more pronounced. The study takes as a reference the time of the largest flare event for each AR. For each issuing time interval, we consider the calculated
Similarly to Korsós and Erdélyi (2016), this study uses information on around 1,000 ARs extracted from the Debrecen Sunspot Data Catalogue between 1996 and 2015 (Baranyi et al., 2016). The catalogue contains information including centroid position in various coordinate systems, area, and magnetic field of sunspots and sunspot groups. Derived from spacecraft observations, the catalogue has entries every 1 hr for SDD and every 1.5 hr for HMIDD. The GOES flare catalogue is used for information on the largest-intensity flare eruption of each AR.
For each issuing time interval, two thirds of the ARs were randomly extracted to create a training set. These ARs are labeled as true(1) and false(0) events, under two different binary classification definition models:
• 1st model: When the largest intensity flare of an AR is M- or X-class then this case is classified as true(1), otherwise B- or C-class flares are false(0).
• 2nd model: Based on the results of Korsós and Erdélyi (2016), an event is true(1) if an AR is host to a M/X-class flare, satisfying
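The random two-thirds split and the 1st-model labeling can be sketched as follows. This is a minimal illustration only: the AR identifiers and flare classes are made up, the function names are hypothetical, and treating the remaining third as a held-out test set is an assumption. The additional threshold conditions of the 2nd model are not reproduced here.

```python
import random


def label_first_model(goes_class_label):
    """1st model: true (1) for an M- or X-class largest flare, false (0) otherwise."""
    return 1 if goes_class_label[0].upper() in ("M", "X") else 0


def split_two_thirds(records, seed=0):
    """Randomly place two thirds of the records in the training set.

    The remaining third is returned as a held-out set (an assumption).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(2 * len(shuffled) / 3)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical ARs: (identifier, largest GOES flare class).
active_regions = [("AR-a", "C3.2"), ("AR-b", "M1.0"), ("AR-c", "X2.1"),
                  ("AR-d", "B8.7"), ("AR-e", "C9.9"), ("AR-f", "M5.5")]

train, held_out = split_two_thirds(active_regions, seed=42)
train_labels = [label_first_model(cls) for _, cls in train]
print(train, train_labels)
```

Fixing the seed keeps the split reproducible, which matters when the same ARs are reused across the four issuing time intervals.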
The two different classification models were chosen to study whether the two morphological parameters perform better, either with or without (2nd or 1st model) thresholds. Often, a well-chosen threshold adjustment(s) could improve prediction capabilities of a method, as a warning level or as a warning sign. Furthermore, in the case of both model approaches as described above, the set of
Solar flare prediction is affected by strong class imbalances, in that there are far more negative examples (labeled as N) than positive ones (labeled as P). Therefore, we apply different metrics to measure the performance of the 1st and 2nd models. The performances of the two binary classifiers can be characterised by the confusion matrices in Figures 2, 3. Those confusion matrices summarise the True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) predictions. We adopt different metrics to quantify the impact performance of the
• Accuracy is the ratio of true positives plus true negatives over all events, or how often the prediction is correct: (TP + TN)/(P + N)
• Recall, also called the true positive rate or sensitivity, measures the proportion of actual positives that are correctly identified: TP/P
• Specificity, also called the true negative rate, measures the proportion of actual negatives that are correctly identified: TN/N
• Precision, also called positive predictive value. This is the ratio of true positives over all positive predictions: TP/(TP + FP).
• Negative predictive value (NPV) is the ratio of true negatives over all negative predictions: TN/(TN + FN).
• F1 score is the harmonic mean of recall (or sensitivity) and precision (or positive predictive value). It tells us how precise our two classifiers are, as well as how robust they are. A greater F1 score means that the performance of our model is better. Mathematically, F1 can be expressed as: 2/(1/Recall + 1/Precision)
• True Skill Statistic (TSS) is widely used to test the performance of forecasts (McBride and Ebert, 2000). TSS will be the preferred performance metric when comparing results of the 1st and 2nd model approaches with different N/P ratios because this metric is independent of the imbalance ratio (Woodcock, 1976; Bloomfield et al., 2012). TSS takes into account both omission and commission errors. The TSS parameter is similar to Cohen’s kappa approach (Shao and Halpin, 1995), and compares the predictions against the result of random guesses. TSS ranges from −1 to +1, where +1 indicates perfect agreement. A value of zero or less indicates a performance no better than random (Landis and Koch, 1977). TSS = TP/P − FP/N = Recall + Specificity − 1
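The seven metrics listed above follow directly from the four confusion-matrix counts. The sketch below computes them for made-up counts (not results from this study) and for a trivial classifier that always predicts "no flare", to show why accuracy alone can mislead under class imbalance. The function name is hypothetical.

```python
def skill_metrics(tp, tn, fp, fn):
    """Compute the listed metrics from confusion-matrix counts."""
    nan = float("nan")
    p = tp + fn  # all actual positives
    n = tn + fp  # all actual negatives
    recall = tp / p
    specificity = tn / n
    return {
        "accuracy": (tp + tn) / (p + n),
        "recall": recall,
        "specificity": specificity,
        # Undefined when the classifier never predicts the respective class.
        "precision": tp / (tp + fp) if (tp + fp) else nan,
        "npv": tn / (tn + fn) if (tn + fn) else nan,
        # Algebraically equal to the harmonic mean of recall and precision.
        "f1": 2 * tp / (2 * tp + fp + fn),
        "tss": recall + specificity - 1,
    }


# Hypothetical counts with a 5:1 negative-to-positive imbalance.
example = skill_metrics(tp=40, tn=230, fp=20, fn=10)

# A classifier that always predicts "no flare" on the same sample.
always_negative = skill_metrics(tp=0, tn=250, fp=0, fn=50)

for name, value in example.items():
    print(f"{name}: {value:.3f}")
print("always-negative accuracy:", round(always_negative["accuracy"], 3),
      "TSS:", always_negative["tss"])
```

In this illustration the always-negative classifier still reaches an accuracy above 80% while its TSS is zero, which is why TSS and F1 are the more informative scores for imbalanced samples.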
FIGURE 2. The result of the binary logistic regression of the 1st model with 6-, 12-, 18-, and 24-hr forecast issuing times for panels (A), (B), (C), and (D) respectively. The right side of each panel presents the corresponding Receiver Operating Characteristic (ROC) curves.
FIGURE 3. Same as Figure 2, but in the case of the 2nd model.
TABLE 1. Flare prediction capabilities with six metrics in the case of the two model approaches i.e., for 1st model and 2nd model.
These seven metric parameters are plotted as a function of forecast issuing times in Figure 4, where the blue/red lines stand for the 1st/2nd model. Based on the values of Table 1 and Figure 4, the two models have high accuracy for all forecast issuing times. In both models, the best accuracy is obtained for the 24-hr prediction window. We emphasise that the accuracy is a meaningful measure only if the values of FP and FN are similar in the confusion matrices of Figures 2, 3. For dissimilar values, the other metrics must be considered in evaluating the prediction performance of the two models.
FIGURE 4. The evolution of selected metrics as a function of forecast issuing times for the 1st (blue) and 2nd (red) model.
Next, we focus on the recall and specificity metrics, which show the probability that a model captures the correct classification during all four intervals. The specificity values show that the two models are capable of correctly classifying TN cases during all four intervals, especially the 1st model, for which the specificity is greater than 90%. Based on recall values, the TP classification of the 2nd model is 20% more accurate than that of the 1st model for the 6-, 12-, 18- and 24-hr forecast issuing times.
However, when the two models classify a new AR, we do not know the true outcome until after an event. Therefore, we are likely to be more interested in the probability that a decision of the two models is true. This is measured by the precision and NPV metrics. For the 1st model, the precision of the 24-hr prediction time is
The F1 and TSS metrics show that the 2nd model performs better than the 1st for all of the prediction windows. This is an important aspect because F1 and TSS are the most reliable scores in the presence of class imbalance. Intuitively, the F1 score is not as easy to understand as the accuracy, but it is usually more useful than accuracy, especially in our case, where we have an uneven class distribution. Namely, the F1 score of 77% shows that the 24-hr flare prediction window is the most efficient in the case of the 2nd model approach. Furthermore, the TSS values above 0.4 for the 2nd model show that this method is a good prediction scheme, and that the accuracy values of the 2nd model can be considered as correct.
We also use Receiver Operating Characteristic (ROC) curves to evaluate the results of the binary logistic regression method for both models. In the ROC plots in Figures 2, 3, the sensitivity (the true positive rate) is shown on the y-axis, ranging from 0 to 1 (0–100%). The false positive rate (1 − specificity) is plotted on the x-axis, also ranging from 0 to 1 (0–100%). The area under the curve (AUC) is a measure of the test’s performance at distinguishing positive and negative classes. In Figures 2, 3, the AUCs are above 0.7, which indicates a capability to distinguish between the positive and negative classes with more than 70% likelihood over the 6-, 12-, 18- and 24-hr prediction time windows. From Figure 2, the 1st model shows similar AUC values during the four prediction windows. In the case of the 2nd model, the predicting probabilities are also similar based on the AUC values of Figure 3. We further note that the predicting probabilities of the 2nd model are 10% lower than those of the 1st one, based on AUC values during the four prediction windows.
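How a ROC curve and its AUC arise from predicted probabilities can be sketched in a few lines. This is a plain-Python illustration with made-up labels and scores, not the pipeline used in this study; the function names are hypothetical. Each distinct score is used as a decision threshold, giving one (false positive rate, true positive rate) point, and the AUC is the trapezoidal area under those points.

```python
def roc_curve_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    p = sum(labels)          # number of actual positives
    n = len(labels) - p      # number of actual negatives
    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the highest score downwards.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n, tp / p))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical example: 1 = flaring AR, 0 = non-flaring AR,
# scores = predicted flare probabilities from some classifier.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_curve_points(labels, scores)
auc = area_under_curve(points)
print(points, auc)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, so values above 0.7 indicate that a randomly chosen positive case receives a higher score than a randomly chosen negative case more than 70% of the time.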
Korsós and Erdélyi (2016) introduced the separation parameter
In this work, a binary logistic regression machine learning approach is used to test and validate the flare prediction capability of the
• The morphological parameters give more than 70% flare prediction accuracy, based on logistic regression analysis. This result supports the findings of Kontogiannis et al. (2018) and Campi et al. (2019), who conclude that the
• Based on the F1 scores and the True Skill Statistic metrics, the joint flare prediction efficiency of the
• The best flare prediction capability of the two parameters is available with 24-hr forecast issuing time. This latter means that the
• However, good metric scores are obtained not only for the 24-hr prediction window, but also for the 6-, 12- and 18-hr windows. This means that the
• The limitation of this study is that the applied data are extracted from a given sunspot database. Therefore, another ML method (e.g., a Convolutional Neural Network) that is trained on the same SDO/HMI intensity and magnetogram data may assess further parameters to increase the predictive capability of the two morphological parameters.
We are aware that the two tested models are not perfect and so a natural question to ask is: how can we improve them further? In the future, we intend to further explore the application of these two warning parameters both from machine learning and physics perspectives: 1) fine tune the threshold conditions of 2nd model, 2) extend the application of the
Data Availability Statement
Publicly available datasets were analyzed in this study. This data can be found here: http://fenyi.solarobs.csfk.mta.hu/ftp/pub/SDO/data/. http://fenyi.solarobs.csfk.mta.hu/ftp/pub/SDD/data/
MK, RE, JL, and HM contributed to the conception and design of the study. MK performed the statistical analysis and wrote the first draft of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.
MK and HM are grateful to the Science and Technology Facilities Council (STFC), (United Kingdom, Aberystwyth University, grant number ST/S000518/1), for the support received while carrying out this research. RE is grateful to STFC (United Kingdom, grant number ST/M000826/1) and EU H2020 (SOLARNET, grant number 158538). RE also acknowledges support from the Chinese Academy of Sciences President’s International Fellowship Initiative (PIFI, grant number 2019VMA0052) and The Royal Society (grant nr IE161153). JL acknowledges the support from STFC under grant No. ST/P000304/1.
Conflict of Interest
The authors are grateful to the Referees for constructive comments and recommendations which helped to improve the readability and quality of the paper.
Ahmed, O. W., Qahwaji, R., Colak, T., Dudok De Wit, T., and Ipson, S. (2010). A new technique for the calculation and 3D visualisation of magnetic complexities on solar satellite images. Vis. Comput. 26, 385–395. doi:10.1007/s00371-010-0418-1
Ahmed, O. W., Qahwaji, R., Colak, T., Higgins, P. A., Gallagher, P. T., and Bloomfield, D. S. (2013). Solar flare prediction using advanced feature extraction, machine learning, and feature selection. Sol. Phys. 283, 157–175. doi:10.1007/s11207-011-9896-1
Al-Ghraibah, A., Boucheron, L. E., and McAteer, R. T. J. (2015). An automated classification approach to ranking photospheric proxies of magnetic energy build-up. Astron. Astrophys. 579, A64. doi:10.1051/0004-6361/201525978
Baranyi, T., Győri, L., and Ludmány, A. (2016). On-line tools for solar data compiled at the Debrecen Observatory and their extensions with the Greenwich sunspot data. Sol. Phys. 291, 3081–3102. doi:10.1007/s11207-016-0930-1
Barnes, G., Leka, K. D., Schrijver, C. J., Colak, T., Qahwaji, R., Ashamari, O. W., et al. (2016). A comparison of flare forecasting methods. I. Results from the all-clear workshop. Astrophys. J. 829, 89. doi:10.3847/0004-637X/829/2/89
Bloomfield, D. S., Higgins, P. A., McAteer, R. T. J., and Gallagher, P. T. (2012). Toward reliable benchmarking of solar flare forecasting methods. Astrophys. J. Lett. 747, L41. doi:10.1088/2041-8205/747/2/L41
Boucheron, L. E., Al-Ghraibah, A., and McAteer, R. T. J. (2015). Prediction of solar flare size and time-to-flare using support vector machine regression. Astrophys. J. 812, 51. doi:10.1088/0004-637X/812/1/51
Campi, C., Benvenuto, F., Massone, A. M., Bloomfield, D. S., Georgoulis, M. K., and Piana, M. (2019). Feature ranking of active region source properties in solar flare forecasting and the uncompromised stochasticity of flare occurrence. Astrophys. J. 883, 150. doi:10.3847/1538-4357/ab3c26
Domijan, K., Bloomfield, D. S., and Pitié, F. (2019). Solar flare forecasting from magnetic feature properties generated by the solar monitor active region tracker. Sol. Phys. 294, 6. doi:10.1007/s11207-018-1392-4
Hayes, L. A., Gallagher, P. T., McCauley, J., Dennis, B. R., Ireland, J., and Inglis, A. (2017). Pulsations in the Earth’s lower ionosphere synchronized with solar flare emission. J. Geophys. Res. 122, 9841–9847. doi:10.1002/2017JA024647
Ireland, J., Young, C. A., McAteer, R. T. J., Whelan, C., Hewett, R. J., and Gallagher, P. T. (2008). Multiresolution analysis of active region magnetic structure and its correlation with the Mount Wilson classification and flaring activity. Sol. Phys. 252, 121–137. doi:10.1007/s11207-008-9233-5
Kontogiannis, I., Georgoulis, M. K., Park, S.-H., and Guerra, J. A. (2017). Non-neutralized electric currents in solar active regions and flare productivity. Sol. Phys. 292, 159. doi:10.1007/s11207-017-1185-1
Kontogiannis, I., Georgoulis, M. K., Park, S.-H., and Guerra, J. A. (2018). Testing and improving a set of morphological predictors of flaring activity. Sol. Phys. 293, 96. doi:10.1007/s11207-018-1317-2
Korsos, M. B., Georgoulis, M. K., Gyenge, N., Bisoi, S. K., Yu, S., Poedts, S., et al. (2020). Solar flare prediction using magnetic field diagnostics above the photosphere. Astrophys. J. 896 (2), 119. doi:10.3847/1538-4357/ab8fa2
Leka, K. D., Barnes, G., and Wagner, E. (2018). The NWRA classification infrastructure: description and extension to the discriminant analysis flare forecasting system (DAFFS). J. Space Weather and Space Clim. 8, A25. doi:10.1051/swsc/2018004
Leka, K. D., Park, S.-H., Kusano, K., Andries, J., Barnes, G., Bingham, S., et al. (2019a). A comparison of flare forecasting methods. II. Benchmarks, metrics, and performance results for operational solar flare forecasting systems. Astrophys. J. Suppl. Ser. 243, 36. doi:10.3847/1538-4365/ab2e12
Leka, K. D., Park, S.-H., Kusano, K., Andries, J., Barnes, G., Bingham, S., et al. (2019b). A comparison of flare forecasting methods. III. Systematic behaviors of operational solar flare forecasting systems. Astrophys. J. 881, 101. doi:10.3847/1538-4357/ab2e11
Liu, C., Deng, N., Wang, J., and Wang, H. (2017). Predicting solar flares using SDO/HMI vector magnetic data product and random forest algorithm. Astrophys. J. 843 (2), 104. doi:10.3847/1538-4357/aa789b
McBride, J. L., and Ebert, E. E. (2000). Verification of quantitative precipitation forecasts from operational numerical weather prediction models over Australia. Weather Forecast 15, 103–121. doi:10.1175/1520-0434(2000)015<0103:VOQPFF>2.0.CO;2
Park, S.-H., Leka, K. D., Kusano, K., Andries, J., Barnes, G., Bingham, S., et al. (2020). A comparison of flare forecasting methods. IV. Evaluating consecutive-day forecasting patterns. Astrophys. J. 890, 124. doi:10.3847/1538-4357/ab65f0
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830. doi:10.1016/j.patcog.2011.04.006
Schrijver, C. J. (2016). The nonpotentiality of coronae of solar active regions, the dynamics of the surface magnetic field, and the potential for large flares. Astrophys. J. 820, 103. doi:10.3847/0004-637X/820/2/103
Song, H., Tan, C., Jing, J., Wang, H., Yurchyshyn, V., and Abramenko, V. (2009). Statistical assessment of photospheric magnetic features in imminent solar flare predictions. Sol. Phys. 254, 101–125. doi:10.1007/s11207-008-9288-3
Tziotziou, K., Sandberg, I., Anastasiadis, A., Daglis, I. A., and Nieminen, P. (2010). Using a new set of space-borne particle monitors to investigate solar-terrestrial relations. Astron. Astrophys. 514, A21. doi:10.1051/0004-6361/200912928
Keywords: morphological parameters, validation, binary logistic regression, machine learning, flare prediction
Citation: Korsós MB, Erdélyi R, Liu J and Morgan H (2021) Testing and Validating Two Morphological Flare Predictors by Logistic Regression Machine Learning. Front. Astron. Space Sci. 7:571186. doi: 10.3389/fspas.2020.571186
Received: 10 June 2020; Accepted: 11 December 2020;
Published: 18 January 2021.
Edited by:Peng-Fei Chen, Nanjing University, China
Reviewed by:Sergei Zharkov, University of Hull, United Kingdom
Keiji Hayashi, Stanford University, United States
Copyright © 2021 Korsós, Erdélyi, Liu and Morgan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: R. Erdélyi, email@example.com