BRIEF RESEARCH REPORT article
Comparison of Deep Learning Techniques to Model Connections Between Solar Wind and Ground Magnetic Perturbations
- 1Department of Physics & Astronomy and Space Science Center, University of New Hampshire, Durham, NH, United States
- 2Department of Electrical and Computer Engineering, University of New Hampshire, Durham, NH, United States
- 3Department of Physics and Geophysical Institute, University of Alaska Fairbanks, Fairbanks, AL, United States
Geomagnetically induced currents (GIC) can drive power outages and damage power grid components while also affecting pipelines and train systems. Developing the ability to predict local GICs is important to protecting infrastructure and limiting the impact of geomagnetic storms on public safety and the economy. While GIC data is not readily available, variations in the magnetic field, dB/dt, measured by ground magnetometers can be used as a proxy for GICs. We are developing a set of neural networks to predict the east and north components of the magnetic field, BE and BN, from which the horizontal component, BH, and its variation in time, dBH/dt, are calculated. We apply two techniques for time series analysis to study the connection of solar wind and interplanetary magnetic field properties obtained from the OMNI dataset to the ground magnetic field perturbations. The analysis techniques include a feed-forward artificial neural network (ANN) and a long-short term memory (LSTM) neural network. Here we present a comparison of both models' performance when predicting the BH component of the Ottawa (OTT) ground magnetometer for the year 2011 and 2015 and then when attempting to reconstruct the time series of BH for two geomagnetic storms that occurred on 5 August 2011 and 17 March 2015.
Geomagnetically induced currents (GICs) are one of the most significant space weather effects due to their potential to damage the power grid and can cause widespread, long-term power outages. Thus, the ability to forecast GICs is of significant interest to the space weather community, industry partners, and national interests. The intensity of GICs is determined by the strength of the geoelectric field. However, neither measurements of GICs nor the geoelectric field are readily available. The geoelectric field is driven by temporal changes in the magnetic field and the local geology. Thus, measurements of dB/dt using ground magnetometers are used as a proxy for studying GICs. Ngwira et al. (2018) studied two storms during which intense dB/dt peaks occurred and indicated that substorms appear to be the driver of GICs, but state that it's not clear how the widespread features of substorms lead to localized peaks in dB/dt. They theorize that it could be due to “the mapping of magnetospheric currents to local ionospheric structures,” but indicate that further study is needed. Physics based models are used to determine magnetic field fluctuations, but high resolution models are needed to obtain the spatially localized variations (Welling et al., 2019). Such models are computationally expensive and take longer time to run, posing challenges for their use as a forecasting tool. Machine learning based models have the potential for providing efficient, computationally inexpensive forecasts. Wintoft et al. (2015) developed models using Elman neural networks to predict the 30-min maximum of dBH/dt (horizontal component of dB/dt) from ACE solar wind and magnetic field measurements. Their models generally predict the timing of GICs caused by sudden impulses well, even when they train the model using only ACE magnetic field measurements.
While many studies of GICs focus on high magnetic latitudes (>60°) that lie under the auroral oval, it has been shown that mid- (50°–60°) and low- (<50°) latitude regions are also at risk (Gaunt and Coetzee, 2007; Ngwira et al., 2008; Pulkkinen et al., 2010; Oliveira et al., 2018). Lotz and Cilliers (2015) developed a neural network based model using solar wind and IMF inputs and dB/dt measurements at a Southern hemisphere mid-latitude station as outputs. They developed separate models for the north and east components of the geomagnetic field and found that fluctuations in the eastward component are more dependent on the interplanetary magnetic field (IMF) Bz. Similar to Wintoft et al. (2015), they found reasonable predictions of the timing of intense fluctuations, with less accuracy as the storm evolved.
More complex neural network architectures can be used to improve the predictions for time-series data. For example, recurrent neural networks such as long short-term memory (LSTM) techniques are used to "remember" parameters from earlier times that have a strong influence on the output features. In this study we present a comparison of models using a feed-forward artificial neural network (ANN) with a built-in time dependence and a LSTM neural network to predict the ground magnetic field north and east components (and therefore the perturbations dBH/dt) at the mid-latitude ground magnetometer station located in Ottawa (OTT). We then discuss the performance of the models by using two of the benchmark geomagnetic storms suggested by Welling et al. (2018) and Pulkkinen et al. (2013). Finally, we discuss several model variations that were implemented during the course of this study to determine possible improvements to the performance of the models.
For this study, we use solar wind and interplanetary magnetic field (IMF) data obtained from the OMNIWeb dataset available through NASA's Space Physics Data Facility from 1995 through 2010 for the purpose of training and validation of the models, and from 2011 and 2015 for testing. These 2 years were selected for testing because they include storms from the Pulkkinen-Welling validation set for ground magnetic perturbations (Pulkkinen et al., 2013; Welling et al., 2018). Baseline-removed ground magnetometer data from OTT has been obtained from SuperMag (Gjerloev, 2012). The Ottawa ground magnetometer is located at magnetic latitude 54.98° N and lags UT by 5 h (meaning local midnight occurs at 05:00 UT). The choice of using solar wind data from OMNI instead of the more traditional and continuously available geomagnetic indices is based on the long-term goal of being able to forecast variations in the ground magnetic field ahead of time, and real time solar wind parameters obtained at the L1 position tend to give a 30–40 min window to distribute a warning. However, the OMNI dataset has been mostly avoided in the past as it contains approximately 20% of missing data distributed roughly evenly through the years in the plasma parameters and ~ 8% in the IMF measurements. This was noted by Wintoft et al. (2015), leading them to compare models trained on just the magnetic field vs. combined magnetic field and plasma measurements. Our intention is to use both IMF and plasma parameters. Since a relatively continuous dataset is preferred for training, some linear interpolation has been done in the training/validation dataset of up to 10 min in all missing parameters, which reduces the missing values to ~ 6% in both IMF and plasma measurements. OTT ground magnetic field measurements have less than 1% of missing data during the period of study. Those missing data points have been removed from the training set. For the testing periods of 2011 and 2015 a full linear interpolation has been performed to the solar wind data to achieve a completely continuous dataset.
Although dBH/dt is the best proxy measurement to GIC forecasting, it is also very noisy and therefore difficult to forecast directly with data-driven models. Tóth et al. (2014) also found this to be true for first principles-based models. We therefore aim to first predict the northward and eastward components of the baseline-removed ground magnetic field, BN and BE, respectively, using two independent models and then combine them to obtain the predicted horizontal component
For the purpose of comparison with the metrics defined by Pulkkinen et al. (2013) we also need to obtain (dB/dt)H, which is calculated as
t is determined by the ground data resolution (1 min), and i represents the components N or E. We emphasize that large variations are most likely to result in significant GIC events.
3.1. Feed-Forward Artificial Neural Network
In our first attempt to forecast the BE and BN components we have chosen to train a fully-connected, feed-forward, artificial neural network (ANN) developed using the TensorFlow-Keras environment (Abadi et al., 2015). The Tensorflow-Keras environment is highly modular and allows the easy integration of data and different types of networks. It also allows integration and computation through GPUs instead of the traditional CPU computing which reduces significantly (in our case, up to a factor of 10) the training times. For the network architecture we have chosen a 4-layer deep network with hidden layers of 291-146-73-36 neurons plus a dropout layer (rate 0.1) in between the first and second layer to avoid overfitting. Such selection of neurons matches the dimensionality of the feature vectors and then each layer is halved. We have selected mean square error (MSE) as our loss function, ADAM as our optimizer, and a REctified Linear Unit (RELU) as our activation function. We track the loss in the validation dataset and stop the training after the MSE has not decreased for 25 epochs to avoid overfitting. To incorporate time dependence in the ANN, the input vector for t includes features from previous time steps, e.g., t − 1, t − 2. We have chosen to include a 2-h time-history for solar wind speed (VT, Vx, Vy, Vz), IMF (BT, Bx, By, Bz), proton density, dynamic pressure, temperature and solar wind electric field using a 1-min cadence for the first 12 preceding minutes (i.e., up to t − 12) plus 10-min averages over the entire interval (yielding 12 additional values). Additionally, ground magnetometer sin(MLT) and cos(MLT) values have been included to ensure a cyclical dependence over the Earth's rotation and solar zenith angle as a proxy of both longitude and yearly seasonality. The resulting feature vector thus contains 291 features and that explains our choice of neurons in the first layer. OMNI data from 1995 to 2010 was split sequentially 70% for training and 30% for validation. The model was then trained and the result used on the test data from 2011 to 2015.
Figure 1 (top row) presents a density plot of the log10(real) vs. log10(predicted) BH values calculated using the validation and test results obtained for BE and BN and Equation ( 1). The ANN performs a relatively good job at predicting values for the validation set with a root mean square error (RMSE) of ~8.7 nT and an explained variance of 39%, resulting in a correlation coefficient of 0.61. For the test cases, the respective RMSE values are ~ 9.2 nT with explained variance of 36% and correlation coefficient 0.60 for the year 2011 and an RMSE of ~14.7 nT and explained variance of 44% with a correlation of 0.66 for the year 2015. While the predictions could be improved, the consistency of the values of Figure 1 indicate that the model is not overfitting. The correlation coefficient values are a bit lower to those obtained by Lotz et al. (2017) of 0.71 and 0.69 for models predicting separate components of the horizontal magnetic field at a mid-latitude station. Wintoft et al. (2015) obtain much higher correlation coefficients, although they are only considering the maximum value within each 30-min window. The slightly worse RMSE values obtained for the year 2015 could be due to the higher variability of a year in the solar maximum. We have also plotted the error distributions for the predictions of the validation and 2011 and 2015 test sets (See Supplementary Material). The similarity in the error distributions is also an indication of a model that is not overfitting, consistent with Figure 1.
Figure 1. Density plots of real vs. predicted values of BH for the ANN (top) and the LSTM (bottom). The comparisons of actual measurements to model predictions using validation data (left column) and test data from 2011 (middle column) and 2015 (right column) are shown. The correlation coefficient, R, is given in each panel.
3.2. Long Short Term Memory
The long short-term memory (LSTM) (Hochreiter and Schmidhuber, 1997) recurrent neural network was developed using TensorFlow-Keras (Abadi et al., 2015). The input features used in the LSTM model were the same as were used for the ANN as was the training-validation split of the 1995-2010 OMNI data. The LSTM model requires an extra dimension in the input vector to build in the time history. To keep the models consistent with each other, the LSTM model also uses the preceding 12-min plus 10-min averages over the preceding 2 h time history. Memory limitations required the use of a custom data generator which fed batch sizes of 512 training samples sequentially into the model for training. The network consists of a single LSTM layer with 147 neurons and a single dense layer with one neuron. A RELU is used as the activation function, the optimizer is ADAM, and the loss function MSE. As with the ANN, validation loss was monitored and training was stopped after the MSE had not decreased in 25 epochs. Although recurrent neural networks, and in particular LSTM, are capable of utilizing the time history of the target parameter (in this case, BN or BE) to improve the prediction, in this study we have chosen not to use it in order to obtain a closer comparison with the ANN model. Once the model was trained, it was tested on data from 2011 and 2015.
The real vs. predicted values of BH for the validation and test datasets are shown as a density plot in Figure 1 (bottom row). The RMSE of the validation set is 8.7 nT with a 33% explained variance, 9.7 nT for the 2011 dataset with a 35% explained variance and 16.8 nT for the 2015 dataset with a 36% explained variance. The consistency of the correlation coefficients (shown in Figure 1) indicates that the model did not suffer from overfitting, which could have been a concern with the amount of input data. Considering these parameters, the LSTM model seems to under-perform the ANN. Similarly to the ANN case, the error distribution for the predictions of the LSTM model further indicate that the model is not severely overfitting (see Supplementary Material).
The two types of models we trained are used to predict BN and BE during storms that occurred on 5 August 2011 and 17 March 2015. The selection was based on the recommendations from the Pulkkinen-Welling validation set for ground magnetic perturbations (Pulkkinen et al., 2013; Welling et al., 2018). These two storms were selected because they are outside of the time range used to train and validate the models and because they correspond to two very different years in terms of geomagnetic activity, 2011 being on the minimum-ascending part of the solar cycle, and 2015 corresponding to the solar cycle maximum. The predicted values are used to calculate BH (Equation 1) and dBH/dt (Equation 2), and both values are compared to the real measurements. The results for a third storm from the validation set, 17 March 2013, are shown in the Supplementary Material.
4.1. August 5 2011 Storm
Figure 2 shows the temporal evolution of the 5 August 2011 geomagnetic storm,including SYM-H index, solar wind speed Vx, IMF magnetic field z-component, along with the measured BH and dBH/dt at Ottawa station, and the predicted BH and dBH/dt from the ANN and LSTM models. This is a strong storm with a minimum SYM-H value of –126 nT, driven by a combination of a coronal mass ejection and high speed stream, with the shock arriving at Earth leading to a sudden storm commencement around 18 UT on August 5. In this event, BH presents a clear response to the storm that shows as repeated dBH/dt variations of up to ~50 nT/min occurring for several hours during the storm main phase, following the period of quick increase in solar wind speed and mostly during the time in which the IMF Bz component is strongly and persistently southward.
Figure 2. August 5, 2011 storm measurements and predictions, including Sym-H index, solar wind velocity, interplanetary magnetic field z-component, measured (blue solid) and predicted (red dashed) horizontal magnetic field and time-dependent variation at OTT ground station for the ANN and LSTM models.
In terms of reconstructing the BH evolution, neither model does a good job of predicting the first two enhancements. The ANN does a decent job predicting the third and fourth enhancements. After ~23:00 UT on August 5, the ANN predicts some increases but does not match the magnitude likely due to these being later in the storm when the magnetic field perturbations are controlled more by parameters within the magnetosphere or ionosphere, although this is also a period in which the actual fluctuations in BH are relatively small. The LSTM doesn't do well predicting the enhancements during this storm in general. Considering the dBH/dt, the ANN comes close to predicting the timing of the biggest spike in dBH/dt at ~20:00 UT on August 5, at about half the magnitude, while the LSTM misses this completely.
4.2. March 17 2015 Storm
Figure 3 shows the evolution of the 17 March 2015 geomagnetic storm using the same format as Figure 2. This was the largest geomagnetic storm of solar cycle 24, with minimum SYM-H index of –234 nT. Carter et al. (2016) analyzed the ground magnetic perturbations during this storm, showing that the mid- and low-latitude fluctuations predominantly occurred at the sudden storm commencement. This indicates that the solar wind parameters are most important for predicting GICs at mid-latitude.
Figure 3. March 17, 2015 storm measurements and predictions, including SYM-H index, solar wind velocity, interplanetary magnetic field z-component, measured (blue solid) and predicted (red dashed) horizontal magnetic field and time-dependent variation at OTT ground station for the ANN and LSTM models.
Both models miss the initial spike at the sudden storm commencement just before 5:00 UT, and start to predict enhanced BH about 2–3 h later. Both models predict some enhancement near 8:00 UT, though matching the real enhancement is unlikely due to this being an interval of linear interpolation of the input data (as seen in the straight line in the solar wind velocity and IMF Bz over 7:00–9:00 UT). This linear interpolation results in the ANN overpredicting the magnitude and the LSTM predicts significant spikes. The LSTM does a better job of predicting the enhancement just after 12:00 UT, even getting close in overall magnitude of BH. Again, the linearly interpolated input data (~15:00–17:00 UT) reduces the ability to accurately predict the second half of this enhancement. Only the ANN predicts the largest enhancement at ~21:00 UT, but does not match the shape of the peak or timing.
It is important to note that in this particular storm, there are some gaps in the solar wind measurements, particularly between 06–08 UT and then on 16–18 UT on March 17. In order to generate a prediction, the solar wind data has been linearly interpolated. Despite the missing data, a simple linear interpolation allows the models to recover part of the variability of the ground magnetic field fluctuations, suggesting that efforts on gap-filling, including empirical modeling of the solar wind data could still yield positive results in the risk assessment of GICs.
4.3. Validation Metrics
The Pulkkinen-Welling recommendations include four metrics for validating predictions of ground magnetic perturbations (Pulkkinen et al., 2013; Welling et al., 2018) using binary event analysis. For a particular storm, the interval of interest is divided into non-overlapping, 20-min windows, and for each window, a value is calculated as the maximum measured dBH/dt within the window. In addition to the method of calculating (dB/dt)H using Equation (2), we use the power law empirical fitting for the OTT station described by Tóth et al. (2014),
and present both results for comparison. Pulkkinen-Welling propose four different thresholds of 0.3, 0.7, 1.1, and 1.5 nT/s to evaluate if the model is able to predict a variation of that magnitude. Since the thresholds are in nT/s, and our calculated dBH/dt is in nT/min, we multiply the thresholds by 60. Using the values of the predicted , we determine whether the model accurately predicts the events by calculating true positives, false positives, true negatives, and false negatives. Accuracy is determined by calculating the following metrics: probability of detection (POD), probability of false detection (PFD), proportion correct (PC), and Heidki Skill Score (HSS). Table 1 displays the metrics for the ANN and LSTM for the 2011 and 2015 storms. The missing values are due to no occurrences of the real and predicted values crossing the higher thresholds. The metrics for the 2013 storm are also shown in the Supplementary Material, and although they are a bit better than for the 2011 and 2015 storms, it suffers from having very few thresholds crossings beyond 0.3 nT/s for the Ottawa station.
Table 1. Validation metrics for the 05 August 2011 and the 17 March 2015 geomagnetic storms using dBH/dt maximum values every 20 min calculated from the Pulkkinen definition (Equation 2) and the Tóth empirical fitting (Equation 3).
The poor prediction of the LSTM for the 2011 storm is evident in the low POD and HSS values. The low PFD (lower is better) and high PC are indicative of the low numbers of real crossings of the threshold levels. The Tóth et al. (2014) empirical fitting method results in improved metrics for these models, as they found for a first principles-based model. In fact, the ANN has a similar HSS for the 0.3 nT/s threshold as they report for mid-latitude stations (0.583). However, our methods have a lot of potential for improvement. We had originally trained and optimized a single model of each type that predicted BH, rather than independent models for BN and BE. The single models of each type had better correlation coefficients, explained variance, and RMSE than the values discussed in section 3, but much lower validation metrics than those shown in Table 1. (We note that predicting only BH requires calculation of dBH/dt directly, rather than being able to use Equation 2, such that the metrics do not have a one-to-one comparison. We implemented separate modeling of the north and east components for more direct comparison with other models since the Pulkkinen et al. (2013) method is widely accepted.) Using this single model method, we also trained models that use only 24 min of time history but all at a 1-min cadence. The use of the 10-min averages produced higher explained variance scores of BH. However, this resulted in a smoothing of the BH prediction. This smoothing causes less variation in the dBH/dt predictions and decreases the HSS. Additionally, the LSTM performs poorly, likely due to the fact that we did not use the time history of the target parameter, despite that being the strength of the LSTM.
We have developed and compared two types of models that predict the north and east components of the ground magnetic field, BN and BE, at a single mid-latitude ground station. One model is a feed-forward artificial neural network that includes time dependence as input features and the other is a long short-term memory neural network. The predictions from each model are compared to real measurements for 2 years, 2011 and 2015, including a storm during each year. There is some ability for each of the models to predict the timing of magnetic field perturbations, though this ability is not consistently better for either model between the storms and neither is able to predict the magnitude of the enhancements or predict enhancements later in the storm. Validation metrics indicate that the LSTM is barely more skilled than random or constant predictions, and that using an empirical fitting improves HSS as it does for first principles-based models. Next steps to improve the models include adjustments of the input parameters, increased time history cadence, and comparison to additional time series techniques. Another limitation of these models is the use of only one ground magnetometer station. We expect better predictions if we include more ground stations in the mid-latitude range to get more MLT coverage.
Data Availability Statement
AK was the primary author of the paper, provided guidance on model development and analysis, and participated in interpretation. VP conducted data preparation, model development, analysis, and interpretation and assisted with writing. MC conducted model development and analysis and assisted with writing. MM provided guidance on data preparation and model development. CL assisted with data preparation. HC assisted with overall project discussions and provided editorial comments. All authors contributed to the article and approved the submitted version.
This work was supported by NSF EPSCoR Award OIA-1920965.
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We thank members of the project team at UNH and UAF and Chigo Ngwira that participated in discussions for this work. We thank OMNIWeb and SuperMag for providing the data.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fspas.2020.550874/full#supplementary-material
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., et al. (2015). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Software available from tensorflow.org.
Carter, B. A., Yizengaw, E., Pradipta, R., Weygand, J. M., Piersanti, M., Pulkkinen, A., et al. (2016). Geomagnetically induced currents around the world during the 17 March 2015 storm. JGR Space Phys. 121, 10496–10507. doi: 10.1002/2016JA023344
Ngwira, C. M., Pulkkinen, A., McKinnell, L. A., and Cilliers, P. J. (2008). Improved modeling of geomagnetically induced currents in the South African power network. Space Weather 6, 1–8. doi: 10.1029/2008SW000408
Ngwira, C. M., Sibeck, D., Silveira, M. V. D., Georgiou, M., Weygand, J. M., Nishimura, Y., et al. (2018). A study of intense local d B /d t variations during two geomagnetic storms . Space Weather 16, 676–693. doi: 10.1029/2018SW001911
Oliveira, D. M., Arel, D., Raeder, J., Zesta, E., Ngwira, C. M., Carter, B. A., et al. (2018). Geomagnetically induced currents caused by interplanetary shocks with different impact angles and speeds. Space Weather 16, 636–647. doi: 10.1029/2018SW001880
Pulkkinen, A., Rastätter, L., Kuznetsova, M., Singer, H., Balch, C., Weimer, D., et al. (2013). Community-wide validation of geospace model ground magnetic field perturbation predictions to support model transition to operations. Space Weather 11, 369–385. doi: 10.1002/swe.20056
Welling, D. T., Dimmock, A. P., L, R., Morley, S., and Yordanova, E. (2019). “Resolving small scale gic effects: what is our capability?,” in Abstract SH32B-05 Presented at 2019 Fall Meeting (San Francisco, CA:AGU).
Welling, D. T., Ngwira, C. M., Opgenoorth, H., Haiducek, J. D., Savani, N. P., Morley, S. K., et al. (2018). Recommendations for next-generation ground magnetic perturbation validation. Space Weather 16, 1912–1920. doi: 10.1029/2018SW002064
Keywords: space weather, GIC, geomagnetic storms, ground magnetic field, machine learning, neural network, LSTM
Citation: Keesee AM, Pinto V, Coughlan M, Lennox C, Mahmud MS and Connor HK (2020) Comparison of Deep Learning Techniques to Model Connections Between Solar Wind and Ground Magnetic Perturbations. Front. Astron. Space Sci. 7:550874. doi: 10.3389/fspas.2020.550874
Received: 10 April 2020; Accepted: 28 August 2020;
Published: 06 October 2020.
Edited by:Enrico Camporeale, University of Colorado Boulder, United States
Reviewed by:Gabor Toth, University of Michigan, United States
Christian L. Vásconez, National Polytechnic School, Ecuador
Copyright © 2020 Keesee, Pinto, Coughlan, Lennox, Mahmud and Connor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Amy M. Keesee, firstname.lastname@example.org