Probabilistic modelling of substorm occurrences with an echo state network

Nakano, Shin'ya; Kataoka, Ryuho; Nosé, Masahito; Gjerloev, Jesper W.

doi:https://doi.org/10.5194/angeo-41-529-2023

Articles | Volume 41, issue 2

https://doi.org/10.5194/angeo-41-529-2023

Articles | Volume 41, issue 2

Regular paper

21 Nov 2023

Regular paper |

| 21 Nov 2023

Probabilistic modelling of substorm occurrences with an echo state network

Shin'ya Nakano, Ryuho Kataoka, Masahito Nosé, and Jesper W. Gjerloev

Abstract

The relationship between solar-wind conditions and substorm activity is modelled with an approach based on an echo state network. Substorms are a fundamental physical phenomenon in the magnetosphere–ionosphere system, but the deterministic prediction of substorm onset is very difficult because the physical processes that underlie substorm occurrences are complex. To model the relationship between substorm activity and solar-wind conditions, we treat substorm onset as a stochastic phenomenon and represent the stochastic occurrences of substorms with a non-stationary Poisson process. The occurrence rate of substorms is then described with an echo state network model. We apply this approach to two kinds of substorm onset proxies. One is a sequence of substorm onsets identified from auroral electrojet intensity, and the other is onset events identified from activity of Pi2 pulsations, which are irregular geomagnetic oscillations often associated with substorm onsets. We then analyse the response of substorm activity to solar-wind conditions by feeding synthetic solar-wind data into the echo state network. The results indicate that the effect of the solar-wind speed is important, especially for Pi2 substorms. A Pi2 pulsation can often occur even if the interplanetary magnetic field (IMF) is northward, while the activity of auroral electrojets is depressed during northward IMF conditions. We also observe spiky enhancements in the occurrence rate of substorms when the solar-wind density abruptly increases, which might suggest an external triggering due to a sudden impulse of solar-wind dynamic pressure. It seems that northward turning of the IMF also contributes to substorm occurrences, though the effect is likely to be minor.

Download & links

Article (PDF, 2114 KB)

Download & links

How to cite.

Received: 22 Mar 2023 – Discussion started: 22 Mar 2023 – Revised: 23 Aug 2023 – Accepted: 11 Oct 2023 – Published: 21 Nov 2023

1 Introduction

Substorms are a fundamental physical phenomenon in the magnetosphere–ionosphere system. Since substorms are the main source of geomagnetic disturbances in the polar ionosphere causing geomagnetically induced currents (e.g. Viljanen et al., 2006; Wei et al., 2021; Schillings et al., 2022), prediction of substorms is an important issue. Substorms are most likely driven by the solar wind, so it is essential to understand the relationship between substorms and solar-wind conditions for predicting substorms. In this context, many studies have attempted to construct predictive models of the auroral electrojet indices, AU and AL, which represent the intensities of auroral electrojets in the polar ionosphere (Davis and Sugiura, 1966). For example, Luo et al. (2013) developed a parametric model for predicting the AU and AL indices from a time history of solar-wind data. There have also been a number of studies which employed machine learning approaches for predicting the AU and AL indices from given solar-wind conditions (e.g. Gleisner and Lundstedy, 1997; Takalo and Timonen, 1997; Amariutei and Ganushkina, 2012). Our previous study (Nakano and Kataoka, 2022) (hereinafter referred to as NK22) also modelled the relationship between the AU and AL indices and solar-wind variables using an echo state network (ESN) model, which is a kind of recurrent neural network originally introduced by Jaeger and Haas (2004).

However, a good prediction of the AU and AL indices does not necessarily guarantee successful prediction of substorm onsets. Many studies have argued that substorm onsets are triggered by internal processes of the magnetosphere–ionosphere system (e.g. Baker et al., 1996; Lui, 1996; Lyons et al., 2018; Miyashita and Ieda, 2018). Considering that substorms sometimes take place without any visible solar-wind variations (e.g. Nakano and Iyemori, 2005), substorms may be caused by complex magnetospheric processes which are hard to predict. It would thus be difficult to deterministically predict individual substorm onsets from a given time history of solar-wind conditions. As a matter of fact, the ESN in our previous study, NK22, did not always predict sharp decreases of the AL index due to a substorm expansion. Figure 1 compares the observed Kyoto AU and AL indices (World Data Center for Geomagnetism, Kyoto et al., 2015) and the prediction with the ESN. Although the prediction shown by the red line roughly traces the actual AU and AL indices, it did not reproduce many negative AL spikes.

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f01

Figure 1Prediction of AU and AL indices with the ESN in NK22 (red) and actual AU and AL indices calculated by the World Data Center, Kyoto (WDC Kyoto) (gray).

Download

In this study, we propose another approach for modelling the relationship between substorm occurrences and solar-wind input. To deal with the complexity underlying substorm occurrences, we treat a substorm onset as a stochastic phenomenon. The stochastic occurrence of substorms is represented with a non-stationary Poisson process, which is a probabilistic model describing event time series (e.g. Daley and Vere-Jones, 2003). The occurrence rate of substorms is then modelled with an echo state network (ESN) model. By introducing the ESN, we can represent the dependence of the occurrence rate on solar-wind conditions. Probabilistic prediction of substorm onsets was also performed by Maimaiti et al. (2019), who predicted a substorm occurrence for next 60 min from the time history of solar-wind data. In contrast, the purpose of this study is to model the response of substorm activity to given solar-wind condition. Our approach thus sequentially processes the time series of solar-wind data to obtain the instantaneous occurrence rate of onsets.

We applied the proposed ESN-based approach to two types of substorm onsets. One is a sequence of substorm onsets identified from the SML index (Newell and Gjerloev, 2011 a, b). The SML index is a proxy of westward auroral electrojet intensity derived from the SuperMAG geomagnetic data (Gjerloev, 2012) with the same algorithm as for the AL index. However, since the events determined from the auroral electrojet intensity may contain non-substorm events such as DP2-type convection enhancements (e.g. Nishida, 1968; Kamide and Kokubun, 1996), an increase of the auroral electrojet would not necessarily indicate a substorm onset. To identify substorm onsets without relying on the auroral electrojet intensity, we also analyse a sequence of substorm onsets identified from the Wp index (Nosé et al., 2009, 2012; World Data Center for Geomagnetism, Kyoto and Nose, 2016). The Wp index corresponds to the amplitude of low-latitude Pi2 pulsations. An onset determined from the Wp index is thus regarded as a Pi2 event. After training the ESN to model the two types of substorms, the relationship between substorm activity and solar-wind conditions is discussed by analysing the response of the ESN outputs to synthetic solar-wind inputs.

2 Method

We denote ν(t|β)dt as the probability of the occurrence of an event within a small time interval dt, where β is a parameter determining the shape of the function ν. The function ν is referred to as the intensity function, and it corresponds to the instantaneous occurrence rate per unit time. Given a sequence of event occurrence times $τ_{1 : N} = {τ_{1}, τ_{2}, \dots, τ_{N}}$ , the likelihood of the parameter β is written as follows (e.g. Daley and Vere-Jones, 2003):

\begin{matrix} (1) & L (β) = p (τ_{1 : N} | β) = \prod_{i = 1}^{N} ν (τ_{i} | β) \exp [- \int_{t_{0}}^{t_{K}} ν (t | β) d t], \end{matrix}

where p denotes probability density, t denotes time, and N is the number of events. The log-likelihood thus becomes

\begin{matrix} (2) & \log L (β) = \sum_{i = 1}^{N} \log ν (τ_{i} | β) - \int_{t_{0}}^{t_{K}} ν (t | β) d t . \end{matrix}

Discretising Eq. (2) in time, we obtain

\begin{matrix} (3) & \log L (β) = \sum_{i = 1}^{N} \log ν (τ_{i} | β) - \sum_{k = 0}^{K - 1} ν (t_{k} | β) Δ t, \end{matrix}

where $t_{k} = t_{0} + k Δ t$ . The time interval Δt in Eq. (3) is set to be 5 min in this study. If an event occurs at t=τ_i during large ν, the first term on the right-hand side of Eq. (2) gets larger. On the other hand, the second term on the right-hand side of Eq. (2) becomes a large negative number if ν keeps large values for the whole interval. The log-likelihood, log L, thus becomes larger by choosing an intensity function which becomes large when events frequently occur and small when events do not occur. We want to obtain a good intensity function which describes the occurrence rate of events better.

We model the function ν with the ESN (Jaeger and Haas, 2004), used in NK22 and another recent study (Kataoka and Nakano, 2021). Denoting the vector consisting of the state variables of the ESN at time t_k as x_k, the ith element of x_k, x_k,i is updated at each time step according to the following equation:

\begin{matrix} (4) & x_{k, i} = \tanh (w_{i}^{T} x_{k - 1} + u_{i}^{T} z_{k} + η_{i}), (i = 1, \dots, m), \end{matrix}

where z_k is the vector of the input variables, w_i is the vector of the weights connecting the state variables, u_i is the vector of the weights connecting the input variables and state variables, and m denotes the dimension of the vector x_k. The dimension m is set to be 1000 in this study. The input vector z_k is given as follows:

\begin{matrix} (5) & z_{k} = (\begin{array}{c} B_{x, k} / S_{B_{x}} \\ B_{y, k} / S_{B_{y}} \\ B_{z, k} / S_{B_{z}} \\ (V_{sw, k} - b_{V}) / S_{V} \\ (N_{sw, k} - b_{N}) / S_{N} \\ (Θ_{sw, k} - b_{Θ}) / S_{Θ} \\ \cos (2 π H_{k} / 24) \\ \sin (2 π H_{k} / 24) \\ \cos (2 π D_{k} / 364.24) \\ \sin (2 π D_{k} / 364.24) \end{array}), \end{matrix}

where B_z,k, B_y,k, and B_x,k respectively denote the z, y, and x components of the interplanetary magnetic field in the geocentric solar magnetospheric (GSM) coordinates at time t_k; V_sw,k is the x component of the solar-wind velocity in the GSM coordinates; and N_sw,k and Θ_sw,k are the solar-wind density and temperature, respectively. These solar-wind variables are taken from the OMNI 5 min data (King and Papitashvili, 2023). H_k and D_k respectively indicate universal time (UT) in hours and the day from the end of 2000 (i.e. D_k=1 on 1 January 2001) for determining the UT dependence and seasonal dependence (e.g. Cliver et al., 2000). $S_{B_{x}}$ , $S_{B_{y}}$ , $S_{B_{z}}$ , S_V, S_N, and S_T are rescaling factors to adjust the value of each element of z_k to a similar range, and b_V, b_N, and b_T are for adjusting the range of each element of z_k. According to NK22, we assume $S_{B_{x}} = S_{B_{y}} = S_{B_{z}} = 10 nT$ , $S_{V} = 500 km s^{- 1}$ , S_N=20 cm⁻³, S_T=10⁶ K, $b_{V} = 400 km s^{- 1}$ , b_N=1 cm⁻³, and $b_{T} = 2 \times 10^{5} K$ . The weights {w_i} and {u_i} are randomly given in advance and are fixed. The parameters {η_i} are also randomly given and are fixed. Specifically, we randomly chose 90 % of the weights {w_i} and {u_i} and set them to 0. The non-zero elements of u_i and η_i were given randomly from a standard normal distribution. The non-zero elements of w_i were also drawn from a normal distribution. The weights {w_i} were then rescaled such that the maximum singular value of the weight matrix is 0.99, where the weight matrix W is an m×m matrix defined as

\begin{matrix} (6) & W = (w_{1} w_{2} \dots w_{m}) . \end{matrix}

By setting the maximum singular value of W to be less than unity, it is guaranteed that the ESN forgets distant past inputs and that the weights can be stably determined. Although NK22 employed a leaky echo state network, this study uses a network without a leak term, because the fitting of the substorm occurrence probability was slightly degraded when the leak term was introduced.

We represent the function ν as an exponential of the weighted sum of the state variables as

\begin{matrix} (7) & ν (t | β) = \exp (β^{T} x_{k}), (t_{k} \leq t < t_{k + 1}) . \end{matrix}

Here, we have used the parameter vector β for the weights for determining the output of the ESN. The value of β is obtained with a Bayesian approach. We take the prior distribution of β as a Gaussian distribution as

\begin{matrix} (8) & p (β) = \frac{1}{\sqrt{(2 π σ^{2})^{m}}} \exp [- \frac{β^{T} β}{2 σ^{2}}] . \end{matrix}

where m denotes the dimension of x_k and is 1000. We determine the standard deviation, σ, based on the marginal likelihood (see Appendix A) to avoid overfitting and underfitting. We set σ=0.6 for analysing the SML substorm onsets in Sect. 3 and σ=1.4 for analysing the Wp onsets in Sect. 4. We estimate β such that the following posterior probability density is maximised:

\begin{matrix} (9) & p (β | τ_{1 : N}) = \frac{p (τ_{1 : N} | β) p (β)}{p (τ_{1 : N})} . \end{matrix}

As p(τ_1:N) does not depend on β, we can obtain the optimal β by maximising the following objective function:

\begin{matrix} (10) & \begin{aligned} J & = \log [p (τ_{1 : N} | β) p (β)] = \log L (β) + \log p (β) \\ = \sum_{i = 1}^{N} \log ν (τ_{i} | β) - \sum_{k = 0}^{K - 1} ν (t_{k} | β) Δ t \\ - \frac{β^{T} β}{2 σ^{2}} - \frac{m}{2} \log (2 π σ^{2}), \end{aligned} \end{matrix}

using the Newton–Raphson method.

3 Analysis of auroral electrojet substorms

First, we analysed the SuperMAG substorm list by Newell and Gjerloev (2011 a) derived from the SuperMAG data (Gjerloev, 2012). This SuperMAG substorm list identifies substorm onset times from the SML index, which indicates westward auroral electrojet intensity. The SuperMAG list thus enumerates the events in which a westward auroral electrojet was developed. We refer to such events as auroral electrojet substorms (AE substorms). We trained the ESN with data for the 10 years from 2005 to 2014 so that the ESN output represents well the occurrence rate of the substorm onsets in the SuperMAG list. We used 5 min values of the OMNI solar-wind data as the input. As in NK22, we start the comparison after spin-up of the ESN for 72 steps so that the ESN satisfactorily memorises the history of the input data. If more than half of the data were missing for 1 h, we stopped the prediction and spun up the ESN again for the subsequent 72 steps. We then predicted the intensity function ν which represents instantaneous occurrence rate of substorm onsets.

Figure 2 shows an example of the prediction of the occurrence rate for the 3 d from 6 to 8 June 2015. In the top panel, the red line indicates the occurrence rate ν (h⁻¹) obtained with the ESN. The AE substorm onset times in the SuperMAG list are also plotted with blue triangles in this panel. For comparison, the SMU and SML indices (Newell and Gjerloev, 2011a, b), interplanetary magnetic field (IMF), solar-wind speed, and solar-wind density are shown in the second, third, fourth, and bottom panels, respectively. The IMF shown in the third panel is expressed in GSM coordinates, and the x, y, and z components are indicated with the green, blue, and red lines, respectively. The predicted occurrence rate tended to increase when substorm onsets, indicated with blue triangles, were frequently observed. This means that the ESN predicted the occurrence pattern of the AE substorms well.

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f02

Figure 2Predicted occurrence rate ν (h⁻¹) for AE substorm onsets (a), the SMU and SML indices (b), IMF (c), solar-wind speed (d), and solar-wind density (e) for the 3 d from 6 to 8 June 2015. The blue triangles in the first panel indicate the substorm onsets in the SuperMAG list. In the third panel, the x, y, and z components in GSM coordinates are plotted with the green, blue, and red lines, respectively.

Download

To assess the performance of the ESN-based prediction, we calculate the predicted probability of the substorm occurrence for each hour, obtained as

\begin{matrix} (11) & \begin{aligned} P_{k} & = 1 - \exp [- \int_{t_{k}}^{t_{k} + 12 Δ t} ν (t | β) d t] \\ = 1 - \exp [- \sum_{j = 1}^{12} ν (t_{k} + (j - 1) Δ t | β) Δ t] . \end{aligned} \end{matrix}

Note that 12Δt corresponds to 1 h because Δt was taken to be 5 min. We classified the data into 10 classes by the predicted probability ( $0 \leq P_{k} < 0.1$ , $0.1 \leq P_{k} < 0.2$ , …, and 0.9≤P_k) and calculated the actual occurrence ratio for each class. Figure 3 shows the actual occurrence ratio with respect to the predicted probability for AE substorms in the 4 years from 2015 to 2018. For reference, the gray line indicates the case where the occurrence ratio is equal to the predicted probability. The result with the red line shows that the predicted probability obtained with the ESN agrees well with the actual occurrence ratio except that the prediction slightly overestimates the occurrence ratio for $0.4 \leq P_{k} < 0.8$ .

The performance of a probabilistic prediction is often evaluated with the Brier score, defined as

\begin{matrix} (12) & B = \frac{1}{K} \sum_{k = 0}^{K - 1} {(r_{k} - P_{k})}^{2}, \end{matrix}

where r_k denotes the actual result. If the substorm occurred during the period $t_{k} \leq t < t_{k + 1}$ , r_k=1; otherwise, r_k=0. For example, with a non-informative prediction when the prediction P_k is always 0.5, $(r_{k} - P_{k})^{2} = 0.25$ for the entire period, and the Brier score becomes B=0.25. The Brier score gets smaller as the prediction improves. For the data from 2015 to 2018, the Brier score of the prediction with the ESN was 0.100. This Brier score should be compared with the case where we assume a stationary Poisson process where the occurrence rate ν is constant. If a stationary Poisson process is assumed and ν is optimised to fit the same data as used for training the ESN, the probability of substorm occurrence per hour is 15.6 %. Using this value, the Brier score with the stationary Poisson process becomes 0.136, which is worse than the score with the ESN. The better score with the ESN confirms that the information on the solar wind, which is used as the input for the ESN, effectively improves the prediction about the substorm occurrence.

4 Analysis of Pi2 substorms

In the following, we conduct a prediction of Pi2 substorms with the same approach as used in the previous section. This study identifies the Pi2 substorm onset using the Wp index (Nosé et al., 2009, 2012; World Data Center for Geomagnetism, Kyoto and Nose, 2016). The Wp index corresponds to the amplitude of low-latitude Pi2 pulsations averaged over the nightside longitudes, which is derived based on the wavelet analysis (Nosé et al., 1998). Nosé et al. (2012) proposed criteria for identifying Pi2 onsets from the Wp index. However, one of the criteria assumes quiescence of geomagnetic activity before the onset. To take into account Pi2 onsets during disturbed time, this study uses a different method for identifying Pi2 onsets. Although the original Wp index has a 1 min resolution, our identification recipe uses moving averages with window sizes of 5 min to remove the effects of noisy oscillations. We detect a time of a peak of the moving averages, T, which satisfies the following:

\begin{matrix} (13) & \overline{Wp} (T) > \overline{Wp} (T - 1 \min), \\ (14) & \overline{Wp} (T) > \overline{Wp} (T - 2 \min), \\ (15) & \overline{Wp} (T) \geq \overline{Wp} (T + 1 \min), \\ (16) & \overline{Wp} (T) > \overline{Wp} (T + 2 \min), \end{matrix}

where $\overline{Wp} (T)$ denotes the moving average of the Wp index at time T. We then find the time of the minimum $\overline{Wp}$ within the 15 min interval before the time of the peak and define it as T−m min ( $1 \leq m \leq 15$ ). If $\overline{Wp} (T) - \overline{Wp} (T - m \min) > θ nT$ , we identify this increase of $\overline{Wp}$ as a Pi2 event. To determine the onset time, we calculate the difference of $\overline{Wp}$ between 5 min apart:

\begin{matrix} (17) & \begin{aligned} Δ \overline{Wp} (T - j \min) & = \overline{Wp} (T - [j - 2] \min) \\ - \overline{Wp} (T - [j + 2] \min) \end{aligned} \end{matrix}

for $j = 2, \dots, m$ . Finding $j = j^{*}$ which maximises $Δ \overline{Wp} (T - j \min)$ , the time $T - j^{*} \min$ is regarded as the onset time. We hereinafter refer to the Pi2 events identified with the above criteria as Pi2 substorms, although they may contain pseudo-breakups and other phenomena which are not normally classified as substorms. In our event detection method, the threshold θ is a tunable parameter which determines the sensitivity of the detection. We set θ=0.1 nT and considered two other cases: θ=0.15 nT and θ=0.2 nT for comparison, as described later.

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f03

Figure 3Actual occurrence ratio with respect to the predicted probability for AE substorms in the 4 years from 2015 to 2018 (red). The gray line shows the case where the predicted probability is equal to the actual occurrence ratio.

Download

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f04

Figure 4Predicted occurrence rate ν (h⁻¹) for Pi2 substorm onsets identified from the Wp index (a), the Wp index (b), SMU and SML indices (c), IMF (d), solar-wind speed (e), and solar-wind density (f) for the 3 d from 6 to 8 June 2015. The blue triangles in the top panel indicate the Pi2 substorm onsets identified from the Wp index. The meaning of the colour in the fourth panel is the same as in the third panel of Fig. 2.

Download

We trained the ESN with data for the 10 years from 2005 to 2014. Again, we used 5 min values of the OMNI solar-wind data as the input. We started the comparison after spin-up of the ESN for 72 steps. If more than half of the data were missing for 1 h, we stopped the prediction and spun up the ESN again for the subsequent 72 steps. We then predicted the occurrence rate ν for substorm onsets. Figure 4 shows the prediction of the occurrence rate of Pi2 substorms for the 3 d from 6 to 8 June 2015, which is the same event as in Fig. 2. When identifying the substorm onsets, the threshold θ was set to be 0.1 here. In the top panel, the red line indicates the occurrence rate (h⁻¹) estimated with the ESN. The second panel shows the Wp index, and the third panel shows the SMU and SML indices. The fourth, fifth, and bottom panels show the IMF in GSM coordinates, solar-wind speed, and solar-wind density, respectively. Similarly to Fig. 2, the predicted occurrence rate tended to increase when Pi2 substorm onsets, indicated with blue triangles, were frequently observed.

We also calculate the predicted probability of substorm occurrence for each hour defined in Eq. (11). As in the previous section, we classified the data into 10 classes by the predicted probability ( $0 \leq P_{k} < 0.1$ , …, 0.9≤P_k) and calculated the actual occurrence ratio for each class. Figure 5 shows the actual occurrence ratio with respect to the predicted probability for Pi2 substorms in the 4 years from 2015 to 2018. The predicted probability obtained with the ESN mostly agrees with the actual occurrence ratio although slightly underestimated. We also calculated the Brier score in Eq. (12). The Brier score values were 0.208, 0.180, and 0.150 with θ=0.1, θ=0.15, and θ=0.2, respectively. If a stationary Poisson process is assumed, the Brier score values were 0.236, 0.200, and 0.165 with θ=0.1, θ=0.15, and θ=0.2. The Brier score gets smaller as the threshold θ increases due to the number of events. If a larger threshold is taken, fewer events are identified as substorms. This situation tends to depress the predicted occurrence probability P_k. At the same time, as the number of events decreases, r_k in Eq. (12) takes a value of 0 more frequently and (r_k−P_k)² accordingly decreases in more cases. Thus the Brier score depends on the number of events.

Comparing with the case of AE substorm onsets, the difference in the Brier score between the non-stationary and stationary Poisson process is likely to be smaller for Pi2 substorms. This presumably indicates that predicting Pi2 substorms is more difficult than AE substorms. Figure 6 is a histogram of the frequency of P_k for Pi2 substorms identified with θ=0.1, Pi2 substorms identified with θ=0.2, and AE substorms. For AE substorms, P_k was below 0.1 in most cases. In contrast, it was relatively rare that P_k was lower than 0.1 for Pi2 substorms with θ=0.1. If P_k approaches to 1, we can confidently predict the occurrence of a Pi2 substorm. If P_k approaches to 0, we can confidently deny the occurrence of a Pi2 substorm. However, P_k for Pi2 substorms was between 0.2 and 0.6 for many cases, which means that it is usually difficult to confidently predict whether a Pi2 substorm will occur or not. If the threshold is taken to be θ=0.2, the frequency of P_k<0.1 increased but it was still much less than the case of AE substorms. Moreover, the frequency of P_k≥0.7 for Pi2 substorms with θ=0.2 is less than that for AE substorms. Overall, a Pi2 substorm onset seems to be less predictable than an AE substorm onset.

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f05

Figure 5Actual occurrence ratio with respect to the predicted probability for Pi2 substorms in the 4 years from 2015 to 2018 with an occurrence threshold of θ=0.15 (red). The gray line shows the case where the predicted probability is equal to the actual occurrence ratio.

Download

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f06

Figure 6Histogram of frequency of P_k for Pi2 substorms identified with θ=0.1 (red), Pi2 substorms identified with θ=0.2 (green), and AE substorms (blue) for the period from 2015 to 2018.

Download

https://angeo.copernicus.org/articles/41/529/2023/angeo-41-529-2023-f07

Figure 7Occurrence rate ν (h⁻¹) under synthetic data predicted with the ESN trained with the AE substorm data (a), occurrence rate ν (h⁻¹) under synthetic data predicted with the ESN trained with the Pi2 substorm data (b), the synthetic IMF (c), solar-wind speed (d), and solar-wind density (e). In the top and second panels, the period from Day 13 to Day 18, when the solar-wind density is oscillated, is shaded with green. In the second panel, momentary enhancements of the occurrence rate at the northward turning of the IMF are indicated with blue arrows (see text). In the third panel, the y and z components of the IMF in GSM coordinates are plotted with the blue and red lines, respectively.

Download

5 Response to synthetic solar wind

To determine what the ESN learnt from the data, we conducted an experiment analysing the response of the trained ESN to synthetic solar-wind data. We used similar synthetic solar-wind data to NK22 in which the solar-wind parameters are fixed at constant values except that one of the parameters varying as a rectangular wave of various periods. Figure 7 demonstrates the experiment with the synthetic solar-wind data over 18 d. The top panel shows the occurrence rate of AE substorms predicted with the ESN. The second panel shows the occurrence rate of Pi2 substorms predicted with the ESN. The third, fourth, and fifth panels display the IMF, solar-wind speed, and solar-wind density, respectively. In this experiment, the IMF B_x and B_y were set to 0 and the temperature was fixed at 5×10⁵ K. In the first 3 d, IMF B_z varied as a rectangular wave between −5 and 1 nT with a period of 20 min for the first day, 2 h for the second day, and 6 h for the third day, while the solar-wind speed was fixed at 400 km s⁻¹ and the density was fixed at 2 cm⁻³. In the next 3 d (Day 4 to Day 6), IMF B_z varied with the same pattern, but the solar-wind speed became 600 km s⁻¹. Next, IMF B_z was fixed at 1 nT, and the solar-wind speed followed a similar rectangular pattern for 3 d (Day 7 to Day 9). After that, IMF B_z became −5 nT and the solar-wind speed again followed the same rectangular pattern for 3 d (Day 10 to Day 12). The solar-wind speed was then fixed at 600 km s⁻¹, and the solar-wind density was perturbed with a similar rectangular pattern under a fixed IMF B_z of 1 nT (Day 13 to Day 15) and −5 nT (Day 16 to Day 18).

The result suggests that both AE substorms and Pi2 substorms tend to frequently occur when the IMF is southward and the solar-wind speed is high. Newell et al. (2016) discussed the importance of the solar-wind speed based on the analysis of substorm onsets identified from the SML index. The second panel suggests that the Pi2 substorms are also strongly dependent on the solar-wind speed. Indeed, the effect of the solar-wind speed seems to be more essential for Pi2 substorms. From Day 4 to Day 6 when the solar-wind speed was high, the occurrence rate of Pi2 substorms tended to be high even under northward IMF situations, while the frequency of AE substorms tended to be depressed by a northward IMF. This is interpreted as showing that Pi2 pulsations can occur under geomagnetically quiet conditions without an intense auroral electrojet, which would be consistent with the result of Kwon et al. (2013). The existence of Pi2 pulsations under quiet conditions explains the poor predictability of Pi2 substorms suggested in Fig. 6. Since AE substorms are rare during northward IMF, we can confidently deny the occurrence of a substorm when the IMF is northward. In contrast, since Pi2 substorms can often occur even when the IMF is northward, it would be rare that we can confidently deny the occurrence of a substorm.

In contrast with the response to the IMF and solar-wind speed, the response to the solar-wind density variation is spiky, especially for AE substorms (from Day 13 to Day 18, shaded with green in the top and second panels). This may be interpreted as showing that a substorm is triggered by a sudden impulse (SI) of solar-wind dynamic pressure, although it is possible that the SI effect on the geomagnetic variation is misidentified as a substorm occurrence. It is also notable that a momentary enhancement of the occurrence rate is seen at the northward turning of the IMF on Day 6 for Pi2 substorms (blue arrows in the second panel). This might suggest that Pi2 substorms tend to be triggered by a northward turning of the IMF, as suggested by Lyons et al. (1997). However, when we tried the same experiment with the ESN trained with Pi2 substorms identified with a threshold of θ=0.2, the momentary enhancement of the predicted occurrence rate at the northward turning became unclear. Thus, the response to the northward turning of the IMF is not a distinct feature. As suggested by other studies (Morley and Freeman, 2007; Wild et al., 2009), the northward turning of the IMF is not likely to be essential to substorm occurrence even though the northward turning may give favourable conditions for a substorm occurrence.

Excluding spikes of the substorm occurrence rate concurrent with jumps in the solar-wind density, a higher solar-wind density is likely to have increased the occurrence rate from Day 13 to Day 15 when the IMF was weak and northward. In contrast, from Day 16 to Day 18 when the IMF was southward, the occurrence rate did not show a clear change between before and after jumps of solar-wind density except for spikes due to density jumps especially for AE substorms. These results suggest that the effect of solar-wind density depends on the IMF. When the IMF is weak and northward, the substorm occurrence rate is higher when solar-wind density is higher. Meanwhile, when the IMF is southward, the dependence of the occurrence rate on the solar-wind density is not clear except that the solar-wind density jumps can affect the occurrence rate. This trend is similar to the compound effect of the solar-wind density and IMF on auroral electrojets as identified by NK22 and other studies (Ebihara et al., 2019; Kataoka et al., 2023). The solar-wind density effect on the occurrence rate may thus contribute to the dependence of the auroral electrojet intensity on the solar-wind density. However, according to NK22, a similar compound effect is also observed in the AU index, which represents an eastward electrojet. Since eastward electrojets are not considered part of the substorm current system (e.g. Kamide and Kokubun, 1996), the occurrence rate cannot fully explain the compound effect on the auroral electrojets indicated by NK22.

6 Concluding remarks

This study proposed an approach for analysing event time series of substorm onsets with the occurrences influenced by solar-wind conditions. We treat a substorm onset as a stochastic phenomenon and represent the stochastic occurrences of substorms with a non-stationary Poisson process. The occurrence rate of substorms is then described with an echo state network model with a time series of solar-wind data as an input. The echo state network allows us to diagnose the relationship between the substorm occurrence rate and solar-wind conditions. We applied this proposed approach to a sequence of AE substorms and a sequence of Pi2 substorms. The ESN successfully predicted the probability of occurrences of AE and Pi2 substorms.

We also conducted experiments with a synthetic solar-wind data set and obtained the following results.

The occurrence rate is enhanced under a higher solar-wind speed and southward IMF for both AE and Pi2 substorms. The solar-wind speed effect is more important for Pi2 substorms than for AE substorms. While a high-speed solar wind can bring Pi2 substorms even under a northward IMF, AE substorms are rarely observed under a northward IMF even if the solar-wind speed is high.
The occurrence rate is enhanced at an abrupt jump of the solar-wind density. A northward turning of the IMF is also likely to momentarily enhance the occurrence rate of Pi2 substorms, although the effect of a northward turning seems to be minor.
A compound effect of the solar-wind density and IMF is also suggested. When the IMF is weak and northward, a higher solar-wind density is likely to cause a higher occurrence rate of substorms. In contrast, when the IMF is southward, the solar-wind density does not seem to make a clear effect on the occurrence rate of substorms except for spikes due to density jumps.

Appendix A: Parameter determination with marginal likelihood

Parameters of a Bayesian prior distribution are often determined based on the marginal likelihood (e.g. Morris, 1983; Casella, 1985). We determine the standard deviation of the prior distribution, σ, by maximising the marginal likelihood which is defined as

\begin{matrix} (A1) & p (τ_{1 : N} | σ) = \int p (τ_{1 : N} | β) p (β | σ) d β, \end{matrix}

where we denote the prior distribution in Eq. (8) as the conditional distribution given σ and p(β|σ). Using Eq. (10), Eq. (A1) is reduced to

\begin{matrix} (A2) & p (τ_{1 : N} | σ) = \int \exp [J] d β . \end{matrix}

Denoting the optimal value which maximising J as $\hat{β}$ , it should satisfy $\nabla J (\hat{β}) = 0$ . The Taylor expansion of J around $\hat{β}$ is thus

\begin{matrix} (A3) & J (β) = J (\hat{β}) + \frac{1}{2} (β - \hat{β})^{T} \nabla^{2} J (\hat{β}) (β - \hat{β}) + \dots, \end{matrix}

where $\nabla^{2} J (\hat{β})$ is the Hessian matrix at $β = \hat{β}$ . This Hessian matrix is negative definite because J is maximised at $β = \hat{β}$ . This second-order approximation of J yields an approximation of Eq. (A1) as follows:

\begin{matrix} (A4) & \begin{aligned} p (τ_{1 : N} | σ) \\ \approx \int \exp [J (\hat{β}) + \frac{1}{2} (β - \hat{β})^{T} \nabla^{2} J (\hat{β}) (β - \hat{β})] d β \\ = \frac{\sqrt{(2 π)^{m}} \exp [J (\hat{β})]}{\sqrt{| - \nabla^{2} J (\hat{β}) |}}, \end{aligned} \end{matrix}

where $| - \nabla^{2} J (\hat{β}) |$ is the determinant of the Hessian matrix of −J at $β = \hat{β}$ . This approximation is sometimes referred to as Laplace's approximation (e.g. Bishop, 2006). We chose the standard deviation, σ, which maximises the logarithm of the approximate marginal likelihood:

\begin{matrix} (A5) & \log p (τ_{1 : N} | σ) = J (\hat{β}) - \frac{1}{2} \log | - \nabla^{2} J (\hat{β}) | + \frac{m}{2} \log 2 π . \end{matrix}

Data availability

The AU and AL indices are available from the website of the WDC for Geomagnetism, Kyoto (http://wdc.kugi.kyoto-u.ac.jp/wdc/Sec3.html; World Data Center for Geomagnetism, Kyoto et al., 2015). The SMU and SML indices are available from the SuperMAG website (https://supermag.jhuapl.edu; Gjerloev, 2012). The SuperMAG substorm list can also be acquired through the SuperMAG website. The Wp index is available via the website of Masahito Nosé (https://www.isee.nagoya-u.ac.jp/~nose.masahito/s-cubed/; World Data Center for Geomagnetism, Kyoto and Nosé, 2016). The OMNI solar-wind data were acquired from the OMNIWeb of NASA/GSFC (https://omniweb.gsfc.nasa.gov/; King and Papitashvili, 2023).

Author contributions

SN conceived and conducted the analysis. RK contributed to the scientific interpretation. MN and JWG contributed to data processing and interpretation of the data.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

We acknowledge the substorm timing list identified by the Newell and Gjerloev technique (Newell and Gjerloev, 2011 a), the SMU and SML indices (Newell and Gjerloev, 2011 b), and the SuperMAG collaboration (Gjerloev, 2012).

Financial support

This research has been supported by the Japan Society for the Promotion of Science (grant no. 17H01704).

Review statement

This paper was edited by Anna Milillo and reviewed by two anonymous referees.

References

Amariutei, O. A. and Ganushkina, N. Y.: On the prediction of the auroral westward electojet index, Ann. Geophys., 30, 841–847, https://doi.org/10.5194/angeo-30-841-2012, 2012. a

Baker, D. N., Pulkkinen, T. I., Angelopoulos, V., Baumjohann, W., and McPherron, R. L.: Neutral line model of substorms: Past results and present view, J. Geophys. Res., 101, 12975–13010, https://doi.org/10.1029/95JA03753, 1996. a

Bishop, C. M.: Pattern recognition and machine learning, Springer, New York, ISBN: 978-0387310732, 2006. a

Casella, G.: An introduction to empirical Bayes data analysis, Amer. Stat., 39, 83–87, 1985. a

Cliver, E. W., Kamide, Y., and Ling, A. G.: Mountain and valleys: Semiannual variation of geomagnetic activity, J. Geophys. Res., 105, 2413–2424, 2000. a

Daley, D. J. and Vere-Jones, D.: An introduction to the theory of point processes, Vol. I, Elementary theory and method, 2nd Edn., Chap. 7, Springer, New York, ISBN: 978-0387955414, 2003. a, b

Davis, T. N. and Sugiura, M.: Auroral electrojet activity index AE and its universal time variations, J. Geophys. Res., 71, 785–801, 1966. a

Ebihara, Y., Tanaka, T., and Kamiyoshikawa, N.: New diagnosis for energy flow from solar wind to ionosphere during substorm: Global MHD simulation, J. Geophys. Res., 124, 360–378, https://doi.org/10.1029/2018JA026177, 2019. a

Gjerloev, J. W.: The SuperMAG data processing technique, J. Geophys. Res., 117, A09213, https://doi.org/10.1029/2012JA017683, 2012 (data available at https://supermag.jhuapl.edu/, last access: 19 November 2023). a, b, c

Gleisner, H. and Lundstedy, H.: Response of the auroral electrojets to the solar wind modled with neural networks, J. Geophys. Res., 102, 14269–14278, 1997. a

Jaeger, H. and Haas, H.: Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication, Science, 304, 78–80, https://doi.org/10.1126/science.1091277, 2004. a, b

Kamide, Y. and Kokubun, S.: Two-component auroral electrojet: Importance for substorm studies, J. Geophys. Res., 101, 13027–13046, 1996. a, b

Kataoka, R. and Nakano, S.: Reconstructing solar wind profiles associated with extreme magnetic storms: A machine learning approach, Geophys. Res. Lett., 48, e2021GL096275, https://doi.org/10.1029/2021GL096275, 2021. a

Kataoka, R., Nakano, S., and Fujita, S.: Machine learning emulator for physics-based prediction of ionospheric potential response to solar wind variations, Earth Planets Space, 75, 139, https://doi.org/10.1186/s40623-023-01896-3, 2023. a

King, J. and Papitashvili, N.: One min and 5-min solar wind data sets at the Earth's bow shock nose, NASA/GSFC [data set], https://omniweb.gsfc.nasa.gov/html/HROdocum.html (last access: 20 November 2023), 2023. a

Kwon, H.-J., Kim, K.-H., Jun, C.-W., Takahashi, K., Lee, D.-H., Jin, H., Seon, J., Park, Y.-D., and Hwang, J.: Low-latitude Pi2 pulsations during intervals of quiet geomagnetic conditions (K_p≤1), J. Geophys. Res., 118, 6145–6153, https://doi.org/10.1002/jgra.50582, 2013. a

Lui, A. T. Y.: Current disruption in the Earth's magnetosphere: Observations and models, J. Geophys. Res., 101, 13067–13088, https://doi.org/10.1029/96JA00079, 1996. a

Luo, B., Li, X., Temerin, M., and Liu, S.: Prediction of the AU, AL, and AE indices using solar wind parameters, J. Geophys. Res., 118, 7683–7694, https://doi.org/10.1002/2013JA019188, 2013. a

Lyons, R. R., Blanchard, G. T., Samson, J. C., Lepping, R. P., Yamamoto, T., and Moretto, T.: Coordinated observations demonstrating external substorm triggering, J. Geophys. Res., 102, 27039–27051, https://doi.org/10.1029/97JA02639, 1997. a

Lyons, R. R., Zou, Y., Nishimura, Y., Gallardo-Lacourt, B., Angelopoulos, V., and Donovan, E. F.: Stormtime substorm onsets: occurrence and flow channel triggering, Earth Planets Space, 70, 81, https://doi.org/10.1186/s40623-018-0857-x, 2018. a

Maimaiti, M., Kunduri, B., Ruohoniemi, J. M., Baker, J. B. H., and House, L. L.: A deep learning-based approach to forecast the onset of magnetic substorms, Space Weather, 17, 1534–1552, https://doi.org/10.1029/2019SW002251, 2019. a

Miyashita, Y. and Ieda, A.: Revisiting substorm events with preonset aurora, Ann. Geophys., 36, 1419–1438, https://doi.org/10.5194/angeo-36-1419-2018, 2018. a

Morley, S. K. and Freeman, M. P.: On the association between northward turnings of the interplanetary magnetic field and substorm onsets, Geophys. Res. Lett., 34, L08104, https://doi.org/10.1029/2006GL028891, 2007. a

Morris, C. M.: Parametric empirical Bayes inference: theory and applications, J. Amer. Statist. Assoc., 78, 47–55, 1983. a

Nakano, S. and Iyemori, T.: Storm-time field-aligned currents on the nightside inferred from ground-based magnetic data at mid latitudes: Relationships with the interplanetary magnetic field and substorms, J. Geophys. Res., 110, A07216, https://doi.org/10.1029/2004JA010737, 2005. a

Nakano, S. and Kataoka, R.: Echo state network model for analyzing solar-wind effects on the AU and AL indices, Ann. Geophys., 40, 11–22, https://doi.org/10.5194/angeo-40-11-2022, 2022. a

Newell, P. T. and Gjerloev, J. W.: Evaluation of SuperMAG auroral electrojet indices as indicators of substorms and auroral power, J. Geophys. Res., 116, A12211, https://doi.org/10.1029/2011JA016779, 2011a. a, b, c

Newell, P. T. and Gjerloev, J. W.: Substorm and magnetosphere characteristic scales inferred from the SuperMAG auroral electrojet indices, J. Geophys. Res., 116, A12232, https://doi.org/10.1029/2011JA016936, 2011b. a, b

Newell, P. T., Liou, K., Gjerloev, J. W., Sotirelis, T., Wing, S., and Mitchell, E. J.: Substorm probabilities are best predicted from solar wind speed, J. Atmos. Sol.-Terr. Phys., 146, 28–37, https://doi.org/10.1016/j.jastp.2016.04.019, 2016. a

Nishida, A.: Coherence of geomagnetic DP 2 fluctuations with interplanetary magnetic variations, J. Geophys. Res., 73, 5549–5559, 1968. a

Nosé, M., Iyemori, T., Takeda, T., Kamei, T., Milling, D. K., Orr, D., Singer, H. J., Worthington, E. W., and Sumitomo, N.: Automated detection of Pi2 pulsations using wavelet analysis: 1. Method and an application for substorm monitoring, Earth Planets Space, 50, 773–783, 1998. a

Nosé, M., Iyemori, T., Takeda, M., Toh, H., Ookawa, T., Cifuentes-Nava, G., Matzka, J., Love, J. J., McCreadie, H., Tuncer, M. K., and Curto, J. J.: New substorm index derived from high-resolution geomagnetic field data at low latitude and its comparison with AE and ASY indices, in: Proc. of XIIIth IAGA Workshop on Geomagnetic Observatory Instruments, Data Acquisition, and Processing, U.S. Geological Survey Open-File Report 2009-1226, edited by: Love, J. J., 202–207, U.S. Geological Survey, https://doi.org/10.3133/ofr20091226, 2009. a, b

Nosé, M., Iyemori, T., Wang, L., Hitchman, A., Matzka, J., Feller, M., Egdorf, S., Gilder, S., Kumasaka, N., Koga, K., Matsumoto, H., Koshiishi, H., Cifuentes-Nava, G., Curto, J. J., Segarra, A., and Çelik, C.: Wp index: A new substorm index derived from high-resolution geomagnetic field data at low latitude, Space Weather, 10, S08002, https://doi.org/10.1029/2012SW000785, 2012. a, b, c

Schillings, A., Palin, L., Opgenoorth, H. J., Hamrin, M., Rosenqvist, L., Gjerloev, J. W., Juusola, L., and Barnes, R.: Distribution and occurrence frequency of dB/dt spikes during magnetic storms 1980–2020, Space Weather, 20, e2021SW002953, https://doi.org/10.1029/2021SW002953, 2022. a

Takalo, J. and Timonen, J.: Neural network prediction of AE data, Geophys. Res. Lett., 24, 2403–2406, 1997. a

Viljanen, A., Tanskanen, E. I., and Pulkkinen, A.: Relation between substorm characteristics and rapid temporal variations of the ground magnetic field, Ann. Geophys., 24, 725–733, https://doi.org/10.5194/angeo-24-725-2006, 2006. a

Wei, D., Dumlop, M. W., Yang, J., Dong, X., Yu, Y., and Wang, T.: Intense $d B / d t$ variations driven by near-earth bursty bulk flows (BBFs): A case study, Geophys. Res. Lett., 48, e2020GL091781, https://doi.org/10.1029/2020GL091781, 2021. a

Wild, J. A., Woodfield, E. E., and Morley, S. K.: On the triggering of auroral substorms by northward turnings of the interplanetary magnetic field, Ann. Geophys., 27, 3559–3570, https://doi.org/10.5194/angeo-27-3559-2009, 2009. a

World Data Center for Geomagnetism, Kyoto and Nose, M.: Geomagnetic Wp index, Data Analysis Center for Geomagnetism and Space Magnetism, Graduate School of Science, Kyoto University [data set], https://doi.org/10.17593/13437-46800, available at https://www.isee.nagoya-u.ac.jp/~nose.masahito/s-cubed/ (last access: 20 November 2023), 2016. a, b

World Data Center for Geomagnetism, Kyoto, Nosé, M., Iyemori, T., Sugiura, M., and Kamei, T.: Geomagnetic AE index, Data Analysis Center for Geomagnetism and Space Magnetism, Graduate School of Science, Kyoto University [data set], https://doi.org/10.17593/15031-54800, available at http://wdc.kugi.kyoto-u.ac.jp/wdc/Sec3.html (last access: 19 November 2023), 2015. a

Articles

Short summary

Substorms are a phenomenon in the magnetosphere–ionosphere system, which are characterised by brightening of an auroral arc and enhancement of electric currents in the polar ionosphere. Since substorms are difficult to predict, this study treats a substorm occurrence as a stochastic phenomenon and represents the substorm occurrence rate with a machine learning model. We then analyse the response of substorm activity to solar wind conditions by feeding synthetic solar wind data into the model.