A Solar-wind-driven Empirical Model of Pc3 Wave Activity at a Mid-latitude Location

In this paper we describe the development of two empirical models of Pc3 wave activity observed at a ground station. The models are tasked to predict pulsation intensity at Tihany, Hungary, from the OMNI solar wind data set at 5 min time resolution. One model is based on artificial neural networks and the other on multiple linear regression. Input parameters to the models are iteratively selected from a larger set of candidate inputs. The optimal set of inputs comprises solar wind speed, interplanetary magnetic field orientation (via cone angle), proton density and solar zenith angle (representing local time). Solar wind measurements are shifted in time with respect to Pc3 data to account for the propagation time of ULF perturbations from upstream of the bow shock. Both models achieve a correlation of about 70 % between measured and predicted Pc3 wave intensity. The timescales at which the most important solar wind parameters influence pulsation intensity are calculated for the first time. We show that solar wind speed influences pulsation intensity at much longer timescales (about 2 days) than cone angle (about 1 h).


Introduction
Pulsations in the Pc3 band (22-100 mHz) have been observed to be influenced by conditions in the region upstream of the magnetosphere since the first in situ observations of the solar wind (SW) plasma. Subsequent studies investigating the sources of geomagnetic pulsations (especially in the Pc3-Pc5 bands) suggested that it is primarily high solar wind speed (V sw ) and radial interplanetary magnetic field (IMF) direction that facilitate the generation of dayside pulsations and their penetration into the magnetosphere, and that the frequency of Pc3 pulsations is linearly related to the magnitude of the IMF. Saito (1964) first showed that V sw is related to pulsation intensity, and Bol'shakova and Troitskaya (1968) found that pulsations preferentially occur during intervals when the IMF is approximately aligned with the Sun-Earth line.
The relationships discovered in the 1960s and 1970s have been confirmed by more recent studies (e.g. Verő, 1980; Wolfe and Meloni, 1981; Chi et al., 1998; Chugunova et al., 2007; Heilig et al., 2007). Most of these related solar-wind-based parameters to pulsation activity by computing the linear correlations between parameters for specific events or over longer intervals (e.g. Heilig et al., 2007, who conducted a statistical analysis of data spanning 132 days). Wolfe and Meloni (1981) used a multiple linear regression (MLR) analysis and found that kinetic energy flux, in addition to V sw , drives Pc3 wave activity. Heilig et al. (2010) used MLR and artificial neural network (NN) based techniques to model the solar wind control of Pc3 wave intensity at 1 h time resolution. They found that pulsation intensity is best predicted by solar wind speed, dynamic pressure, IMF orientation (quantified by the cone angle ϑ B x = cos −1 (|B x |/B), first defined by Greenstadt and Olson, 1976), and local time. We improve on this model by developing a high time resolution (5 min running means of 1 min data) model of pulsation intensity, with solar wind parameters as input.
The empirical models developed here are based on estimating the causal relationship between solar wind plasma and magnetic field parameters and the intensity of Pc3 waves measured on the ground. Two distinct methods of modelling are employed: one model is based on NNs, and the other on MLR.

Published by Copernicus Publications on behalf of the European Geosciences Union.

S. Lotz et al.: Empirical Pc3 models

The structure of the NN-based model can be visualised as a directed graph with input nodes connected via weighted connections to a layer of intermediate nodes, which are in turn connected via weights to the output node. In this case the input nodes are the SW plasma, IMF and local time parameters and the output is Pc3 wave intensity (defined later). Network nodes are computational structures that apply a sigmoidal function (tanh in this paper) to the sum of all incoming signals. Each incoming signal is multiplied by the value of the connection weight. Developing the model involves "training" of the network. Training of the network means applying an optimisation algorithm known as error back-propagation to a large set of simultaneous observations of input and output parameters to adapt the weights connecting the nodes.
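The node computation described above can be illustrated with a minimal sketch of the forward pass only; back-propagation training is not shown. The bias terms, the use of tanh on the output node, the number of hidden nodes and all weight values are assumptions made for illustration and are not taken from the models developed in this paper.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass through one tanh hidden layer and a single tanh output node."""
    # Each hidden node applies tanh to the weighted sum of its incoming signals.
    hidden = [
        math.tanh(bias + sum(w * x for w, x in zip(weights, inputs)))
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output node does the same with the hidden-layer activations.
    return math.tanh(output_bias + sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical example: four scaled inputs and three hidden nodes.
example_inputs = [0.2, -0.5, 0.1, 0.7]
example_hidden_weights = [
    [0.1, -0.3, 0.2, 0.05],
    [0.4, 0.1, -0.2, 0.3],
    [-0.2, 0.2, 0.1, -0.1],
]
example_hidden_biases = [0.0, 0.1, -0.1]
example_output_weights = [0.5, -0.4, 0.3]
prediction = forward(
    example_inputs,
    example_hidden_weights,
    example_hidden_biases,
    example_output_weights,
    0.0,
)
```

Because tanh is bounded, inputs and target values would normally be scaled to a comparable range before training; that step is left out here.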
Multiple linear regression assumes that the relationship between n input parameters x 1 , ..., x n and one output parameter Y may be written as the sum of a product and a constant (D), with inputs raised to powers a 1 , ..., a n determined by linear regression (C is a constant):

$Y = C \, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n} + D$

Taking the logarithm of Y (after setting D = 0) results in the sum

$\log Y = \log C + a_1 \log x_1 + \dots + a_n \log x_n$,

which may be solved by linear regression. A second regression is applied to calculate D. See Heilig et al. (2010) for a description of MLR model development.
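A minimal sketch of the first regression step (D = 0) is given below: the logarithms of the inputs are regressed on the logarithm of the output by ordinary least squares, solved here through the normal equations. The function names are hypothetical, the second regression for D is not shown, and all inputs and outputs are assumed to be strictly positive.

```python
import math


def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_power_law(rows, y):
    """Fit log Y = log C + sum_i a_i log x_i and return (C, [a_1, ..., a_n])."""
    design = [[1.0] + [math.log(v) for v in row] for row in rows]
    target = [math.log(v) for v in y]
    k = len(design[0])
    # Normal equations: (A^T A) coeffs = A^T target
    ata = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    aty = [sum(d[i] * t for d, t in zip(design, target)) for i in range(k)]
    coeffs = solve_linear_system(ata, aty)
    return math.exp(coeffs[0]), coeffs[1:]
```

For a large data set a library least-squares routine would be the more robust choice; the explicit normal equations are used here only to keep the sketch self-contained.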
The models developed here use solar wind parameters and a local time parameter as input, and a Pc3 wave activity index at Tihany, Hungary (at L ≈ 1.8) as the predicted output. Therefore these models are appropriate for mid-latitude locations where the dominant source of daytime Pc3 wave activity is the direct propagation of upstream waves deep into the magnetosphere (e.g. Heilig et al., 2010). At higher latitudes other sources and wave propagation channels could be important, necessitating the development of separate high-latitude models. Specifically, Heilig et al. (2010) found a decrease in the influence of cone angle and local time on Pc3 wave intensity with increasing latitude.
The paper is laid out as follows: the data sets utilised are discussed in the next section. We also describe how measurements of solar wind parameters are shifted in time with respect to ground-based Pc3 measurements, to account for the propagation of waves through the sheath and magnetosphere. In the third section the development of the model is explained. The fourth section discusses the results, including the apparent timescales of SW influence on Pc3 waves. We conclude this investigation in Sect. 5 with a discussion of the results.

Data sets utilised
Estimating pulsation intensity from SW parameters with the empirical methods used in this study relies on analysing a large set of input (solar wind) and output (Pc3) parameter data. Measurements from the interval 2002-2007 are utilised. The data collected from the sources listed later in this section are 1 min averages of 1 s pulsation measurements and 16 or 64 s solar wind parameter measurements. The models are developed utilising 5 min running averages of the 1 min SW parameter and Pc3 pulsation data sets.

Data sources
Solar wind data are collected from the high-resolution OMNI (HRO) data sets (see http://omniweb.gsfc.nasa.gov/). The HRO sets list measurements of the solar wind plasma and magnetic field at 1 min averages. Measurements are made by a number of spacecraft, including ACE, Geotail and Wind. Measurements are shifted in time to account for the propagation time of the solar wind plasma from the spacecraft to the nose of the bow shock, and averaged to 1 min values.
The Pc3 pulsation activity utilised in this study is recorded in the horizontal (H) component of the geomagnetic field at the Tihany (Intermagnet code THY) geophysical observatory at (46.90° N, 17.89° E) geographic and (42.44° N, 92.39° E) geomagnetic coordinates, and L ≈ 1.84. The recording instrument is a fluxgate magnetometer with 1 Hz sampling rate, recording in picotesla (pT). H-component measurements are band-pass filtered in the Pc3 band (22-100 mHz) and the 1 min root-mean-square of the filtered data is defined as the Pc3 index Pc3 ind (first used by Heilig et al., 2007).
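The definition of the index can be sketched as follows, assuming a 1 Hz H-component series in pT. The filter actually applied to the Tihany data is not specified in the text; the ideal (brick-wall) band-pass implemented with a direct discrete Fourier transform is an assumption chosen for clarity, and the function names are hypothetical.

```python
import cmath
import math


def bandpass_dft(samples, fs_hz, f_lo_hz, f_hi_hz):
    """Ideal band-pass filter: zero every DFT bin outside [f_lo_hz, f_hi_hz]."""
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * fs_hz / n  # frequency magnitude of bin k
        if not (f_lo_hz <= freq <= f_hi_hz):
            spectrum[k] = 0
    # Inverse DFT; the imaginary part vanishes for a real input signal.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def pc3_index(h_pt, fs_hz=1.0, window_s=60):
    """One-minute RMS of the 22-100 mHz band-passed H component (pT)."""
    filtered = bandpass_dft(h_pt, fs_hz, 0.022, 0.100)
    step = int(window_s * fs_hz)
    return [
        math.sqrt(sum(v * v for v in filtered[i:i + step]) / step)
        for i in range(0, len(filtered) - step + 1, step)
    ]
```

The direct transform scales with the square of the record length, so a fast Fourier transform or a time-domain filter would be preferable for day-long records.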

Time delay from bow shock to ground
The perturbations that cause pulsations of the Earth's field have to propagate from the upstream region across the magnetosheath and into the magnetosphere for Pc3 waves to be observed on the ground. To accurately model the relationship between upstream activity and pulsation intensity on the ground, the upstream data set is shifted in time with respect to the downstream data (ground measurements). The total propagation time is the sum of the propagation from the upstream spacecraft to the bow shock (Δt u ), the propagation through the sheath (Δt sh ) and the propagation through the outer magnetosphere (Δt sp ) up to the point where the incoming wave either couples to a field line resonant mode or propagates directly into the ionosphere: Δt = Δt u + Δt sh + Δt sp .
Measurements from OMNI are shifted in time to account for the solar wind flow from the spacecraft position to the subsolar point on the bow shock nose; i.e. the HRO data set includes the time delay Δt u .
According to Clausen et al. (2009) the propagation of compressional waves through the magnetosheath that drive pulsations in the magnetosphere can be calculated using the process described by Khan and Cowley (1999). Propagation through the sheath is determined by sheath width, solar wind speed (V sw ), plasma flow speed into the magnetopause (V mp ) and the shock jump ratio, which depends on γ = 5/3 and the magnetosonic Mach number M ms . Khan and Cowley (1999) give the propagation time through the sheath, Δt sh (in seconds), in terms of these quantities. The bulk speed of plasma flow into the magnetopause V mp is approximated (Khan and Cowley, 1999) as 20 km s −1 . The distance from the Earth to the bow shock nose is R bs ; this parameter is included in the HRO data set. The distance to the magnetopause in R E is approximated by R mp = 110.2 (V sw ² N p ) −1/6 , obtained by balancing the magnetic force from the Earth's field with the solar wind pressure (e.g. Walker and Russell, 1995). Calculating the propagation time through the sheath region for 2002-2007 shows that Δt sh varies between 0 and 24 min (fractions of minutes rounded down), with a mean of 5 min. Delays longer than about 10 min occur while the particle density in the solar wind is low (N p below about 1 cm −3 ), leading to an inflated magnetosphere. Low-density anomalies are caused by the rarefaction of solar wind plasma associated with high-speed solar wind streams from coronal holes (Usmanov, 2005), causing the solar wind plasma to become sub-Alfvénic and the magnetosphere to inflate to very large stand-off distances, e.g. 53 R E during 1999 (Le et al., 2000).
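A sketch of the sheath delay calculation is given below. Only the magnetopause distance formula and the value V mp = 20 km s −1 are taken from the text. The shock jump ratio is assumed to be the standard ideal-gas density jump, and the transit time is assumed to follow from a flow speed that falls linearly from V sw /r just behind the bow shock to V mp at the magnetopause; both expressions are assumptions about the Khan and Cowley (1999) procedure, not quotations of it. R E = 6371 km and the function names are likewise assumptions.

```python
import math

R_E_KM = 6371.0  # assumed Earth radius in km


def shock_jump_ratio(m_ms, gamma=5.0 / 3.0):
    """Assumed density jump across the shock for magnetosonic Mach number m_ms."""
    return (gamma + 1.0) * m_ms ** 2 / ((gamma - 1.0) * m_ms ** 2 + 2.0)


def magnetopause_distance_re(v_sw_kms, n_p_cm3):
    """Magnetopause stand-off distance in R_E: R_mp = 110.2 (V_sw^2 N_p)^(-1/6)."""
    return 110.2 * (v_sw_kms ** 2 * n_p_cm3) ** (-1.0 / 6.0)


def sheath_transit_time_s(r_bs_re, r_mp_re, v_sw_kms, m_ms, v_mp_kms=20.0):
    """Transit time (s) through the sheath for a linearly decreasing flow speed.

    Assumes the post-shock speed v_sw / r exceeds v_mp_kms.
    """
    v_bs = v_sw_kms / shock_jump_ratio(m_ms)  # speed just behind the bow shock
    width_km = (r_bs_re - r_mp_re) * R_E_KM
    # Integral of dx / v(x) for v falling linearly from v_bs to v_mp_kms.
    return width_km / (v_bs - v_mp_kms) * math.log(v_bs / v_mp_kms)
```

For illustrative values of V sw = 400 km s −1 , N p = 5 cm −3 , R bs = 14 R E and M ms = 6, this sketch gives a stand-off distance of roughly 11 R E and a delay of a few minutes, the same order as the 5 min mean quoted above.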
It is assumed that the fast mode wave is responsible for carrying ULF energy through the magnetosphere at the Alfvén speed V A toward the Earth (Clausen et al., 2009). The Alfvén speed in the magnetosphere, based on electron density measurements, varies between about 1000 km s −1 in the inner magnetosphere (L ≈ 3-6) (Fraser et al., 1988) and about 3000 km s −1 in the outer magnetosphere (L ≈ 6-10) (e.g. Burton et al., 1970). The magnetopause standoff distance R mp varies between 7.3 and 26.9 R E during the 2002-2007 interval. Assuming V A = 2000 km s −1 throughout the magnetosphere, Δt sp < 85 s throughout the 2002-2007 interval.
To test the effect of applying Δt to solar wind parameters, the correlation between ϑ B x and Pc3 ind (with and without the shift) is calculated. The correlations of ϑ B x (t − Δt) with Pc3 ind (t) are slightly higher than the correlations between ϑ B x (t) and Pc3 ind (t) for each year (see Table 1).
The propagation time through the sheath and the magnetosphere can also be estimated from empirical data. This was done by a cross-correlation analysis between the 1 min resolution Pc3 ind observed at THY in 2003 and ϑ B x . Correlations were calculated in 6 h long time windows. The distribution of the strongest correlation is shown in the top panel of Fig. 1. Correlations are overwhelmingly negative, as expected for the cone angle dependence. In the bottom left panel the time lag distribution corresponding to the cases when the correlation was negative peaks at −2.5 min (the mean lag is −2.7 min). This result somewhat contradicts the applied sheath propagation model, suggesting that the model overestimates the propagation time. This discrepancy is not a subject of the present study. We also calculated the correlations presented in Table 1 with a constant 3 min time shift (not shown). Both time shifts applied have only a slight (although systematic) influence on the correlations, and the resulting correlations are very close to each other. Hence, the choice of the way the data are shifted does not significantly influence the results presented in the paper. However, the increase of the correlations due to the time shift is an important indicator, which helps us to identify the possible channels by which the Pc3 waves can get deep into the magnetosphere.
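The lag search within a single time window can be sketched as below. The sign convention (a positive lag means the ground response follows the solar wind driver), the restriction to non-negative lags and the function names are assumptions for illustration; the synthetic series only demonstrates that a known delay is recovered.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(driver, response, max_lag):
    """Return (lag, r) for the lag in samples with the largest |r|."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        # Pair driver[i] with response[i + lag].
        r = pearson(driver[:len(driver) - lag], response[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic check: the response is the negated driver delayed by 3 samples.
rng = random.Random(0)
cone = [rng.uniform(0.0, 90.0) for _ in range(500)]
pc3 = [0.0] * 3 + [-c for c in cone[:-3]]
lag, r = best_lag(cone, pc3, max_lag=10)
```

Applied to consecutive 6 h windows of 1 min data, the returned lags would form a distribution comparable to the one shown in Fig. 1.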

Selection of data sets
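Since Pc3 pulsations occur predominantly on the dayside of the Earth (e.g. Yumoto, 1985; Le and Russell, 1994; Heilig et al., 2010), only measurements taken during local (THY) day time are considered. Seasonal and diurnal variations are captured by solar zenith angle χ. The solar zenith angle changes throughout the day, from 90° at sunrise, to a minimum value at noon that depends on the location and season, to values larger than 90° after sunset. During geomagnetically active periods the Pc3 band is flooded with storm time wave activity driven by mechanisms other than those driving Pc3 waves. Therefore, the model development is based on SW and (shifted) Pc3 ind measurements that coincide with instances where χ < 90° and Kp < 4.
Figure 2 shows how the distribution of Pc3 ind changes when the selection criteria (χ < 90° and Kp < 4) are applied. Vertical lines indicate the median of each data set. The solid black stepped curve is the distribution of all observations (2002-2007). When Pc3 ind is restricted to quiet periods (blue curve) the distribution becomes narrower and the median decreases due to the absence of large perturbations during active periods. The bulk of Pc3 energy is observed on the dayside of the Earth, as is clear from the flatter distribution (indicating a larger fraction of large perturbations) when restricted to χ < 90° (dashed steps). Combining χ < 90° and Kp < 4 (red curve) results in a flatter (or slightly narrower) distribution than when only Kp < 4 (or χ < 90°) is enforced.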
In the development of these models, we use two distinct sets of data. The training set (TRN) is used to adapt the weights during the NN training process, and to determine the constants in MLR. The test set (TST), which is distinct from the TRN set, is used to objectively gauge the performance of the models. The TRN set is compiled from selected data from 2002, 2004 and 2006 by randomly selecting 50 000 data points from each year (i.e. 150 000 in length). Similarly, 7500 data points from each of 2003 and 2005 are randomly selected to construct the TST set (resulting in a set of length 15 000).

Model development
The development of the NN and MLR models involves selecting the set of solar-wind-based parameters that best relates to Pc3 wave activity. In order to make this selection we iteratively add input parameters to the model from a set of candidate inputs. This process yields an optimal subset of input parameters from the larger set of candidates.

Candidate input parameters
The set of candidate input parameters consists of six SW parameters and one local-time-related quantity. The six solar wind and IMF parameters are solar wind speed V sw , proton density N p , interplanetary electric (E) and magnetic (B) field magnitude, cone angle ϑ B x and Alfvénic Mach number M A . The solar zenith angle (χ), derived from UT time and the location of the geomagnetic observatory (THY), is included in the set of input parameters to represent the local time dependence of Pc3 wave intensity. Table 2 lists the correlation coefficient (ρ) between the candidate inputs and Pc3 ind for data selected as described in Sect. 2.3. The negative correlation with χ is due to the increase in Pc3 wave activity as local time heads toward noon, where χ has a local minimum. A strong correlation between the pulsation index Pc3 ind and V sw suggests the downstream convection of perturbations by the solar wind. The cone angle effect is evident through the strong negative correlation between ϑ B x and Pc3 ind . There are no strong correlations between Pc3 ind and E, B, N p or M A . It is important to note that these are merely linear correlations and that higher-order dependencies are not resolved by ρ. For this reason the importance of solar wind parameters to Pc3 generation cannot be discounted due to small linear correlations with Pc3 ind (see Heilig et al., 2010).
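As a small worked example, the cone angle defined in the Introduction, ϑ B x = cos −1 (|B x |/B), can be computed from the three IMF components as follows. The function name and the choice of degrees as output unit are assumptions.

```python
import math


def cone_angle_deg(bx, by, bz):
    """Cone angle in degrees between the IMF and the Sun-Earth line."""
    magnitude = math.sqrt(bx * bx + by * by + bz * bz)
    if magnitude == 0.0:
        raise ValueError("cone angle is undefined for a zero field")
    # Clamp to 1.0 to guard against rounding slightly above unity.
    return math.degrees(math.acos(min(1.0, abs(bx) / magnitude)))
```

A purely radial field (B y = B z = 0) gives a cone angle of 0°, the configuration associated with the strongest Pc3 activity, while a field perpendicular to the Sun-Earth line gives 90°.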

Selection of model input parameters
The candidate parameters listed above are used in various combinations as the input parameters to the models with Pc3 ind as output. The goal of the development is to find the subset of input parameters that optimally predicts the output. This is achieved by comparing model output (predicted Pc3 ind ) and target output values (measured Pc3 ind ) for NN and MLR models with different sets of input parameters. Apart from the input parameters selected, all other configuration parameters, such as the number of training cycles, the learning rate (the factor by which NN weights are adapted) and the stopping criteria are kept constant.
Model fitness is quantified by the correlation coefficient between measured and predicted Pc3 ind . RMSE is calculated but not shown because the results are similar for both fitness metrics. The set of input parameters is selected through an iterative procedure: in the first round of development each network has one input parameter from the set of candidates. The input yielding the fittest model is identified, and in the second round of development each network has two input parameters: the identified parameter and one of the remaining candidates. The process continues in this fashion until all candidates are included or no improvement in performance is observed.
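The iterative procedure amounts to a greedy forward selection, which can be sketched as follows. The `score` callable stands in for training a model on a given subset of inputs and returning its fitness (here, the correlation between measured and predicted output); it and the function name are hypothetical placeholders, not part of the original workflow.

```python
def forward_select(candidates, score):
    """Greedy forward selection of inputs.

    `score(subset)` must return a fitness value where higher is better.
    Returns the selected inputs in order of selection and the final fitness.
    """
    selected, best_score = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        # One round: try each remaining candidate alongside the current set.
        trial_scores = {c: score(selected + [c]) for c in remaining}
        winner = max(trial_scores, key=trial_scores.get)
        if trial_scores[winner] <= best_score:
            break  # no candidate improves the model any further
        selected.append(winner)
        remaining.remove(winner)
        best_score = trial_scores[winner]
    return selected, best_score
```

In practice `score` would retrain the NN or refit the MLR model for each trial subset, so each round costs as many model fits as there are remaining candidates.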
The training process is illustrated by Table 3 and Fig. 3. The first eight columns of Table 3 indicate the model number and the input parameters. The last two columns indicate the correlation between measured and predicted output for MLR and NN. In the first round of training seven models (01-07) are trained, each with a different input parameter (indicated by the x in the table), and the predicted Pc3 ind from each network is gauged for fitness according to ρ. The NN and MLR models with V sw as input (model 2) are the fittest (ρ = 0.484 and 0.435), ahead of model 6 with ϑ B x as input. After the second round ϑ B x is added to the set of optimal input parameters (model 12), with correlations of 0.616 and 0.595 between measured and predicted output; model 9 yields the second best result (ρ = 0.552, 0.484). Round three of the development process adds N p to the set, and in the fourth round model 19 with V sw , ϑ B x , N p , and χ yields the best output (ρ = 0.72, 0.669). Performance increases as parameters are added to the models, yielding smaller improvements with every round of development (see Fig. 3). The addition of any of the three remaining candidate inputs (E, B or M A ) does not improve the performance of the models.
The optimal MLR model may be written in the product form introduced above (Eq. 4). The cosine of the cone angle yielded better results than using ϑ B x as input, and the 2 is added because negative values of log(cos ϑ B x ) are not defined (Heilig et al., 2010).

Results
In Fig. 4 the observed and predicted output from the winner of every round of development is plotted for days 353-355 of 2007. For this plot entire days are shown, not only intervals with χ < 90°. The model in the top-left panel has only V sw as input and the output from this model is clearly only responding to the slow variation in V sw . The higher-resolution component of Pc3 ind is not resolved at all. The addition of ϑ B x to the set of input parameters (model 12 in Table 3 and top-right panel of Fig. 4) clearly increases the model's response to short timescale variations in Pc3 ind . Adding N p to the set of inputs improves the resolution of some of the higher peaks in Pc3 ind , and the addition of χ enables the model to account for diurnal variation.

Comparison between high-resolution and low-resolution models
This study aims to improve on the 1 h resolution model developed previously by Heilig et al. (2010). Development of the low (1 h) resolution model followed a similar procedure whereby a set of SW-based parameters is selected through an iterative process from a larger set. The low-resolution model (Heilig et al., 2010) is not directly comparable to the high-resolution version because data sets were not restricted to local day time, only to Kp < 4.
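In order to compare our results with a low time resolution model, like the one developed by Heilig et al. (2010), we train a NN based on hourly averaged data with input parameters V sw , ϑ B x , N p , and χ. The training data set consists of all instances where Kp < 4 and χ < 90°, from 2002-2006. Data from 2007 are used for evaluation. Figure 5 shows measured and predicted output from the low-resolution (1 h averages) and high-resolution (5 min) models for day 140 of 2007. The correlation between measured and predicted output is 0.78 for the low-resolution model and 0.68 for the high-resolution model. Similar discrepancies between the measured and predicted values are observed in both models. Between 04:00 and 06:00 UT both models slightly overestimate the observed Pc3 ind . After the sharp increase in Pc3 ind observed at about 07:00 UT both models overestimate Pc3 ind and take longer to decrease to sub-50 pT levels than the observations. Between 13:00 and 14:00 UT the high-resolution model overestimates the width of the narrow peak in Pc3 ind ; the peak is averaged out in the low-resolution data.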
The obvious improvement that the high-resolution model offers is that the influence that ϑ B x has on Pc3 wave activity is more accurately resolved.
Inaccuracies in the estimates of wave propagation times from the spacecraft to the bow shock nose (calculated by OMNI) and through the magnetosheath and magnetosphere (described in Sect. 2.2) may contribute significantly to the reduced performance of the high-resolution model, compared to the low-resolution model. Errors will be significantly greater over short times (e.g. 5 min) than over longer periods (e.g. 1 h). Furthermore, Bier et al. (2014) showed that the cone angle values derived from OMNI may be inaccurate, especially when the IMF is oriented approximately parallel to the Sun-Earth line, when high Pc3 wave activity is expected.

Timescale of solar wind influence
During model development we see that the addition of ϑ B x to the set of inputs allows the model to resolve some of the rapid variations in Pc3 ind . This suggests that IMF direction (ϑ B x ) influences pulsation intensity at a shorter timescale than SW speed. Indeed this is expected, since variation in V sw simply happens over longer timescales than in ϑ B x . To quantify this we calculate the average rise and fall times of V sw and ϑ B x for the entire period 2002-2007. Rise time (τ R ) and fall time (τ F ) are characteristic timescales usually associated with voltage or current step functions in electronics. They are defined (Levine, 1996) as the time required for a signal to rise (fall) from a level x (y) to a level y (x), with x < y. We define x and y as the 10th and 90th percentile levels of V sw and ϑ B x . Missing values in V sw and ϑ B x are handled by linear interpolation.
Here the rise time is defined as the average time it takes a parameter to rise from below its 10th percentile level to above its 90th percentile level, and the fall time is the average time it takes to fall from above the 90th percentile level to below the 10th percentile level. One-minute time resolution data sets for each year (2002-2007) are used to calculate τ R and τ F . The 10th and 90th percentiles, and the rise and fall times of V sw and ϑ B x are listed in Table 4. The rise time of V sw is 1514 min and the fall time of V sw is 2777 min. Rise and fall times of ϑ B x are 38 and 40 min, respectively.
In order to quantify the difference in timescale of influence we compare moving averages of Pc3 ind at different window lengths with V sw and ϑ B x . Utilising the entire data set from 2003, we computed the correlation strength between Pc3 ind and ϑ B x at various timescales. Selection according to Kp (< 4) and χ (< 90°) is made only after the moving averages are calculated so that filter windows do not overlap with the gaps in time between selected intervals. The resulting correlation coefficients are shown as a solid black line in Fig. 6. The (negative) correlation first gets stronger moving from minute means to longer timescales, up to about half-hour means. For boxcar windows longer than 1 h the correlation strength (in absolute sense) rapidly decreases. This is in line with the approximately 40 min rise and fall times of ϑ B x calculated above. At longer timescales the typical variations in ϑ B x that enable Pc3 wave activity are smoothed out. Correlation between ϑ B x and Pc3 ind starts to increase again near the daily means, reaching a maximum at the 5-day window length.
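One way to compute the rise and fall times defined above is sketched below. It measures each transition from the last sample below the 10th percentile to the first subsequent sample above the 90th percentile (and the reverse for falls); this particular reading of the definition, the interpolated percentile estimate and the function names are assumptions. Gaps are assumed to have been filled by interpolation beforehand.

```python
def percentile(values, q):
    """Percentile q (0-100) with linear interpolation between sorted samples."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def mean_transition_time(series, is_start, is_end):
    """Mean number of samples from the last start-level sample to the next end-level sample."""
    durations, start = [], None
    for i, value in enumerate(series):
        if is_start(value):
            start = i  # keep updating: the last sample at the start level counts
        elif is_end(value) and start is not None:
            durations.append(i - start)
            start = None
    return sum(durations) / len(durations) if durations else float("nan")


def rise_and_fall_times(series, dt=1.0):
    """Return (rise_time, fall_time) in units of dt between the P10 and P90 levels."""
    p10, p90 = percentile(series, 10), percentile(series, 90)
    rise = mean_transition_time(series, lambda v: v < p10, lambda v: v > p90)
    fall = mean_transition_time(series, lambda v: v > p90, lambda v: v < p10)
    return rise * dt, fall * dt
```

With 1 min data and `dt=1.0` the results are in minutes, directly comparable to the values listed in Table 4.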
The optimal time lag, computed by using the cross-correlation between ϑ B x (from the OMNI2 data) and Pc3 ind for 2003, was empirically found to be 3 min, as described in Sect. 2.2. We apply this constant time shift to the 2003 data and recalculate the correlation coefficients at all timescales. At the shortest timescales the correlation was stronger, as expected; however, at longer scales there was no significant difference, as shown by the dotted black line in Fig. 6, overlapping with the solid/squares line.
The time-shift process applied to OMNI data smooths the original spacecraft measurements somewhat and hence the variance of the OMNI cone angle is smaller than what was measured at ACE, for example. For this reason the correlations were recalculated once more, this time using direct solar wind and interplanetary magnetic field measurements from the ACE satellite, shifted in time. Convection times were calculated based on the position of ACE, the position of the bow shock estimated from a model and the solar wind speed. All data points were shifted in time by the corresponding convection time, after which the time series was resampled at a 1 min sampling rate. Magnetospheric propagation was taken into account by another 3 min shift in time. As expected, the correlation became stronger at all timescales (except for the shortest). Although the technique to correct for the convection time described here retains more of the variance of the solar wind data than the OMNI data set, it may introduce large errors by not taking into account the orientation of the IMF when the IMF has a significant X component (e.g. Weimer and King, 2008), i.e. when the cone angle is low. This error affects the correlation, because of its dependence on the cone angle. Even so, the increase in correlation as a result of substituting time-shifted ACE data for the OMNI data clearly demonstrates the loss of information due to the smoothing applied to OMNI data.
Finally, the partial correlations between Pc3 ind and ϑ B x were also computed. Partial correlation gives the correlation between variables cleaned from the influence of other parameters (Heilig et al., 2010). The red dash-dotted line in Fig. 6 shows the correlation between Pc3 ind and ϑ B x with the influence of V sw and N p removed. The partial correlation was found to be the strongest near the 1-3 h timescales, and decreasing at larger timescales. In addition to errors in the estimation of propagation time through the magnetosheath and magnetosphere, we believe that the relatively lower correlation at the shortest timescales is the consequence of the derivation of the solar wind data, and that direct measurements at the bow shock nose would yield higher correlations. Indeed, Bier et al. (2014) showed that IMF orientation observed near L1 is not always the same as the orientation impacting on the bow shock nose, i.e. cone angle data derived from the OMNI data set are not 100 % accurate (the data set they analysed showed 80 % accuracy). At the largest scales the lower correlation follows from the fact that averaging over timescales much longer than the optimal length results in loss of information. Based on the above considerations a timescale of about 1 h seems to be the optimum for investigating the cone angle influence on magnetospheric processes (at least for this data set).
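Partial correlation can be illustrated with the standard recursive formula, in which the influence of each control variable is removed in turn. This is a sketch of the general technique, not necessarily the implementation used for Figs. 6-8; the function names and the synthetic data are assumptions. In the synthetic example two series share a common driver, so their raw correlation is substantial while their partial correlation, with the driver removed, is close to zero.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, controls):
    """Correlation between x and y with the linear influence of `controls` removed.

    Assumes no control is perfectly correlated with x or y.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_correlation(x, y, rest)
    r_xz = partial_correlation(x, z, rest)
    r_yz = partial_correlation(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Synthetic check: a and b are related only through the common driver.
rng = random.Random(1)
driver = [rng.gauss(0.0, 1.0) for _ in range(2000)]
a = [d + rng.gauss(0.0, 1.0) for d in driver]
b = [d + rng.gauss(0.0, 1.0) for d in driver]
raw = pearson(a, b)
partial = partial_correlation(a, b, [driver])
```

The recursion handles any number of controls, so removing the influence of V sw , N p and χ at once corresponds to passing all three series in the `controls` list.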
The correlation strength depends not only on the time resolution of the data, but also on the length of the data set for which the correlation is calculated (in the following part of this section all calculations were made using the entire 2002-2007 3 min shifted Pc3 ind and OMNI data set). This is clearly illustrated in Fig. 7, where the partial correlation coefficients of Tihany Pc3 ind and OMNI ϑ B x (influence of V sw , N p and χ removed) are shown as a function of timescale (horizontal axis) and correlation window length (vertical axis). The average partial correlation peaks when the correlation window is a few days long. The correlation is weaker when calculated for shorter or longer data sets. This behaviour is very different from the nature of the relation between Pc3 ind and solar wind speed (Fig. 8), with the influence of the other SW parameters and χ removed. SW speed has the strongest influence at the largest timescales. Moreover, the correlation is significant only when calculated for data sets longer than about a day, although the strength of the correlation does not increase much for longer data sets.

Conclusions
We created a high-resolution (5 min) model to estimate the intensity of Pc3 pulsations at a middle-latitude station (THY) from solar wind and local time parameters. A rigorous selection procedure is applied to select the most important input parameters to the model. Out of a set of candidate inputs, V sw , ϑ B x , N p and χ are selected. Models with these inputs achieve correlations of 0.72 (MLR) and 0.69 (NN) between measured and predicted output. The three upstream parameters are all influential in the excitation and downstream propagation of ULF waves in the region upstream of the bow shock, and their importance to the NN model suggests that upstream waves (UWs) are the dominant drivers of mid-latitude dayside Pc3 waves on the ground. This is in agreement with previous studies by Yumoto (1985), Verő (1980), Bier et al. (2014), Heilig et al. (2007) and Heilig et al. (2010), for example. Other extra-magnetospheric input parameters, such as the F10.7 flux, may be included in future models. Vellante et al. (2007) showed that solar radiance (measured by F10.7 flux) affects the amplitude of Pc3 waves, through the dependence of plasmaspheric mass density on F10.7 flux. Plasmaspheric mass density influences the penetration depth of upstream waves, and the resonant frequency of field lines through their integrated mass density and its field-aligned distribution (Vellante et al., 1996, 2007). Changes in the local resonant frequency affect the resonant coupling between the incoming compressional upstream waves and the Alfvén waves propagating along that field line. Globally, density variations change the redistribution of the Pc3 wave energy, causing amplitude variations locally. The dependence of amplitude on F10.7 flux reported by Vellante et al. (2007) occurs on timescales ranging from 1 day to 4 years, making this a useful parameter for modelling long-term variations in Pc3 wave amplitude.
The MLR and NN models yielded very similar results, with the same input parameters emerging from the modelling process. Apart from the slightly higher correlation between measured and predicted output, the MLR model yields a relatively simple relation between input and output (Eq. 4), whereas the equivalent NN model is much more complicated to write down due to the nature of the interaction between the different computational nodes. In this case, the MLR method is superior to NNs.
We show for the first time explicitly the timescale at which solar wind speed, density and IMF direction influence Pc3 wave activity. A comparison between moving averages of the solar wind parameters, at several different window widths, and Pc3 ind is made. It shows that V sw has the highest correlation with Pc3 ind at timescales of about 2 days, while ϑ B x maximally influences Pc3 ind at the 1 h timescale. This is explained by the two orders of magnitude difference in the rise and fall times of V sw and ϑ B x .

Figure 1. Correlation between Pc3 ind and cone angle for 2003. The top panel shows the number of cases binned by correlation coefficient. The bottom panels show the number of cases for positive (bottom right) and negative (bottom left) correlations.

Figure 2. Distribution of Pc3 ind for the selection criteria listed in the legend. Vertical lines and numbers indicate the median of each selection.

Figure 3. The correlation coefficient for each NN (solid line) and MLR (dashed line) model. The winners of each round of training are indicated with squares. The values are listed in Table 3.

Figure 4. Measured (black) Pc3 ind and the corresponding predictions (in red) by networks 02 (V sw ), 12 (V sw , ϑ B x ), 15 (V sw , ϑ B x , N p ) and 19 (V sw , ϑ B x , N p , χ) for days 353-355 of 2007. These are the fittest models in each round of training (see Table 3). Input parameters for each model are listed in brackets.

Figure 5. Measured and predicted Pc3 ind from the low-resolution (1 h) and the high-resolution (5 min) models.

Figure 6. Correlation between Pc3 ind and ϑ B x at different timescales, for unshifted and shifted OMNI2 data and for shifted ACE data. The red line shows partial correlation between ACE ϑ B x and THY Pc3 ind .

Figure 7. Correlation between Pc3 ind and ϑ B x (colour scale) over different timescales (horizontal axis) and data set lengths (vertical axis).

Figure 8. Correlation between Pc3 ind and V sw (colour scale) over different timescales (horizontal axis) and data set lengths (vertical axis).

Table 1. Correlation between Pc3 ind and cone angle (ϑ B x ) with (t − Δt) and without (t) the propagation time delay, calculated in Sect. 2.2, applied.
year ϑ B x (t) ϑ B x (t − Δt)

Table 3. Model performance during the wrapper process. Every model (NN and MLR) has a different set of input parameters, marked with x. The correlations ρ between measured and predicted output for the NN- and MLR-based models are listed in the last two columns. The fittest models in each round are numbers 2, 12, 15 and 19 (highlighted with bold font).
Model # χ V sw N p E B ϑ B x M A ρ MLR

Table 4. Mean rise and fall times of V sw and ϑ B x for 2002-2007. The 10th and 90th percentiles of V sw and ϑ B x are indicated by P 10 and P 90 .