neural network mapping techniques

A regional reference model of total electron content (TEC) was constructed using data from the GPS Earth Observation Network (GEONET), which consists of more than 1000 Global Positioning System (GPS) receivers distributed over Japan. The data covered almost one solar activity cycle, from April 1997 to June 2007. First, TECs were determined for 32 grid points, extending from 27 to 45° N in latitude and from 127 to 145° E in longitude, at 15-min intervals. Second, the time-latitude variation averaged over three days was determined by using the surface harmonic functional expansion. The coefficients of the expansion were then modeled by using a neural network technique with input parameters of the season (day of the year) and solar activity (F10.7 index and sunspot number). Thus, two-dimensional TEC maps (time vs. latitude) can be obtained for any given set of solar activity and day of the year.


Introduction
One of the important effects of the ionosphere on radio waves is a propagation delay. This delay depends on the frequency and the total electron content (TEC) along the propagation path. Several attempts have been made to specify the ionospheric electron density using theoretical (see Anderson et al., 1998) and empirical approaches. Considerable effort has resulted in the continuous development and improvement of the International Reference Ionosphere (IRI) (Bilitza, 2001), which describes the density at various heights for any specified geophysical conditions, based on long-term observations. TEC can be derived by integrating the height profile of the electron density. The contribution of the plasmaspheric electron density to TEC cannot be neglected (Gulyaeva and Titheridge, 2006; Gulyaeva and Gallagher, 2007; Cueto et al., 2007; Reinisch et al., 2007). However, observations of the electron density to construct a reliable model are limited in the topside ionosphere and plasmasphere compared with the bottomside and the F-layer peak (Bilitza and Williamson, 2000).
Correspondence to: T. Maruyama (tmaru@nict.go.jp)
Direct measurements of TEC using radio waves transmitted from the Global Positioning System (GPS) satellites have been collected in this decade. Thus, GPS-based TEC data are now available to construct empirical models of TEC. Meanwhile, artificial neural network (NN) techniques have been applied to a variety of topics in the study of the upper atmosphere. Multilayer feed-forward networks (Rumelhart et al., 1986; Haykin, 1994) are used to specify the ionosphere by approximating a relationship between geophysical conditions (seasons, solar activities, local times, longitude/latitude, etc.) and observed ionospheric parameters (foF2, h'F2, hmF2, etc.) (Williscroft and Poole, 1996; McKinnell and Poole, 2004; Oyeyemi et al., 2005), for short-term forecasting of ionospheric conditions (Altinay et al., 1997; Cander et al., 1998; Kumluca et al., 1999; Wintoft and Cander, 2000; Poole and McKinnell, 2000; Oyeyemi et al., 2006), and for long-term trend analyses (Poole and Poole, 2002; Yue et al., 2006). Because of the input-output mapping features of NNs, they could be used to generate reference ionospheric models for possible incorporation into the IRI (McKinnell and Friedrich, 2007). For this purpose, the training data set must cover the whole range of possible input parameter variations, that is, a data period longer than one solar cycle.
In Japan, a dense GPS receiver network, GEONET (GPS Earth Observation Network), has been developed, and data from more than 1000 locations have been available since April 1997, close to a solar minimum. An algorithm that simultaneously determines satellite/receiver biases and vertical TEC using GEONET data has been developed by Ma and Maruyama (2003). By using this algorithm, we constructed a TEC database that nearly covered one solar cycle from 1997 to 2007. The data set used to develop the regional reference TEC model and the NN technique are described in Sect. 2. The performance of the trained network is evaluated in Sect. 3. Section 4 summarizes the work.

Data set and methodology
About 300 GEONET receivers were chosen for this study to ensure uniform coverage over Japan. The major issues in deriving TECs from GPS radio signal observations are the instrumental biases in both the satellites and the receivers, and the conversion from the TECs observed along the slant path to vertical ones. In this paper, the slant TECs were converted to vertical TECs (vTEC) at the piercing point where the ray path crossed a shell at a height of 400 km (thin shell model). Assuming that the vTEC within a small cell of 2°×2° in longitude and latitude was uniform over a short period of time, we calculated the daily instrumental biases and the vTEC in each cell at 15-min intervals, using least-squares fitting to a data set that covered 24 h. More details of the method are described elsewhere (Ma and Maruyama, 2003). The vertical TEC obtained in this way is referred to as the grid TEC (gTEC). The original TEC grid consisted of 32 grid points, as shown in Fig. 1 (the two southernmost grid points were not used when constructing the actual model because gaps of a few hours occurred in the data, depending on changes in the satellite constellation).
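The slant-to-vertical conversion at the piercing point can be illustrated with the commonly used single-layer mapping function. This is a minimal sketch, not the procedure of Ma and Maruyama (2003): the mapping formula, the Earth radius value, and the function name `slant_to_vertical` are assumptions introduced here; only the 400 km shell height comes from the text.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed value, not from the text)
SHELL_HEIGHT_KM = 400.0    # thin-shell height stated in the text


def slant_to_vertical(stec, elevation_deg,
                      shell_km=SHELL_HEIGHT_KM, re_km=EARTH_RADIUS_KM):
    """Convert slant TEC to vertical TEC at the ionospheric piercing point.

    Uses the standard single-layer mapping (an assumption here):
    vTEC = sTEC * cos(z'), with sin(z') = Re / (Re + h) * sin(z),
    where z is the zenith angle of the ray at the receiver.
    """
    zenith = math.radians(90.0 - elevation_deg)
    sin_zp = re_km / (re_km + shell_km) * math.sin(zenith)
    return stec * math.sqrt(1.0 - sin_zp ** 2)


# A ray at 30 degrees elevation: the vertical value is smaller than the slant one.
print(slant_to_vertical(50.0, 30.0))
```

The instrumental biases would have to be removed from the slant values before this conversion; the sketch leaves that step out.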
The major factors that determine the TEC are the solar activity, season, local time, and geographic and geomagnetic coordinates. Our process to generate a model consists of three steps:
Step 1 is the procedure described in the previous paragraph in which the gTECs at the grid points are determined at 15-min intervals.
Step 2 is the procedure where variations in time and latitude are expressed as a two-dimensional distribution map for three consecutive days. Because the data grid is limited to a narrow longitudinal extent and geomagnetic conditions do not change greatly among the east-west aligned grids, the longitudinal dependence is assumed to be equivalent to the local mean time (LMT) in this step.
Step 3 is the procedure where the solar activity and seasonal changes in the TEC map are modeled by using a neural network technique.
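The assumption in step 2 that longitude can be absorbed into local mean time can be sketched as follows. This is an illustration under the usual convention of 15° of longitude per hour; the function name `local_mean_time` is hypothetical.

```python
def local_mean_time(ut_hours, longitude_deg_east):
    """Local mean time in hours, wrapped to [0, 24).

    Assumes the usual convention of one hour per 15 degrees of east longitude.
    """
    return (ut_hours + longitude_deg_east / 15.0) % 24.0


# Grid points at 127 E and 145 E observed at the same universal time
# fall at different local mean times along the time axis of the map.
for lon in (127.0, 135.0, 145.0):
    print(lon, local_mean_time(3.0, lon))
```

With this mapping, the east-west aligned grid points at a given latitude contribute samples at slightly different LMTs rather than forming a separate longitude dimension.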
To generate TEC maps from the gTECs, we used the surface harmonic expansion method based on the associated Legendre functions, Eq. (1), in which θ is the colatitude and the LMT (hours) at each grid point is taken as the azimuth parameter, i.e. φ = 2π(LMT/24). We took the degree and order up to 7 (N = M = 7). As the grid points are distributed only at northern mid-latitudes, dummy data were set, for mathematical convenience, in the Southern Hemisphere as a mirror image of the Northern Hemisphere with respect to the equator. The functional fitting was performed to determine the coefficients A_nm and B_nm in Eq. (1). Because dummy data were set in the Southern Hemisphere, the global distribution map is symmetrical with respect to the equator (resultant map data outside the latitude range from 29 to 45° N were disregarded). Consequently, the coefficients A_nm and B_nm with odd n+m are equal to zero. Thus, a total of 36 target parameters needed to be determined.
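The count of 36 coefficients can be checked with a short sketch. Equation (1) is not reproduced in this text, so the code assumes the standard surface harmonic form TEC(θ, φ) = Σ_n Σ_m [A_nm cos(mφ) + B_nm sin(mφ)] P_n^m(cos θ) with n = 0..7 and m = 0..n; the normalization of P_n^m and all function names are assumptions made for illustration.

```python
import math

N_MAX = 7  # maximum degree and order stated in the text (N = M = 7)


def assoc_legendre(n, m, x):
    """Unnormalized associated Legendre function P_n^m(x) by upward recurrence."""
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):          # builds P_m^m (Condon-Shortley phase)
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    prev, curr = pmm, x * (2 * m + 1) * pmm   # P_m^m and P_{m+1}^m
    for k in range(m + 2, n + 1):
        prev, curr = curr, (x * (2 * k - 1) * curr - (k + m - 1) * prev) / (k - m)
    return curr


def retained_terms(n_max=N_MAX):
    """List ('A' or 'B', n, m) for coefficients that survive the mirror symmetry."""
    terms = []
    for n in range(n_max + 1):
        for m in range(n + 1):
            if (n + m) % 2 == 0:        # odd n + m terms are antisymmetric
                terms.append(("A", n, m))
                if m > 0:               # sin(0 * phi) = 0, so B_n0 is not needed
                    terms.append(("B", n, m))
    return terms


def tec_from_coeffs(coeffs, lat_deg, lmt_hours):
    """Evaluate the assumed expansion at a latitude and local mean time."""
    theta = math.radians(90.0 - lat_deg)       # colatitude
    phi = 2.0 * math.pi * lmt_hours / 24.0     # azimuth from LMT, as in the text
    total = 0.0
    for (kind, n, m), c in zip(retained_terms(), coeffs):
        trig = math.cos(m * phi) if kind == "A" else math.sin(m * phi)
        total += c * trig * assoc_legendre(n, m, math.cos(theta))
    return total


print(len(retained_terms()))  # prints 36
```

The parity relation P_n^m(−x) = (−1)^(n+m) P_n^m(x) is what makes the odd n+m terms vanish for a map mirrored about the equator: 20 coefficients A_nm and 16 coefficients B_nm remain.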
The solar activity expressed by two proxies, the F10.7 solar flux (1 sfu = 10^-22 W m^-2 Hz^-1) and the sunspot number, R, for the whole data period is shown in Fig. 2. The figure shows that both proxy parameters vary in a similar way but are not exactly the same. For example, from the latter half of 2001 to 2002, the solar flux reached a maximum, but the sunspot number reached a maximum in 2000. Thus, both parameters were included in the input parameters of the network. To successfully separate the seasonal and solar activity dependence of the TEC after training a neural network, the combination of both parameters must be homogeneously distributed in the training data set. Figure 3 shows the seasonal distribution of the solar activity proxies from 1997 to 2007, which indicates a homogeneous distribution in the range between 75 and 200 for F10.7 and between 0 and 150 for R. Within this range, a neural network is expected to separate the seasonal and solar activity dependence of the TEC.
We adopted the multilayer feed-forward network (Rumelhart et al., 1986; Haykin, 1994) that consisted of the input layer, one hidden layer, and the output layer. The schematic diagram of the network is shown in Fig. 4. The input layer had 8 nodes: six for solar activity (F10.7 and R, each averaged over three days, a week, and three solar rotations (81 days), where each averaging window comprises the days for which the TEC was specified and the days prior to them) and two for season (sine and cosine components of the day of the year). The number of nodes in the hidden layer was chosen to be 200. The output parameters were the 36 Legendre coefficients described above. For the first several tenths of the epochs in the back-propagation learning process, weight updating was performed in the pattern mode, in which weights were updated after the presentation of each training example (Haykin, 1994). The order of the presentation of training examples was randomized from one epoch to the next. After the weights were coarsely determined, weight updating was continued in the batch mode, in which weights were updated after the presentation of all the training examples that constituted an epoch (Haykin, 1994).
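The composition of the 8-element input vector can be sketched as below. This is an illustration only: the trailing windows are anchored on a single reference day, the year length of 365.25 days is an assumed divisor, no input scaling is applied, and the names `trailing_mean` and `network_inputs` are hypothetical.

```python
import math


def trailing_mean(series, day_index, window):
    """Mean over `window` days ending at (and including) `day_index`."""
    start = max(0, day_index - window + 1)
    chunk = series[start:day_index + 1]
    return sum(chunk) / len(chunk)


def network_inputs(f107, sunspot, day_index, day_of_year, days_in_year=365.25):
    """Assemble 8 inputs: 6 solar-activity averages and 2 seasonal terms."""
    windows = (3, 7, 81)   # three days, a week, three solar rotations
    solar = [trailing_mean(series, day_index, w)
             for series in (f107, sunspot) for w in windows]
    angle = 2.0 * math.pi * day_of_year / days_in_year
    return solar + [math.sin(angle), math.cos(angle)]


# Constant daily series used only to show the layout of the vector.
f107 = [100.0] * 100
sunspot = [50.0] * 100
print(network_inputs(f107, sunspot, day_index=99, day_of_year=80))
```

Encoding the day of the year as a sine-cosine pair keeps 31 December and 1 January adjacent in input space, which a single day-number input would not.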

Results
To evaluate the performance of the neural network mapping, we ran the network learning on the data set, excluding a partial data set for 2003. After the learning was completed, the network outputs were compared with observations for 2003. The results for the four typical seasons are shown in Figs. 5 to 8. The upper panel of Fig. 5 is the TEC distribution map (local mean time vs. latitude) for the March equinox 2003, reconstructed using the coefficient vector predicted by the network. The lower panel of Fig. 5 is the mean TEC distribution map generated using observations over the 27-day period centered on the March equinox.
Figure 6 is the same as Fig. 5, except for the June solstice. The two maps have quite similar diurnal and latitudinal patterns and absolute TEC values. A distinct feature of this season was the formation of three diurnal peaks of TEC and a small day-night difference. The morning peaks are centered at 07:30 LT in both maps, even though the amplitude is small for the observed 27-day means (lower panel). The afternoon peaks occur slightly earlier in the network-predicted map (upper panel) than in the observed map. However, the trend toward an earlier appearance at higher latitudes is common to both maps. The times of the evening peaks are almost the same for both maps. Figures 7 and 8 are the same as Fig. 5, but they are for the September equinox and the December solstice, respectively. In these figures, the network-predicted (upper panels) and the observed (lower panels) maps are very similar in many features and absolute values.
For more comparisons, the TECs at noon at grid point 14 (35° N, 137° E) near central Japan, as denoted by the asterisk in Fig. 1, were calculated and are shown in Fig. 9. The top panel shows the solar activity inputs, F10.7 (upper traces) and R (lower traces), averaged over three days (dots connected with thin lines) and three solar rotations (solid lines). The middle panel shows the grid TECs (circles connected with a thin line) and the network outputs (thick solid line). Not only seasonal variations, but also solar activity dependences are reproduced by the network. Interestingly, at the end of October, the solar flux was quite large, but the grid TEC did not increase very much. This was well reproduced by the network. When the solar activity indices averaged over three solar rotations were not incorporated into the input parameters, this moderate increase in TEC against the extremely intense solar activity at the end of October was not successfully reproduced. On the other hand, the enhanced grid TEC found in the latter half of May was not reproduced by the network. In this period, the solar activity was lower than in the solar rotations before and after it. This suggests that the solar activity indices we used are not entirely accurate proxies of solar EUV flux.
In our model, TEC variations associated with magnetic storms were not considered, although magnetic activity strongly affects TEC values. Discrepancies between the observed TECs and the network predictions, as seen in the middle panel of Fig. 9, could be partly due to TEC variations caused by magnetic storms. The bottom panel of Fig. 9 shows the Ap index. Corresponding to the geomagnetic disturbances on 29 May, 17 September, and 14 October, the observed TEC values fell below the network predictions, as denoted by the asterisks. On 18 August, in contrast, the observed TEC value was much larger than the network prediction. On the other hand, the two largest storms, on 29 October (the Halloween storm) and 20 November, did not cause large TEC disturbances in the Japanese sector. In the February-April period, no clear correspondence was found between the large TEC discrepancy and magnetic disturbances. Thus, the Ap index alone is not sufficient to predict storm effects on TEC, and incorporating storm effects into the model remains a task for future work.
The overall performance of the constructed model over a year is shown in Fig. 10. The left panel is a scatter diagram of the hourly values of the network outputs for grid point 14 (35° N, 137° E) against the corresponding TECs obtained from a data set similar to that used in network training, i.e. spherical harmonic functional fitting of three consecutive days (step 2) in 2003. Thus, this gives the performance of the network learning only (step 3). In this comparison, the root mean square error (RMSE) was 3.29 TEC units. The right panel is a scatter diagram of the predicted hourly values against the grid TEC (step 1) at the same grid point. The RMSE, 4.52 TEC units, is slightly higher than that for the functional fitting results, because the functional fitting averaged out unpredictable day-to-day variability, including magnetic disturbance effects, over three days.
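The RMSE quoted for both comparisons follows the usual definition, sketched here for paired hourly values. The function name `rmse` and the sample numbers are illustrative only and are not taken from the data set.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between paired predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))


# Illustrative values in TEC units (not real data).
network_output = [20.0, 25.0, 30.0, 28.0]
grid_tec = [22.0, 24.0, 33.0, 27.0]
print(rmse(network_output, grid_tec))
```

Applying the same function once against the functionally fitted TEC and once against the raw grid TEC would separate the error of step 3 alone from the combined error of steps 2 and 3.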

Summary
A large amount of GPS-derived total electron content data has been collected in this decade, covering almost one solar cycle, which now allows an empirical model of TEC to be constructed. We constructed a regional reference TEC model over Japan based on the dense GPS receiver network, GEONET. The process consisted of three steps: (1) determining vertical TECs at grid points separated by 2° in latitude and longitude (gTECs), (2) approximating the gTECs by surface harmonic functional fitting (time-latitude maps), and (3) using neural network mapping to relate the solar activity and season to the pattern of the time-latitude map. In the first step, instrumental biases were simultaneously determined and vTECs were averaged in 2°×2° longitude/latitude cells. Averaging and smoothing were also performed in step 2 by using a limited degree and order of the surface harmonic function to approximate the gTECs over three days. Step 3 successfully separated the solar activity and seasonal dependences of the TEC distribution pattern with respect to time and latitude.

Fig. 2. Observed daily solar flux (upper trace with left scale) and daily sunspot number (lower trace with right scale) from April 1997 to June 2007.

Fig. 3. Seasonal distribution of observed daily solar flux (upper panel) and daily sunspot number (lower panel) from April 1997 to June 2007.

Fig. 5. Local time-latitude distribution of TEC; (a) predicted by trained NN for March equinox in 2003 and (b) mean TEC averaged over a 27-day period centered on March equinox 2003 constructed using gTEC data.

Fig. 10. Total performance of NN prediction. Scatter plots of NN-predicted hourly TEC for 2003 vs. TEC after functional fitting (a) and grid TEC before functional fitting (b).