Earth's ionosphere is an important medium for radio wave propagation in modern times. However, the effective use of the ionosphere depends on the understanding of its spatiotemporal variability. Towards this end, a number of ground- and space-based monitoring facilities have been set up over the years. The information from these stations has also been complemented by model-based studies. However, assessment of the performance of ionospheric models in capturing observations needs to be conducted. In this work, the performance of the IRI-2016 model in simulating the total electron content (TEC) observed by a network of Global Positioning System (GPS) receivers is evaluated based on the root-mean-square error (RMSE), the bias, the mean absolute error (MAE) and its skill score, the normalized mean bias factor (NMBF), the normalized mean absolute error factor (NMAEF), the correlation, and categorical metrics such as the quantile probability of detection (QPOD), the quantile categorical miss (QCM), and the quantile critical success index (QCSI). The IRI-2016 model simulations are evaluated against gridded International Global Navigation Satellite System (GNSS) Service (IGS) GPS-TEC and TEC observations at a network of GPS receiver stations during the solar minimum in 2008 and the solar maximum in 2013. The phases of the modeled and measured TEC time series agree strongly over most of the globe, as indicated by high correlations during both solar activity periods, with the exception of the polar regions. In addition, lower RMSE, MAE, and bias values are observed between the modeled and measured TEC values during the solar minimum than during the solar maximum from both sets of observations. The model performance is also found to vary with season, longitude, solar zenith angle, and magnetic local time.
These variations in the model skill arise from differences between seasons with respect to solar irradiance, the direction of neutral meridional winds, neutral composition, and the longitudinal dependence of tidally induced wave number four structures. Moreover, the variation in model performance as a function of solar zenith angle and magnetic local time might be linked to the accuracy of the ionospheric parameters used to characterize both the bottom- and topside ionospheres. However, when the NMBF and NMAEF are applied to the data sets from the two distinct solar activity periods, the difference in the skill of the model during the two periods decreases, suggesting that the traditional model evaluation metrics exaggerate the difference in model skill. Moreover, the performance of the model in capturing the highest ends of extreme values over the geomagnetic equator, midlatitudes, and high latitudes is poor, as noted from the decrease in the QPOD and QCSI as well as an increase in the QCM over most of the globe with an increase in the threshold percentile TEC values from 10 % to 90 % during both the solar minimum and the solar maximum periods. The performance of IRI-2016 in simulating observed low (as low as the 10th percentile) and high (higher than the 90th percentile) TEC correctly over equatorial ionization anomaly (EIA) crest regions is reasonably good given that IRI-2016 is a climatological model. However, it is worth noting that the performance of the IRI-2016 model is relatively poor in 2013 compared with 2008 at the highest ends of the TEC distribution. Therefore, this study reveals the strengths and weaknesses of the IRI-2016 model in simulating the observed TEC distribution correctly during all seasons and solar activities for the first time.
Radio waves have become an indispensable and spectacular tool in the progress of space satellite communication and navigation, and the Earth's ionosphere
is an essential medium for the propagation of radio wave signals
Historically, as an early part of efforts to understand the Earth's upper ionosphere, the first satellites' radio measurements recorded some crucial results with respect to
the upper atmosphere. Measurements of the orbital period using radio locations revealed that temperature and density show high discrepancies in the upper
atmosphere
IRI, one of the empirical modeling tools currently available to the wider scientific community, portrays the spatial and temporal variability of the ionosphere for a
specific solar variability
Important advances in the IRI-2016 model version have been made based on ground- and space-based observations (e.g., ionosonde and radio occultation). The major changes
include two new model options for the F2-peak height
Moreover, the RMSE,
bias, and MAE, which are absolute measures, may not be suitable when comparing quantities with different orders of magnitude of background values. As a result, the comparison of model
performance in simulating TEC during solar maximum and minimum periods based on these metrics alone is a serious concern due to the fact that the background TEC during a solar
maximum is higher than that of a solar minimum period by a large margin. In this case, relative
measures are usually preferred when comparing the performance of models. Traditionally, most relative differences are normalized by
the observed quantities. Nevertheless, there are also concerns associated with this approach to normalization that can result in misleading conclusions. These concerns
are asymmetry and the inflation of relative metrics. The values can be greatly inflated by a few instances in which the observed quantity in the denominator of the expression is
quite low relative to the bulk of the observations. Recently, these cases have prompted the definition of new, symmetric, unbiased metrics of model performance – namely the
normalized mean bias factor (NMBF) and the normalized mean absolute error factor (NMAEF) –
that may be suitable for the evaluation of the skill of models
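The inflation problem and the symmetric factors can be illustrated with a minimal sketch. It assumes the form commonly given for these two factors, in which the normalization switches between the observed and the modeled sums depending on whether the model overestimates or underestimates on average; the function names and the sample values are hypothetical and are not taken from this study.

```python
import numpy as np


def nmbf(model, obs):
    """Normalized mean bias factor (assumed form, see lead-in).

    Positive values mean overestimation by a factor of (1 + NMBF);
    negative values mean underestimation by a factor of (1 - NMBF).
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    m_sum, o_sum = model.sum(), obs.sum()
    if m_sum >= o_sum:
        return m_sum / o_sum - 1.0
    return 1.0 - o_sum / m_sum


def nmaef(model, obs):
    """Normalized mean absolute error factor (assumed form, see lead-in)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    abs_err = np.abs(model - obs).sum()
    # Normalize by whichever sum is smaller on average, mirroring nmbf().
    denom = obs.sum() if model.sum() >= obs.sum() else model.sum()
    return abs_err / denom


def mean_normalized_bias(model, obs):
    """Traditional relative bias: each difference is divided by its observation."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((model - obs) / obs))


# Hypothetical TEC-like values (TECU) with one very low observation.
obs = np.array([0.1, 10.0, 10.0, 10.0])
model = np.array([2.0, 10.0, 10.0, 10.0])
print("mean normalized bias:", mean_normalized_bias(model, obs))
print("NMBF:", nmbf(model, obs))
print("NMAEF:", nmaef(model, obs))
```

In this toy case a single low observation dominates the traditional relative bias, whereas the factor-based metrics remain small. A model that doubles the observations and one that halves them also receive factors of equal magnitude and opposite sign, which is the symmetry referred to above.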
Therefore, this paper focuses on the comprehensive global validation of the IRI-2016 model regarding its skill to simulate seasonal and annual TEC variations observed
by a network of ground-based GPS receivers run by the International Global Navigation Satellite System (GNSS) Service (IGS) using common statistical metrics, recently defined
relative metrics (e.g., NMBF, NMAEF), and quantile-based categorical metrics. To our knowledge, there is no comprehensive and global evaluation of IRI-2016 that
includes detailed analysis at the tails of the TEC distribution and addresses problems that arise in the evaluation of
model performance in predicting time series with different scales of background TEC. The paper is organized as follows: Sect.
The TEC data extracted at a grid resolution of 5
TEC data are simulated using IRI-2016 as a function of universal time and geographical grids; this matches the spatiotemporal grids of observed IGS
GPS-TEC for the 2 selected years. Moreover, the model is used to generate IRI-TEC at the sites of the individual GPS receivers shown in Fig.
Distribution of the GPS receiver stations used to derive the ungridded TEC used in the model evaluation in addition to the IGS GPS-TEC.
The Dst index represents the axially symmetric disturbance magnetic field at the dipole equator on the Earth's surface. Major disturbances in Dst are
negative, representing a decrease in the geomagnetic field. Therefore, days with a Dst value greater than
The comparison of TEC from the IRI-2016 model with GPS measurements is evaluated based on the RMSE, the bias, the MAE and its skill
score (MAE
Furthermore, it is suitable to define model skill with respect to the reference model. To determine the model skill in connection with the
reference model, the MAE for the reference model is first calculated with respect to the observations as follows:
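One common way to write this step, assumed here rather than taken from the text, is to let the reference model predict the mean of the observations at every time, compute its MAE against the observations, and define the skill score as one minus the ratio of the model MAE to the reference MAE. The sketch below follows that assumption; the function names are hypothetical.

```python
import numpy as np


def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean(np.abs(pred - obs)))


def mae_skill_score(model, obs):
    """MAE-based skill score relative to a mean-only reference model.

    Assumed form: 1 - MAE_model / MAE_ref, where the reference model
    returns the mean of the observations everywhere. A score of 1 is a
    perfect model, 0 matches the reference, and negative values are
    worse than the reference.
    """
    obs = np.asarray(obs, dtype=float)
    reference = np.full_like(obs, obs.mean())
    mae_ref = mae(reference, obs)
    return 1.0 - mae(model, obs) / mae_ref


# Hypothetical TEC values (TECU), for illustration only.
obs = np.array([10.0, 20.0, 30.0, 40.0])
model = obs + 2.0
print("MAE skill score:", mae_skill_score(model, obs))
```

Under this definition the score is undefined when all observations are identical, because the reference MAE is then zero; a production implementation would need to guard against that case.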
The NMBF is computed from
It is important to assess whether
the model captures the diurnal and seasonal cycles of observed TEC in addition to the agreement in the TEC values between the model and the observations. The Pearson correlation (
The categorical statistics employed in this study aim to evaluate the extent to which the simulation captures the distribution of the observed GPS-TEC above certain
selected thresholds. As the IRI model is an empirical model based mainly on past observations, it is natural to expect that its performance at the extreme ends of the
observed distribution may suffer from large inaccuracies. However, the extent of this discrepancy at the extreme ends of the observed TEC distribution has not been fully assessed. Therefore, categorical statistics such as the QPOD, QCM, and QCSI are employed to evaluate the performance of the IRI-2016 model in simulating the whole range
of observed distributions from extremely low to extremely high TECs. The QPOD defines part of the observations (
The QCM may be defined as 1
There are also other categorical metrics (e.g., QFAR) to assess model performance, but only the abovementioned metrics are used for brevity. Both the numerical (continuous) and the categorical statistics are used to assess the model skill in capturing the individual observations within the selected calendar months of March, June, September, and December in the case of the comparison between IRI-2016 and the GPS receiver station-level TEC and within seasons in the case of the comparison between the gridded IGS GPS-TEC and the IRI-2016 model during the solar minimum (2008) and maximum (2013; taken from the window of solar maximum years 2012–2014).
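The quantile-based categorical metrics can be sketched as follows. This is an illustration under stated assumptions, not the exact formulation used in this study: the threshold is taken as a percentile of the observed TEC, an event is a value at or above the threshold for the upper tail (at or below it for the lower tail), and the metrics follow the usual contingency-table forms, namely QPOD = hits / (hits + misses), QCM = misses / (hits + misses), and QCSI = hits / (hits + misses + false alarms). The function name and the `upper` flag are hypothetical.

```python
import numpy as np


def quantile_categorical(model, obs, percentile, upper=True):
    """Return QPOD, QCM, and QCSI for one percentile threshold.

    Assumptions (see lead-in): the threshold is a percentile of the
    observations, and events are defined relative to that threshold.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    threshold = np.percentile(obs, percentile)

    if upper:
        obs_event = obs >= threshold
        mod_event = model >= threshold
    else:
        obs_event = obs <= threshold
        mod_event = model <= threshold

    hits = int(np.sum(obs_event & mod_event))
    misses = int(np.sum(obs_event & ~mod_event))
    false_alarms = int(np.sum(~obs_event & mod_event))

    # hits + misses is the number of observed events, which is at least one
    # because the observed maximum (minimum) always meets the threshold.
    return {
        "QPOD": hits / (hits + misses),
        "QCM": misses / (hits + misses),
        "QCSI": hits / (hits + misses + false_alarms),
    }


# Hypothetical TEC values (TECU), for illustration only.
obs = np.arange(1.0, 11.0)
model = obs + 1.0
print(quantile_categorical(model, obs, 90, upper=True))
print(quantile_categorical(model, obs, 10, upper=False))
```

With these definitions QPOD and QCM sum to one, so QCM carries no information beyond QPOD, while QCSI additionally penalizes false alarms.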
Figure
The correlation, RMSE, bias, MAE, and MAE
The IRI-2016 TEC is biased high by up to 10 TECU over tropics with respect to GPS-TEC during 2008, whereas it is almost 2-fold higher in 2013 over the same region. A small negative
bias is observed poleward of approximately 25
The same as Fig.
The skill score of the model with respect to the reference model that simulates mean TEC is assessed based on the MAE. The result shows that the IRI-2016 model performs
better than the reference model at a few latitudes in the tropics (Fig.
Latitudinally averaged measurement errors during the calendar months shown in the legend for the two solar activity periods.
Figure
The significance of the departure of the modeled TECs from observations can also be appreciated in the context of GPS observation errors. Figure
Figures
Scatter plots of TECs from IRI-2016 and GPS at selected longitudes for the solar minimum in 2008
Figure
Overall, the IRI-2016 TEC at the four selected longitudes is
biased low (0.9 to 1.4 TECU) against GPS-TEC during 2008 with the exception of 30
The same as Fig.
Moreover, the comparison between the model and the gridded GPS-TECs is extended to include the whole global region as given in Fig.
The bias, RMSE, MAE-based skill score, and correlations between IRI-2016 and the gridded GPS-TECs as a function of daytime solar
zenith angle
In previous sections, the comparisons were based on either individual data within a given calendar month (Sect.
The seasonal mean TEC obtained from the gridded IGS GPS-TEC (the eight left panels) and simulated by IRI-2016 (the eight right panels) for the March equinox (M-Eq), the June solstice (J-Sol), the September equinox (S-Eq), and the December solstice (D-Sol) during the solar minima year 2008 (top row) and the solar maxima year 2013 (bottom row).
Correlation between IRI-2016 and IGS GPS-TEC for the four seasons during 2008
In order to appreciate the NMBF and NMAEF values between the model and IGS GPS-TECs, the observed seasonal
mean TECs and mean simulated TEC during the 2008 and 2013 seasons are given in Fig.
In 2008 (the solar minimum period), the correlations are generally high in all seasons over most of the globe with the exception of the March and September equinoctial months
and the June solstice over the southern Atlantic Ocean, the Pacific Ocean, and the polar regions, which had a low correlation between IRI-TEC and GPS-TEC (Fig.
The same as Fig.
The same as Fig.
Generally, the NMBF implies that the IRI-2016 overestimates the observed TEC by a maximum factor of 1.4 within the
The same as Fig.
Figure
Figure
The QPOD, QCM, and QCSI categorical metrics at 90 % for 2008
As noted in Sect.
Figure
The QPOD of the IRI-2016 model for the four seasons evaluated at the 10th, 25th, 75th, and 90th percentiles of the TEC distribution against GPS-TEC during the solar minima in 2008 (left panels) and the solar maxima in 2013 (right panels).
Statistical parameters of the QPOD for all seasonal variations of the solar minima in 2008 and the solar maxima in 2013.
In 2013, the solar maximum year, the QPOD characteristics are similar to those of 2008 for all the seasons but with a notable improvement at the lower ends of the distribution
for the two equinoctial seasons (see Figs.
Figure
The same as Fig.
Figure
The same as Fig.
Statistical parameters of the QCSI for all seasonal variations of the solar minima in 2008 and the solar maxima in 2013.
In this paper, the performance of the IRI-2016 model in simulating GPS-TEC is assessed by employing the RMSE, bias, MAE, NMBF, NMAEF, skill score, and correlation as well as categorical metrics such as the QPOD, QCM, and QCSI during two distinct solar activity periods. The IRI-2016 model simulations are based on the configuration that uses the latest developments.
The correlation between the model and individual measurements at GPS stations worldwide, averaged within a given geomagnetic latitude band, ranges from 0.1 to 0.9 in 2008 and from 0.3 to 0.91 in 2013. The RMSE generally decreases from the geomagnetic equator towards the poles with a few exceptions during both 2008 and 2013, implying that the IRI-2016 model exhibits poor performance in capturing observed TEC over the tropics. The IRI-2016 TEC is biased high by up to 10 TECU over the tropics with respect to TEC at the individual GPS receiver stations and averaged within the latitude bands during 2008. This figure is almost two-fold higher in 2013. However, the IRI-2016 bias over mid- and high latitudes is negative with respect to the observed TECs at the GPS receiver sites, which is in agreement with some previous studies.
The skill score of the model with respect to the reference model that simulates mean TEC is assessed based on the MAE. The IRI-2016 model is found to perform worse than the reference model over high latitudes during the low (high) solar activity year 2008 (2013). Moreover, the model skill is worse than the reference climatological model during December (June) in the Southern (Northern) Hemisphere. The observed relatively good skill of the IRI-2016 model in the tropics and during low solar activity is in agreement with previous studies. Nevertheless, the traditional performance measures exaggerate the difference in the skill of the model during different solar activity periods, as noted from minor differences in the NMBF and NMAEF during the two periods.
Investigation of the selected longitude sectors indicates that the IRI-2016 model is biased low at both the low and high tails of the TEC distribution, suggesting that IRI-2016 is capable of satisfactorily simulating the mean TEC globally. The longitudinal and seasonal variations in the performance of the IRI-2016 model can be explained in terms of the wave number four patterns as well as the difference in the solar insolation, neutral composition, and the direction of neutral meridional winds during these seasons. The dependence of the IRI-2016 performance on the solar zenith angle and magnetic local time reveals that the model requires further tuning of some of the ionospheric parameters used in the formulation of the bottom- and topside ionosphere. The extent of the IRI model strengths and weaknesses at the extreme portions of the observed TEC is assessed by categorical statistical metrics such as the QPOD, QCSI, and QCM using the 10th and 25th percentiles as lower margins and the 75th and 90th percentiles as upper margins of the TEC distribution for the two distinct solar activity periods. The performance of the IRI-2016 model based on individual GPS receiver measurements for the months of March, June, September, and December and gridded IGS GPS-TEC for the seasonal time series was evaluated using these thresholds. The model generally has reasonable skill at the low ends of the TEC distribution over most of the globe. This skill weakens at the high ends of the TEC distribution over much of the globe except for EIA crest regions during both solar activity years. There is also hemispheric symmetry during the June and December solstices with poorer performance over the summer hemisphere at the high extremes of observed TEC. This feature is consistent with the high RMSE and low bias in the model during summer compared with winter.
Similarly, the robust skill at the low ends of the observed TEC distribution can be attributed to the fact that low TECs, which constitute the low portion of the TEC distribution, are mainly observed during nighttime, whereas those at the high ends of the distribution occur during daytime.
In summary, the IRI-2016 model, which itself is a climatological empirical model, has simulated a significant portion of the observed TEC over the tropics with better accuracy during both solar activity periods and the different seasons than a hypothetical model that only captures the seasonal mean TEC. The model performance at the extreme ends of the distribution is also remarkably good. In particular, the IRI model skill in detecting observed TEC over EIA crest regions at the extreme ends is robust despite the high RMSE. Therefore, this encouraging IRI-2016 model performance at the extreme ends of the observed TEC distribution suggests the importance of further work to improve the model so that it can be used for real-time operational forecasting.
Station-level GPS-TEC used in the study is available from the authors on request.
Conceptualization was by GMT and MMZ; investigation was by GMT and MMZ; data processing was done by GMT; the methodology was by GMT and MMZ; writing of the original draft was done by GMT and MMZ; writing, reviewing and editing were done by GMT.
The authors declare that they have no conflict of interest.
We are highly grateful to NASA for free access to the IRI-2016 model and GPS data. The authors would like to acknowledge the valuable input from Andrey Lyakhov and the anonymous reviewers. Moreover, the second author extends his gratitude to Aksum University, Addis Ababa University, and the Botswana International University of Science and Technology (BIUST) for their financial support during his PhD study and research visit to BIUST.
This paper was edited by Dalia Buresova and reviewed by Andrey Lyakhov and two anonymous referees.