Comparison of GNSS integrated water vapor and NWM reanalysis data over Central and South America

We compared and analyzed data of vertically Integrated Water Vapor (IWV) from two different re-analysis models (ERA-Interim from ECMWF and MERRA-2 from NASA’s Global Modeling and Assimilation Office) with respect to IWV values from Global Navigation Satellite Systems (GNSS) at 53 stations of Central and South America during the 7-year period from January 2007 till December 2013. The comparison was performed taking into account the geopotential height differences between each GNSS station and 5 the correspondent values assigned by the models. Thus, the set of GNSS stations was divided into 3 groups: Small, Large and Critical height difference stations. Moreover, the performance of the re-analysis models was also analyzed by using an additional classification of three levels according to the mean IWV (IWV ) value expected at the station: IWV > 30 kg m−2, 12 kg m−2 6 IWV 6 30 kg m−2 and IWV < 12 kg m−2. Both models (IWVERA−Interim and IWVMERRA−2) offered a very good representation of the IWV from GNSS values 10 (IWVGNSS) for stations with a Small height difference (smaller than 100 meters). That is to say, the differences between the mean values of IWV from GNSS (IWV GNSS) with respect to the IWV averages from both re-analysis models are always below 7 % of the IWV GNSS in the worse case. In general, the discrepancies between the re-analysis models with respect to IWVGNSS raise as the geopotential height difference between the GNSS station and the static geopotential height interpolated from the models grows. Effectively, the 15 difference between IWVGNSS and IWV from the re-analysis models can be as large as 10 kg m−2 for stations with a critical height difference (larger than 500 meters). For this reason, we proposed a numerical correction that compensates the effect of the geopotential height difference and the results were tested with values from ERA-Interim. The suggested correction was successful and reduces the differences |IWVGNSS − IWVERA−Interim| to less than a 7 % of the mean IWVGNSS values. This strategy is especially recommended for stations that were classified as Critical, most of 20 them located in mountainous areas of South America. In the case of Large height difference stations, the correction procedure is not advisable either for a coastal station and/or stations in islands. Generally in those cases, two or more grid point are on the water. Thus, the interpolated IWV value for the re-analysis model will be overestimated. At one hand, if the geopotential height of the model is smaller than the geopotential height of the GNSS station, the subtracting numerical correction would compensate this overestimation of IWV near the water and thus the strategy will represent an improvement. On the other 25

model, the standard deviation of the difference RCA-GPS resulted 3 times larger than the subtraction ERA-Interim minus GPS.The IWV difference for individual sites varies from -0.21 up to 1.12 kg m −2 and the corresponding standard deviation is 0.35 kg m −2 .In this work, the authors also highlight that the models overestimate IWV for sites near the sea.Bordi et al. (2014) studied global trend patterns of a yearly mean of IWV from ERA-20CM and ERA-Interim.The authors highlight a regional dipole pattern of inter-annual climate variability over South America from ERA-Interim data.According to this study, the Andean Amazon basin and Northeast Brazil are characterized by rising and decreasing water content associated with water vapor convergence (divergence) and upward (downward) mass fluxes, respectively.Besides, the authors also compared IWV from ERA-Interim with the values estimated at 2 GPS stations in Bogotá and Brasilia.Such comparison on monthly timescale made known a systematic bias attributed to a lack of coincidence in the elevation of the GPS stations and the model grid points.Tsidu et al. (2015) presented a comparison between IWV from a Fourier Transform InfraRed spectrometer (FTIR, at Addis Ababa), GPS, radiosondes, and ERA-Interim over Ethiopia for the period 2007-2011.The study is focused on the characterization of the different error sources affecting the data time series.In particular, from the study of diurnal and seasonal variabilities, the authors addressed differences in the magnitude and sign of IWV bias between ERA-Interim and GPS.They linked this effect with the sensitivity of the convection model with respect to the topography.Wang et al. (2015) performed a 12-year comparison of IWV from 3 third generation atmospheric reanalysis models including ERA-Interim, MERRA and the Climate Forecast System Reanalysis (CFSR) on a global scale.IWV values from the reanalysis models were also compared with radiosonde observations in land and Remote Sensing Systems (RSS) on satellites over oceans.
The authors asseverate that the main discrepancies of the 3 datasets among them are in Central Africa, Northern South America, and highlands.
In this paper, we investigate the differences between IW V GN SS resulted from a geodetic process of (GPS + GLONASS) data collected during more than 5 years in South America (Bianchi et al., 2016a) and IWV values given by ERA-Interim and MERRA-2.The comparison was performed taking into account the geopotential height differences (∆Z) between each GNSS station and the correspondent height values assigned by the models.Provided that both models showed a very good representation of the IWV values for stations with a Small height difference, we used this set of stations with ∆Z smaller than 100 meters, to deeply analyze the expected seasonal behavior according to the inter-annual mean of IWV from GNSS expected at the station.In order to take into account the differences found in IWV values from the models at stations with ∆Z larger than 100 m., we proposed a numerical correction.The strategy was tested for ERA-Interim re-analysis model and it shows to be successful.Section 2 describes the different sets of data used in this study.Follows the explanation of the methodology and the presentation of the results obtained after applying the proposed correction to IWV values from ERA-Interim.

IWV from GNSS
In this study, the GNSS data is the main source of information for the spatial and temporal distribution of water vapor.Thus, the main variable considered is the IWV estimated from the delay caused by the troposphere to the GNSS radio signals during its travel from the satellite to the ground receiver.The total delay projected onto the zenith direction (ZTD) is usually split into two contributions: the hydrostatic delay (ZHD, Zenith Hydrostatic Delay) depending merely on the atmospheric pressure and the Zenith Wet Delay (ZWD) depending mainly on the humidity.Finally, IW V GN SS can be obtained from ZWD multiplying it by a function of the mean temperature of the atmosphere.
The reference database of IW V GN SS (GPS + GLONASS) used in this study come from a geodetic process over 136 tracking stations in the American Continent placed from southern California to Antarctica, during the 7-year period from January 2007 till December 2013 (Bianchi et al., 2016b).Specifically, the data series of IW V GN SS used in this study is restricted to those 69 stations with IWV time series spanning more than 5 years.
The GNSS observations were processing at a double-difference level with the Bernese GNSS Software 5.2 (Dach et al., 2015) where all the models and conventions employed are recommended by the International Earth Rotation and Reference Systems Service (IERS).The geodetic process used Vienna Mapping Function 1 (VMF1) (Boehm et al., 2006).The ZTD were represented as 30-minutes linear piecewise estimates and compared with three solutions contributing to the International GNSS Service (IGS) for the repro2 reanalysis.The comparison of ZTD results shows the expected consistency between estimations from the homogeneous but independent analysis.Afterward, to achieve IW V GN SS estimations, it is necessary to subtract the modeled ZHD from the ZTD data in order to obtain ZWD.ZHD are computed following Davis et al. (1985) and considering observed pressure measurements from nearby GNSS stations.Finally, the IW V GN SS values every 30 minutes are obtained from ZWD by using the proportionality constant from Askne and Nordius (1987).More details of the ZTD geodetic processing and the steps to obtain the IWV values are at Bianchi et al. (2016a).

IWV from NWM
The values of columnar Integrated content of Water Vapor (IWV) as reanalysis products from ERA-Interim (Dee et al., 2011) and MERRA-2 (Gelaro et al., 2017;Bosilovich et al., 2015) were evaluated in this study.The horizontal resolutions are 0.25 • × 0.25 • for ERA-Interim and 0.625 • × 0.50 • for MERRA-2, respectively.Because ERA-Interim data is given 4 times a day, in order to perform the comparison and even if MERRA-2 gives hourly data, we pick up IWV data from MERRA-2 every 6 hours at 0, 6, 12 and 18 hours of Universal Time.
ERA-Interim is the global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts (ECMWF).It covers the period from 1979 up today and supersedes the ERA-40 reanalysis.ERA-Interim address some difficulties of ERA-40 in data assimilation mainly related to the representation of the hydrological cycle, the quality of the stratospheric circulation, and the consistency in time of reanalyzed geophysical fields (Dee et al., 2011).
MERRA-2 is the successor of The Modern-Era Retrospective Analysis for Research and Applications (MERRA) from NASA's Global Modeling and Assimilation Office (Rienecker et al., 2011).MERRA-2 represents a quality improvement compared with MERRA because of the trends and jumps linked to changes in the observing systems.Additionally, MERRA-2 assimilates observations not available to MERRA and reduces bias and imbalances in the water cycle (Gelaro et al., 2017).
To this application we used the gridded values of the vertical Integral of Water Vapor (IWV) from both re-analysis models.
Because the comparison is performed at each GNSS station, a bilinear interpolation of each gridded data set was performed.In addition, we use values of air temperature (T ) and specific humidity (q) from ERA-Interim for the calculation of the correction to the IWV values.Both, q and T , are given in 37 levels of atmospheric pressure from 1000 to 1 hPa.

Stations classification criteria
Even when both reanalysis model give gridded values of the vertical integral of the water vapor, the solution provided by each model is linked to its respective geopotential surface invariant.Usually, IWV values are interpolated from the original grid by applying bilinear interpolation.Nevertheless, elevation differences between geopotential height from each model grid and GNSS height must be addressed.Effectively, if the height of a given point from a model is located lower than the position of the receiver, the model integrates a larger column of water vapor and the opposite if the model locates upper than it.
We performed the present comparison establishing a selection criterion according to the difference of geopotential height (Z) between each reanalysis model and the GNSS height at the station.In order to compute the geopotential height of the GNSS stations (Z GN SS ) we followed Van Dam et al. (2010) algorithm.First we obtained the orthometric height at each GNSS station by correcting the ellipsoidal height with the EGM08 model (Pavlis et al., 2012).For a given GNSS station, the respective geopotential height from each of the 2 reanalysis models resulted from a bilinear interpolation of each respective gridded dataset.
Table 1 shows the geodetic coordinates as well as the climate classification of Köppen-Geiger (K-G) (Peel et al., 2007) and the |∆Z| classification for both models.Subsequently, we selected the common stations that address the adopted criteria simultaneously in both NWM.Thus the original set of 69 stations is reduced to 53 stations.Figure 2 shows the 53 GNSS stations arrangement according to |∆Z| differences with respect to ERA-Interim.

Computation of the integral correction
Once we detected the cases in which the application of a correction is necessary, we proceed to describe the proposed integral correction.It will be calculated only for one of the two tested re-analysis models.
Zhu (2014) compare the results of several reanalysis projects with independent sounding observations recorded in the Eastern Himalayas during June 2010.Among all the reanalysis models, ERA-Interim and MERRA were included.The authors analyze temperature, specific humidity, u-wind, and v-wind between 100 hPa and 650 hPa.They found that ERA-Interim showed the best performance for all variables including specific humidity the key variable to produce the integrated water vapor.Even if we tested MERRA-2, which is an improvement of MERRA, ERA-Interim is having a smaller grid.Thus, following Zhu (2014) criteria and taking advantage of a thinner grid, we used air temperature (T ) and specific humidity (q) on 37 pressure levels from ERA-Interim data to test the proposed correction.Following we describe how this correction is computed.
The starting data are the GNSS geopotential height (Z GN SS ) that is set as a reference, and the value of the geopotential height from ERA-Interim (Z model ) obtained after a bi-linear interpolation.According to our classification, these two values are not the same but may differ several hundred meters.Because the geodetic coordinates (φ, λ, h) of the GNSS station are known, we can compute the respective geopotential height as (Van Dam et al., 2010) where g 0 = 9.80665 m s −2 is the normal gravity at 45 • latitudes, the ellipsoidal height (h) is referred to the ellipsoid WGS84 and thus the radius of the ellipsoid at geodetic latitude φ is, (3) with a = 6378137m.and b = 6356752.3142m.are the semimajor and semiminor axis of the WGS84 ellipsoid, respectively (Hofmann-Wellenhof and Moritz, 2006).Moreover, the value of the gravity on the ellipsoid at geodetic latitude φ can be written as (Van Dam et al., 2010).
with e 2 = 0.00669437999014 is the first eccentricity squared of the WGS84 ellipsoid and g E = 9.7803253359m s −2 is the normal gravity at the Equator (Hofmann-Wellenhof and Moritz, 2006) and k s = 0.001931853 (Van Dam et al., 2010).
Afterward, the expression of the pressure at the geopotential height (Z) with respect to a given reference level is (Van Dam et al., 2010) where T 0 and p 0 refer to the temperature and pressure values at a reference level, R = 287.04J kg −1 • K is the gas constant and λ = 0.006499 • K m −1 is the lapse rate of the temperature, and δZ is the geopotential height difference between Z and the reference level.
Accordingly, given a Z GN SS at each instant, we have to look for the immediate upper geopotential height level from ERA Interim among the 37 available levels.We should consider that at any time the pressure value of each level is constant but it does not necessarily happen the same with the geopotential height.
Let suppose that this level is 27 that corresponds to 750 hPa. Figure 3 illustrates the example.The value of IWV provided by ERA-Interim is the result of the numerical integration of the expression (Berrisford et al., 2011).
where g 0 is the standard acceleration of the gravity at mean sea level, q(p) is the specific humidity of the air at the pressure level p and the integral is calculated from the first level (p 1 ) up to the model surface level (p s ), i.e. up to the static geopotential height (Z model ) that corresponds to the station.
Therefore, by using temperature and specific humidity values given at the 2 layers above and below the point of interest, we have to interpolate T and q at the GNSS geopotential level (Z GN SS ).Because the pressure value at Z model is not necessarily coincident with one of the given levels, we could also extrapolate T and q in the same way for Z model .
Finally, the ∆IW V is computed as the numerical integral of Eq. ( 6) between the pressure values at Z model and at Z GN SS .
This quantity could be additive if Z GN SS < Z model or subtractive if opposite.

Results
The Table 2 shows the inter-annual IWV mean values for the 53 stations of the reduced subset that fulfill the station´s selection criteria by using the |∆Z|, i.e. (Z GN SS −Z N W M ).IWV inter-annual averages were computed for GNSS (IW V GN SS ) as well as for both NWM (IW V ERA−Interim and IW V M ERRA−2 ) .Note that MERRA-2 values could be a little more dispersive because of the coarser grid.However, the correlation coefficients between IW V GN SS values and the respective ones for both NWM, are higher than 0.95 in most of the cases.

Analysis of the efficiency of the re-analysis models
In order to analyze the performance of ERA-Interim and MERRA-2, we compared both mean inter-annual averages of IWV Regarding Table 2 for Small |∆Z| stations, and focusing on ERA-Interim, the subtractions of IW V GN SS minus IW V ERA−Interim have different signs but they are smaller than 3 kg m −2 but RNNA station where it reaches 3.5 kg m −2 .On the other hand, the differences between (IW V GN SS − IW V M ERRA−2 ) never surpass 3.5 kg m −2 .Moreover, generally IW V M ERRA−2 resulted larger than IW V GN SS and that overestimation of MERRA-2 can be seen despite the sign of |∆Z|.
In general for stations classified as Small, IWV mean values from ERA-Interim are closer to mean values from GNSS than MERRA-2.Moreover, the IW V N W M disagreement from GNSS values is about a 7 % of IW V GN SS for stations with IW V > 30 kg m −2 and it remains in 7 % for stations with 12 kg m −2 IW V 30 kg m −2 .Furthermore, there is only one station that fulfill the condition IW V < 12 kg m −2 and its maximum discrepancy is with MERRA-2 reaching 6% of the IW V GN SS .
Among Large |∆Z| stations the situation also depends on the IW V expected.Thus, when IW V > 30 kg m −2 the disagreement of MERRA-2 reaches 15% of IW V GN SS while for ERA-Interim it is about 9 %.In the case of stations that fulfill the condition 12 kg m −2 IW V 30 kg m −2 , the discrepancies could reach up to 35 % and for stations with IW V < 12 kg m −2 the disagreement of both models with respect to IW V GN SS is below 40 % of this amount.For Critical |∆Z| stations the discrepancies of the NWM with respect to GNSS can reach 55% of the IW V GN SS .
In general, we can observe that the percentages of model failures grow as the height differences (∆Z) become larger.All of the above, we asseverate that the disagreement is the greatest for the stations classified as Critical.
Thus, provided that both models showed a very good representation of the IWV values for stations with a Small height difference, we will focus on such stations to analyze the seasonal behavior of each NWM with respect to IW V GN SS .The objective is to distinguish a systematic lack of agreement between NWM and GNSS, if there are any.
Figure 4 shows the seasonal stacked ∆IW V for both models.Three cases among the Small height difference stations are shown as an example for IW V > 30 kg m −2 (BELE), 12 kg m −2 IW V 30 kg m −2 (LPGS) and IW V < 12 kg m −2 (FALK).At BELE the differences from MERRA2 are always larger than the ones from ERA-Interim.Such differences also have a different sign indicating that ERA-Interim always underestimates IW V GN SS but it hardly exceeds 3 kg m −2 , while MERRA2 always overestimate IW V GN SS and the disagreement could reach 3.5 kg m −2 .For LPGS both NMW overestimate within 1 kg m −2 .Finally at FALK station both re-analysis models overestimate the inter-annual seasonal mean of IWV from GNSS although MERRA-2 values are always larger than ERA-Interim ones.As we said before, even though such a difference never exceed 1 kg m −2 , that represents about 10 % of the total amount because IW V < 12 kg m −2 .

Application of the integral correction
From the analysis of the behavior of the Small height difference stations, we can see that both NWM represent IWV from GNSS better than a 7% of the expected values in the worse case.Thus, we propose to compute a correction to the IWV values from ERA Interim only for stations classified as Large and Critical.Such a compensation have to be added (or subtracted) to the given IW V ERA−Interim values considering the sign of the height differences.Accordingly, the proposed correction will be calculated as the numerical integration of the specific humidity (q) between the geopotential height from ERA-Interim and the geopotential height of the GNSS station (see Section 3.2).
Figure 5 shows the application of the before mentioned correction procedure on two Critical height difference stations: BOGT in Bogotá, Colombia, and SANT in Santiago de Chile, Chile.These stations are selected because their ∆Z is having a different sign.As expected both curves IW V GN SS (blue solid line) and IW V ERA−Interim (green solid line) are not coincident.In the case of BOGT, ∆Z is positive, that means that GNSS station is higher to the location assigned by ERA-Interim.
Accordingly, the model integrates a thicker layer of atmosphere and thus IW V ERA−Interim values resulted larger than ones from IW V GN SS .The opposite can be seen in SANT. Figure 5 also shows us an improvement of the agreement with respect to IW V GN SS when we add the correction to the values of the IW V ERA−Interim (red dashed line).
Figure 6 shows the residuals with and without applying the integral correction.We can see that the differences (IW V GN SS − IW V ERA−Interim ), which can reach up to 10 kg m −2 , are reduced to an order of magnitude of their respective value of IW V GN SS (solid black line).
However, the application of this correction in the case of stations classified as Large should be more precautionary.This set of stations showed a heterogeneous behavior and include some cases where the application of the correction not only is unnecessary, but it can make the differences (IW V GN SS −IW V ERA−Interim ) even larger.Effectively, in these cases different shortcomings of the model overlap the height problem and therefore the proposed correction does not work.As an example of this we can mention the case of coastal and/or insular stations where 2 or more grid points will be in the ocean.In all these cases the value of IWV calculated from the bilinear interpolation will be overvalued.Let's analyze in detail the case of stations near the seashore (for example PARC in Punta Arenas, Chile) where 2 of the 4 grid points are in the ocean (see Figure 7).Also ∆Z = -117.12m in PARC indicating that the geopotential height from ERA-Interim is larger than the GNSS geopotential height and therefore the proposed correction will be additive.Besides this result, the IW V ERA−Interim resulted over-estimated by applying a bilinear interpolation that uses data points in the ocean.In conclusion, the value (IW V ERA−Interim + correction) will result larger than the IW V GN SS value that you intend to estimate.Thus, this is an example where applying the suggested correction may worsen the results.

Discussion and Conclusions
In this work, we analyzed the discrepancies between the vertically Integrated Water Vapor values provided by two re-analysis models (ERA-Interim and MERRA-2) with respect to the IW V GN SS values taken as a reference in the South and Central American continent.We performed the comparison establishing a selection criteria according to the difference of static geopotential height (∆Z) between GNSS and each reanalysis model at the station.
Several authors had been reported problems related to the elevation correction for data from the reanalysis models.The artificial bias in IWV introduced by this altitude difference was previously reported by Bock et al. (2007);Van Malderen et al.
(2014); Bordi et al. (2014) and Bianchi et al. (2016a).Moreover, this effect can also affect other variables.For instance, Gao et al. (2012) studied the height corrections for the ERA-Interim 2m-temperature data at the Central Alps and they also found large biases that must be corrected in mountainous areas.
For the above, an integral correction was proposed that compensates the effect of the geopotential height difference between GNSS and the interpolated grid point in the reanalysis model and the results were tested with the respective ones from ERA-Interim.The correction is computed as the numerical integration of the specific humidity where the integral limit is a pressure difference at δZ (see Eqs. 5 and 6 ).
For the Small height stations MERRA-2 mostly exhibits the larger discrepancies, i.e.IW V GN SS − IW V M ERRA−2 > IW V GN SS − IW V ERA−Interim , and this could be a consequence of a coarser horizontal grid used to the bilinear interpolation of data.Moreover, MERRA-2 generally overestimates Both for Small and Large |∆Z| stations the discrepancies between the NWM and GNSS can be analyzed depending on the IW V expected, but anyway the differences rise as the |∆Z| grows.For IW V > 30 kg m −2 the disagreement of the NWM with respect to GNSS is 7% for Small |∆Z| stations but it rise up to 15 % of IW V GN SS for Large stations.If 12 kg m −2 IW V 30 kg m −2 , the disagreement of the NWM goes from 7% for stations classified as Small up to 35% for Large |∆Z| stations.Finally, for IW V < 12 kg m −2 the percentage of disagreement is always lower than 40 % of IW V GN SS in the worse case,i.e. for Large |∆Z| stations.
For Critical |∆Z| stations the discrepancies of the IWV from NWM with respect to IWV from GNSS can reach 55% of the expected values.
All of the above, we proposed the numerical correction only for the Large and Critical stations.The suggested improvement was successful reducing the differences between IW V GN SS and IW V ERA−Interim from typical values of 10 kg m −2 to an order of magnitude of their respective value of IW V GN SS .The correction is especially recommended for stations that were classified as Critical, most of them located in mountainous areas of South America.

Figure 1 .Figure 2 .
Figure 1.Example of geopotential height differences used to classify GNSS stations.ZNW M results from a bi-linear interpolation of the gridded data.A, B, C and D are the four grid points of the NWM around the GNSS station.

Figure 3 .
Figure 3. Scheme of the applied correction to the IWV from ERA-Interim reanalysis.

Table 2 :
Inter-annual mean of IWV (IW V * in [kg m −2 ]) for stations classified as Small, Large and Critical height difference.SD refers to the standard deviation.∆Z [m.] refers to the difference between the geopotential height of the GNSS station and the bi-linear interpolated value of the geopotential height from each NWM.

Table 2 :
Inter-annual mean of IWV (IW V * in [kg m −2 ]) for stations classified as Small, Large and Critical height difference.SD refers to the standard deviation.∆Z [m.] refers to the difference between the geopotential height of the GNSS station and the bi-linear interpolated value of the geopotential height from each NWM.