Assimilation of radar altimeter data in numerical wave models: an impact study in two different wave climate regions

Abstract. An operational assimilation system incorporating significant wave height observations in high resolution numerical wave models is studied and evaluated. In particular, altimeter satellite data provided by the European Space Agency (ESA-ENVISAT) are assimilated in the wave model WAM which operates in two different wave climate areas: the Mediterranean Sea and the Indian Ocean. The first is a wind-sea dominated area while in the second, swell is the principal part of the sea state, a fact that seriously affects the performance of the assimilation scheme. A detailed study of the different impact is presented and the resulting forecasts are evaluated against available buoy and satellite observations. The corresponding results show a considerable improvement in wave forecasting for the Indian Ocean while in the Mediterranean Sea the assimilation impact is restricted to isolated areas.


Introduction
Coastal and ocean areas are nowadays exposed to an increasing pressure from ship traffic, aqua culture, offshore exploration and tourism globally. On the other hand, improved knowledge and monitoring capabilities for coastal and offshore regions are highly required with respect to wind and wave climates. One of the most reliable ways to accomplish such a goal is the use of operational numerical wave models.
Correspondence to: G. Kallos (kallos@mg.uoa.gr) Indeed, a significant part of operational and research activities is based today on such models, to provide regional or global forecasts. However, the quality of the final outputs strongly depends on local characteristics, initial conditions, as well as atmospheric data used as input (Lionello et al., 1995).
A widely known development in this framework is the use of assimilation techniques for the correction of wave model initial fields. However, the lack of a sufficient number of wave observations was always a serious drawback for such efforts, not only operationally but also for the establishment of the necessary background information (Bouttier and Courtier, 1999). A recent advancement that contributes to the solution of this problem is the use of satellite records. Ocean wind and wave measurements from satellites, combined with global wave and atmospheric numerical models have changed the way of obtaining ocean wave information, for both operational and climatological purposes. Through data assimilation in operational numerical models, satellite sea-surface observations are contributing to improved shortterm wave forecasts. On the other hand, satellite wind observations assimilated into atmospheric models contribute indirectly, by improving the atmospheric forecast and, hence, the wind forcing into the wave models.
In this work, the results from the implementation of the assimilation techniques suggested by Breivik and Reistad, 1994 at the Norwegian Meteorological Institute, as utilized in the framework of the EU project ENVIWAVE 1 , are presented. The adopted method is based on a modification of the traditional successive correction methods (Cressmann, 1959) for the implementation of wave height measurements into numerical wave models. The satellite data used as input to the assimilation scheme are coming from the ENVISAT-ESA Radar Altimeter 2 (RA-2) instrument.
Previous work on assimilation of satellite data into numerical wave models can be found, among others, in Janssen et al. (1987), Lionello et al. (1992Lionello et al. ( , 1995, Greenslade (2001), Aarnes (2003), Abdalla et al. (2005), Skandrani et al. (2004), where the theoretical aspects as well as the general benefits of such methods are presented. The main aim of this paper is the investigation of the specific performance of an operational assimilation system in two different wave climate regions: the Mediterranean Sea, a wind sea dominated area with complicated topographic characteristics, and the Northern Indian Ocean (referred from now on as the Indian Ocean), an open swell-dominated sea. These differences in the domain characteristics result in a qualitatively different assimilation impact, both spatially and temporally. A second, but also important novelty of the present work is the use of high resolution atmospheric and wave models, which lead to analytical and detailed conclusions for each area of interest.
The paper is organized as follows: Sect. 2 briefly describes the wave model and the assimilation method. In Sect. 3, details of the domain characteristics and satellite data are presented. Analytical impact studies as well as statistical results of the assimilation scheme performance in both areas of interest are discussed in Sect. 4, while the main conclusions are summarized in Sect. 5.

Model description
The wave model used in this study is WAM cycle 4 (WAMDI group, 1988;Komen et al., 1994). This is a third generation wave model which solves the wave transport equation explicitly without any assumptions on the shape of the wave spectrum. It represents the physics of the wave evolution in accordance with our knowledge today for the full set of degrees of freedom of a 2-D wave spectrum. Details of the model configuration used in the present work are described in Sect. 3.
Corrected analysis fields for WAM are provided by an assimilation method developed by Breivik and Reistad (1994) at the Norwegian Meteorological Institute. The analysis scheme is based on the method proposed by Bratseth (1986), which is an improvement of the "classical" successive correction methods (Cressmann, 1959). The basic idea is the same as that for the so-called statistical interpolation (SI), which is widely used in atmospheric and wave models in several meteorological centers (e.g. ECMWF, Meteo-France, UK Met. Office, Hellenic National Meteorological Service, etc.). Bratset's scheme converges toward the results of SI by setting a proper choice of parameters for the analysis weights. This scheme is very cost efficient and easy to implement.
The analysis starts with the direct numerical model output for Significant Wave Height (SWH p ) as a first guess field. This is corrected by the use of any available corresponding observations (SWH o ). The method is based on the following two iterative equations: where a ij =(m ij +d ij )/M j , a xj =m xj /M j are the analysis weights. Here, subscripts i, j refer to observation points, x to grid points, superscripts O, P and A to observed, first guess, and analyzed values. N is the number of observations and k an iteration counter. The iterations start by setting (spatial interpolated, based on a bilinear method, applied only to analysis). The coefficients m ij and d ij are model and observation error covariances, respectively, which are assumed to be uncorrelated to each other. M j is a function of m ij and d ij , calculated for each observation, so that the above system of equations converges. More precisely, m ij are calculated by: where E P i is the standard deviation of the model first guess.
The parameter a P is assumed to be a P (r ij )= exp(−0.5·r 2 ij /b 2 ), where r ij is the distance between the observation points and b the horizontal scale of the first guess field error correlation determining the scale of the observation's influence. The observational error covariances have the form: where E O i is the standard deviation of the observations. In most of the cases, the matrix a O (r ij ) is assumed to be unitary, accepting in this way that the observations are unbiased and uncorrelated.
The described method leads to a corrected SWH field. To take advantage of this in the model, one needs to update the total wave model spectrum. This is attained by the division of the wave spectra into a swell and a wind sea part, which are scaled according to the analysed SWH. This separation is achieved by checking if the wave energy is generated by local winds or propagated from remote grid points. We note that the updating of the model spectra is based on the assumption that the first guess relation between windsea and swell is correct. However, we have to mention that this hypothesis is a limiting factor in altimeter wave height assimilation. Moreover, to avoid imbalance in the model, the wind at the analysis time is updated accordingly. A more detailed description of the method is given in Breivik and Reistad (1994).

The case study
In this work, WAM is used in two different wave-climate regions: the Mediterranean Sea, a wind sea dominated area, and the Indian Ocean, a swell dominated one. The two model domains are illustrated in Fig. 1. A global version of the model is used in order to initialize and nudge the lateral boundaries of the Indian Ocean domain. All the versions ran on high resolution set up, in a deep water mode without refraction, providing operationally daily forecasts. For the integration in the global domain, 28 frequencies and 24 directions were used, while the first integration frequency was 0.05 Hz and the propagation time step 1200 s. The same configuration holds for the nested domain of the Indian Ocean, with a propagation time step of 600 s. On the other hand, for the Mediterranean Sea, taking into account the shorter scale of the prevailing atmospheric and wave systems, the number of frequencies was limited to 25 and the first one was set to 0.1 Hz. The time step in this case was 300 s. Different atmospheric inputs (wind speed and direction 10 m above surface) were used for the above domains. The global model is driven by wind fields of the NCEP/GFS model that has a horizontal grid resolution 1.0×1.0 degrees. The nested Indian Ocean as well as the Mediterranean Sea use SKIRON/ETA weather prediction system (Kallos, 1997;Papadopoulos et al., 2001) which runs operationally once a day (with 12:00 UTC initial conditions) at the University of Athens, providing 5-day forecasts 2 . This system uses NCEP/GFS 1.0×1.0 degree resolution fields for the initial and boundary conditions. The necessary surface boundary conditions over the sea are interpolated from the 0.5×0.5 degree SST (Sea Surface Temperature) field analysis retrieved from NCEP on a daily basis. Vegetation and topography data are applied at a resolution of 30 s and soil texture data with resolution of 2 min. Further details of the models' configuration are summarized in the Table 1.
Both the atmospheric and wave models use the same high resolution in order to avoid interpolation issues and to ensure a better description of short scale atmospheric and wave phe-2 http://forecast.mg.uoa.gr  nomena (e.g. local coastal atmospheric circulations, island wave effects, etc.).
The assimilation scheme described in Sect. 2 was applied to both areas of interest. The number of iterations used was k=5 (as defined in Sect. 2) and the de-correlation radius b (grid distances) was defined to be 4. The maximum influence radius of assimilation was 8 grid points (this is only a practical limit, set when the correlation following this expression is close to zero). The maximum influence and decorrelation radius are compatible with the wave model resolution, since they are defined as grid point distances. In our case, these parameters are chosen so as to reflect the scales of the atmospheric forcing wind field at the sea surface and to be representative of the wave system characteristics prevailing and influencing each area. More precisely, according to the above definitions, for the Mediterranean Sea the decorrelation radius was 40 km and the maximum influence radius 80 km, www.ann-geophys.net/25/581/2007/ Ann. Geophys., 25, 581-595, 2007 where in the Indian Ocean they were 100 and 200 km, respectively. The standard deviation of the model was set to E p =1.0 m and the observation error to E 0 =0.5 m. These values were set based on the assumption that synoptic wind systems and the generated wave ones are of similar scale. This assumption is required due to the operational status of the present study.
Observations from the Ku-band of ENVISAT-ESA Radar Altimeter 2 (RA-2) were used in the assimilation cycle. This instrument determines the two-way delay of the radar echo from the Earth's surface to a very high precision (less than a nanosecond) with a footprint size of 1.7 km. In this way, the determination of wind speed and significant wave height is achieved. In particular, the significant wave height is calculated from the radar echo leading edge slope, which refers to the gradient of the leading edge of the echo. According to previous studies, the expected accuracy for the measurement of wave height is better than 0.25 m (see ESA SP-1224; Soussi, 2006). Some basic technical characteristics of this platform are presented in Table 2.
In both areas of interest the altimeter data were quality controlled according to Brattli (2003). More precisely, the basic filtering criteria used were the following: -Significant wave height observations should be between 0.5 and 20 m.
-Altimeter data standard deviation should not exceed a given interval (0.01-1 m).
-The difference between the model first guess and the RA2 record must be less than 5 m.
The restricted standard deviation interval (second criterion) ensures that extremely variable measurements will be excluded. The remaining variability of the altimeter data is desirable and necessary in order to better exploit the high resolution simulations.
For the Mediterranean Sea the average number of assimilated observations per forecasting cycle, passing this quality control, was between 250 and 300. The corresponding number for the Indian Ocean was 550-650. It is obvious that such an amount of "information" in open seas or oceans, where other types of observations are rare, is a valuable tool. The assimilation window in both areas was determined to be 12 h. More precisely, all observations available between 12:00-24:00 UTC are used. Note that the assimilation of these records is not taking place at a specific time. Since our system is running operationally, the assimilation procedure is active during the whole 12-h window, so that every recorded observation is assimilated at the closest model time step. In this way, corrected fields at 00:00 UTC are obtained which are used as analysis for the forecasting period from 00:00 to 12:00 UTC of the next day.

Results
The wave forecasting systems, described in the previous section, ran for a period of two months during November-December 2004 for the Indian Ocean and December 2004-February 2005 for the Mediterranean Sea. Some separate test cases were also performed in order to study the impact of the assimilation scheme under different weather conditions and will be described in the following. The first five days were used so as to reach a well established sea state.
The statistical analysis used in the present work was based on the following parameters: -Mean Difference and Root Mean Square Error of the Difference (RMSE) calculated between WAM, with and without assimilation, -Mean Difference, RMSE, Correlation Coefficient and Standard Deviation between the two versions of WAM (with and without assimilation) and satellite (RA2) measurements.
In this way, the quantitative impact of the assimilation in wave forecasts, as well as the quality of the analyzed wave fields, are estimated. The same statistical parameters were also employed for model evaluation against buoy observations.
Before proceeding with the discussion of the results, it is important to undeline that the two domains selected (the Mediterranean Sea and the Indian Ocean) are characterized by different physiographic characteristics, synoptic weather conditions and wave climatology. The long period waves (swell) are the dominant mode of the sea state in the Indian Ocean, since it can receive waves that have travelled even from Antarctica. On the other hand, in the Mediterranean Sea, wind-generated waves are significant and the wind regime in winter is more unstable. These facts are illustrated in Fig. 2, where the swell percentage of the total sea state, as well as the mean wave (spectral) period at each grid point for the experiment time, is presented. It becomes obvious that the maximum simulated mean wave period in the Mediterranean Sea is less than 6 s while this value is the minimum one for the Indian Ocean, where a typical wave period  fluctuates between 7-10 s. This difference between the two areas is further supported by the fact that the swell percentage in the Indian Ocean exceeds 80%, reaching in several cases even 100% of the total wave energy, whereas in the Mediterranean it is always less than 80%.
However, it has to be noted that during the boreal winter, for a large portion of the time, relatively steady northeasterly winds are prevailing in the northern Indian Ocean. This generates well developed windsea, which is more evident at the eastern coast of India. The model interprets such situations as swell. This fact creates problems in the identification of young windsea when swell is present (see Janssen et al., 2005;Bidlot et al., 2007) contributing also in the state represented at Fig. 2.
These different characteristics in the two domains, along with the fact that the assimilation analysis does not correct the wind input from the atmospheric model but directly the significant wave height, favors a limited influence of the assimilation scheme in the Mediterranean Sea (see also Lionello et al., 1992).
Concerning the initial relation between satellite data and the wave model with no data assimilation, it is worth noticing that it differs significantly in the two areas of interest. In particular, in the Mediterranean Sea, the corresponding linear correlation is significantly higher than that of the Indian Ocean. However, significant wave height is strongly underestimated at high values in the first area (Fig. 3). This is a reason that potentially may limit the impact of the assimilation scheme in the first domain. Indeed, the corresponding results for the wave model with the assimilation module activated show a trivial influence in the Mediterranean area (cf. Table 3). Focusing now on the results of the assimilation, the way that it modifies the model energy spectrum is investigated. For this reason, several cases of single-observation inputs in WAM were studied. The corresponding results were almost identical in all tests and a typical case is presented here as an example. More specifically, a grid point was selected where the WAM first-guess was 0.7 m, whereas the analyzed wave height was estimated to 0.9 m. In this case, the directional and the frequency energy distribution remained unchanged after the data assimilation, increasing, however, the total energy amount (Figs. 6-8). This was an expected outcome, since only significant wave height is assimilated and therefore no other directional or frequencial information is available. The assimilation impact lasted almost 6 h, as clearly indicated in Figs. 5-7. However, this period becomes longer for higher differences between model outputs and satellite records while the general attribute remains similar.
In the sequel, an assimilation impact factor is derived by comparing the deviations between the results of WAM with the data assimilation and the reference run -no assimilation -denoted as ASSIMDIF (y-axis), to the deviations between satellite measurements and the model with no data assimilation -SATDIF (x-axis), as shown in Fig. 9. The large number of low values for ASSIMDIF in the first plot (Mediterranean) is partly due to the previous mentioned high correlation between satellite measurements and the model's results without data assimilation. Moreover, some extreme differences between satellite data and model outputs seem to be smoothed by the assimilation. On the contrary, in the area of the Indian Ocean, the adjustment of the values to the diagonal line indicates that the activation of the assimilation module reduces the initial differences emerging between observations and forecasted results.
A detailed study concerning the assimilation impact at each model grid point, as well as its evolution in time follows. More precisely, in Figs. 10 and 11, the average RMSE and mean difference between the results of the two models (with and without the assimilation) for the whole testing period and for different forecasting hours are presented. Concerning the time evolution of assimilation impact, the maximum deviations seem to emerge at 18:00 UTC in the Indian Ocean and at 00:00 UTC in the Mediterranean ("assimilation time" for each domain, respectively), due to the different time that satellite tracks pass over these areas. This influence in the Mediterranean Sea is reduced after a period of less than 12 h, while in the Indian Ocean it remains significant for the whole forecasting period as a result of the "longer memory" of swell waves. Moreover, this impact seems to be translated    a. b.
d.   to northern latitudes due to the usually prevailing southerly waves.
On the other hand, it is clear that in the Mediterranean Sea the assimilation influence is quite local, while the whole Indian Ocean domain is affected. Additionally, the maximum values of both statistical parameters used are lower in the Mediterranean (RMSE<0.3 m, −0.2 m < Mean Difference <0.1 m) compared to those of the Indian Ocean (RMSE<1.4 m, 0 m < Mean Difference <1.0 m). This is mainly due to the local characteristics of the Mediterranean basin: closed sea with well separated sections where a possible correction in one area remains restricted there and does not propagate. On the contrary, the fact that the Indian Ocean is swell-dominated, as well as the increased initial differences between RA2-altimeter and model results without assimilation, led to an extended impact in this area. At the same time, a number of extreme differences which emerged between the model without assimilation and satellite records in the Indian Ocean (as shown in Fig. 3)   almost doubled (Fig. 4). These differences were due to a 3day failure in the atmospheric and wave forecasting as a result of a misplaced forecasted storm. However, the use of the assimilation scheme reduced these differences, thus leading to the conclusion that the data assimilation has made up for the shortcomings in the wave model results.
Focusing now on the average statistical results concerning the entire domain of each case, Table 3 illustrates that altimeter data assimilation procedure had no obvious impact in the Mediterranean Sea. Note that the statistical parameters used refer to RA2 observations against the closest WAM grid point at the assimilation time. On the contrary, in the Indian Ocean, the initially observed differences between satellite measurements and model outputs were reduced and their correlation becomes remarkably higher. It is also worth noticing that the low initial correlation between RA2 measurements and WAM outputs in the Indian Ocean is partly due to the short period of model failure which was referred earlier.
In order to further evaluate the wave model forecasts, a detailed comparison against available buoys is presented. For the Mediterranean Sea, the buoy data have been provided by the Spanish Puertos del Estado (P.E.) and the Hellenic Center for Marine Research (HCMR). Their location is shown in Figs. 12,13 and      that the number of satellite measurements passing the assimilation quality control is limited for areas close to coasts or islands, as in the Aegean Sea. However, whenever sufficient altimeter measurements are available, the benefit from the assimilation impact is clear, although temporary (as shown in Fig. 14), significantly improving the underestimated -in general -WAM forecasts. For the domain of the Indian Ocean, two buoys have been employed, as provided by the National Institute of Ocean and Technology (NIOT) of India. Their position and geographic coordinates are illustrated in Fig. 15 and Table 5, respectively. The comparison between the two versions of WAM (with and without assimilation) and these buoys show a clear improvement in the performance of the model. More precisely, the mean error of the results is reduced significantly by the use of altimeter data. The relatively low correlation coefficient may be attributed to the small significant wave heights dominating during the analysis period. The corresponding time-series, along with statistical results are presented in Fig. 16.

Conclusions
The performance of a data assimilation scheme using altimeter-RA2 records in high resolution versions of the WAM model has been tested in two different wave climate regions: the Mediterranean Sea and the Indian Ocean. The local characteristics of these domains affect the quality as well as the amplitude of the assimilation impact. More precisely, the influence of the assimilation scheme in the Mediterranean Sea is limited mainly due to: -the lack of a sufficient number of observation data. This is a result of the fact that satellite measurements in coastal areas are usually of low quality and, therefore, fail to pass the quality controls, -the fact that the assimilation analysis scheme does not correct the wind input from the atmospheric model but directly the forecasted significant wave height. This is a disadvantage of the method in wind-sea dominated areas, like the Mediterranean Sea, -the local characteristics of the Mediterranean basin: it is a closed sea with complicated coast line, several islands and well separated sections where a possible (assimilation) impact in one area remains restricted and diminishes there, -the high correlation of the filtered satellite records with model first guess.
However, the final forecasting quality of the wave system (WAM+assimilation module) is at a very satisfactory level. On the other hand, for the Indian Ocean, the large number of available satellite observations, combined with the fact that this is a swell dominated area, leads to an increased assimilation impact extended to the entire domain. In particular, the assimilation of altimeter data leads to: -the reduction of model error, -the elimination of any possible significant differences between the WAM first guess and satellite records.