Geophysicae Applications of Kalman filters based on non-linear functions to numerical weather predictions

This paper investigates the use of n n-linearfunctions in classical Kalman filter algorithms on the improvement of regional weather forecasts. The main aim is the implementation of non linear polynomial mappings in a usual linear Kalman filter in order to simulate better non linear problems in numerical weather prediction. In addition, the optimal order of the polynomials applied for such a filter is identified. This work is based on observations and corresponding numerical weather predictions of two meteorological parameters characterized by essential differences in their evolution in time, namely, air temperature and wind speed. It is shown that in both cases, a polynomial of low order is adequate for eliminating any systematic error, while higher order functions lead to instabilities in the filtered results having, at the same time, trivial contribution to the sensitivity of the filter. It is further demonstrated that the filter is independent of the time period and the geographic location of application.


Introduction
It is well known that Numerical Weather Prediction (NWP) models usually exhibit systematic errors in the forecasts of certain meteorological parameters especially near the surface.This drawback is a result not only of the shortcoming in the physical parameterization, but also of the inability of these models to handle successfully sub-grid scale phenomena.The model horizontal resolution associated with smoothing/averaging the orographic and landscape characteristics leads to weak representation of local effects on the airflow (e.g.systematic underestimations in the wind speed).

Increasing the model resolution may provide considerable
Correspondence to: G. Kallos (kallos@mg.uoa.gr)improvement in the representation of smaller scale flow characteristics.Nevertheless, an open question remains as to whether the use of higher resolution limited area models improves the forecast skill considerably, while, even if this is true, it is still uncertain whether such improvement compensates for the usage of increasing computational resources required for these applications (Mass et al., 2002).In addition, inaccuracies in the NWP model forecasts may also be due to possible errors in the initial and lateral boundary conditions as well as to interpolations to areas that are not close to model levels.
In order to reduce the influence of the above mentioned drawbacks in the final output of a NWP model, a variety of approaches based on statistical methods has been used.Most of them are derived from Model Output Statistics (MOS), which are able to account for local effects and seasonal changes.However, discrepancies have been found in MOS applications in cases of short time local weather changes or updates of the atmospheric model in use (see e.g.Landberg, 1994;Joensen et al., 1999).
One of the most successful approaches to this problem is the use of Kalman filters (Kalman, 1960;Kalman and Bucy, 1961;Bossanyi, 1985;Persson, 1990;Dragulanescu, 1993;Kalnay, 2002;Galanis and Anadranistakis, 2002;Crochet, 2004;Giebel, 2000).They consist of a set of mathematical equations that provides an efficient computational solution of the least square method.In practice, the Kalman filter is the statistically optimal sequential estimation procedure for dynamic systems.Observations are recursively combined with recent forecasts using weights that minimize the corresponding biases.
The main advantage of this statistical methodology is the easy adaptation to any alteration of the observations as well as the fact that it may utilize short series of background information.Kalman filters can be widely used for pure meteorological purposes as well as for several other applications, e.g.wind power prediction (c.f.ANEMOS G. Galanis et al.: Applications of Kalman filters based on non-linear functions project 1 ).The structure of Kalman filter algorithms is more suitable to describe linear procedures.For this reason, their application on meteorological parameters that follow a nonlinear or discontinuous behavior is always dubious.
The aim of this paper is twofold: At first, a new way for the encapsulation of non linear polynomial functions in a classical linear Kalman filter algorithm is proposed.In this way, a more convenient method for the description of the major part of meteorological parameters is obtained.Thereafter, a detailed study of Kalman filtering application to NWP data, leading to the clarification of the optimum polynomial for such type of simulations is illustrated.The results are based on NWPs and observations of temperature and wind speed obtained at two different locations in south Europe and for different time periods.Further, the obtained optimal filter is applied to a longer data set (of one year) for which the improvement of the direct model output is revealed.

Description of the methodology
This section initially describes the general form of a Kalman filter using the unified notation proposed by Ide et al., 1997.The main goal is the simulation of the evolution in time of an unknown process (state vector), whose "true" value at time t i is denoted here by x t (t i ).A relevant known array (odbervations) y O i at the same time is also utilized.The change of x in time is described by: and the relation between the observation vector and the unknown one is: The matrices M i (system operator) and H i (observation operator) have to be determined before the application of the filter.This is also the case for the covariance matrices Q(t i ), R(t i ) of the Gaussian with zero mean (by assumption) and independent random vectors η(t i ) (noise process) and ε i .
The Kalman filter provides a method for the recursive estimation of the unknown state x t based on all observation values y O up to time t i .A first (forecast) step of the state vector and its error covariance matrix P, based only on the previous' time step analysis values, is given by: This is followed up by an update (analysis) step in which the observation available at time t i is blended with the previous information: 1 http://anemos.cma.fr,http://forecast.uoa.gr/anemos/ where is the Kalman gain that arranges how easily the filter adjusts to possible new conditions.Note that the superscripts o, t, f , a denote observations, true, forecast and analysis value correspondingly.Moreover, T , −1 denote the transpose and the inverse matrix while I stands for the unitary matrix.
Equations ( 1)-( 5) update the Kalman algorithm from time t i−1 to t i .
During the last years, Kalman filters of the above type have been used for meteorological purposes and especially for improving weather forecasts.However, the linear form of the algorithm as well as the special type (non continuous in some cases) of wind speed time series, are important drawbacks for such applications on the prediction of wind parameters, affecting significantly the final outcome.Therefore, while the application of Kalman filters for the improvement of local air temperature predictions seems, in most cases, to be successful, analogous work for wind speed predictions may lead to poor results.Some previous work on this subject can be found in Bossanyi, 1985;Giebel, 2000.Our approach is based on the non linear correction of forecast bias using Kalman filters.In particular, it focuses on the study of a single meteorological parameter in time, based on the estimation of the bias of this parameter as a function of the forecasting model direct output.Specifically, let m i denoting the direct output of the model at time t i referring on one parameter (temperature or wind speed) that we estimate at each case.Let also y O i be the bias of this forecast.We realize it by means of m i as a polynomial: where the coefficients (a j,i ), j=0,1,. . .,n-1, are the parameters that have to be estimated by the filter and ε i the Gaussian, non systematic, error of the previous procedure.Note here that y O i , ε i are scalar variables contrary to the general case Eq. (2) where they are vectors.
We consider, in other words, as state vector the one formed by the coefficients (a j,i ), namely, x(t i ) = a 0,i a 1,i a 2,i . . .a n−1,i T , as observation the (scalar here) bias y O i , as observation matrix H i = 1 m i m 2 i . . .m n−1 i and as system matrix the identity matrix I n .In this way, the system and observation equations take the following form: This estimation may be of a linear form by choosing as order n=2, or a polynomial one of arbitrary degree.We succeed in this way to study non linear procedures, and therefore to improve the numerical predictions of several meteorological parameters of non-linear behavior.
The covariance matrices Q(t i ), R(t i ) are defined explicitly in Sect.3.

The optimal filter
The algorithm presented in the previous section offers the advantage of employing non-linear functions for the elimination of the systematic error in the forecast of a specific meteorological parameter.Moreover, one can use any natural number (at least theoretically) as the order of non-linearity, increasing in this way the sensitivity of the filter.The question then rising is: "To which extent the increase in the order of the polynomial in use influences positively the performance of the filter?", in other words, "Is there an "optimum" polynomial that ensures the maximum credibility of the filter?" In this section the application of different order polynomials is examined in order to extract the optimum order based on the best performance of the filter in improving the NWP data, in combination with the minimization of the required CPU time.The NWP data used in this study were extracted from SKIRON model (see Kallos, 1997;Papadopoulos et al., 2001).SKIRON, based on the Eta model (Janjic, 1994), runs operationally at the University of Athens providing 5-day forecasts2 .It is a full physics non-hydrostatic model with sophisticated convective, turbulence and surface energy budget scheme.It has several unique capabilities making it appropriate for regional/mesoscale simulations in regions with varying physiographic characteristics.SKIRON uses NCEP/GFS meteorological data for initial and lateral boundary conditions for operational purposes at a resolution of 1 • , and SST (Sea Surface Temperature) data at a resolution of 0.5 • .Vegetation and topography data are applied at a resolution of 30 and soil texture data at 2 .The variance of the sub-grid scale topography is also taken into account.The domain of the model covers the entire Mediterranean region with a horizontal increment of 0.1 • ×0.1 • (i.e. 10 km×10 km approximately), while 38 Eta levels are used in the vertical.
SKIRON NWP data of long time periods, i.e. at least one month forecast of 5-days horizon, have been provided for wind energy production purposes in the framework of ANEMOS project.In particular, SKIRON forecasts, namely wind speed and direction, air temperature and mean sea level pressure, have been supplied for different locations in the Mediterranean where wind farms operate.For the specific case study, SKIRON NWP data, namely, air temperature at 2 m and wind speed at 10 m above the ground for the area of Alaiz, Spain (grid point used: W 1.6 The use of these extended time periods, apart from increasing the credibility of the statistical analysis, ensures also that the final performance of the proposed filters will not be seriously affected by the choice of the initial conditions.This issue is also supported by the definition of all initial values for filter parameters.More precisely, the initial value of the state vector x is zero, assuming, in this way, that the initial bias of the forecasting model in use is non-systematic: e. the identity matrix with dimension equal to the order of the filter in use), R(t 0 )=6 (a sufficiently large estimation leading to quick independence from initial conditions).The selection of these values for the variables, leads also to an initial Kalman gain that contributes to fast adaptability of the filter to "any possible new conditions Eq. ( 5)".
There is a long history of successful Kalman filter applications for air temperature (see e.g.Persson, 1990;Dragulanescu, 1993).On the other hand, the non linear evolution in time of wind speed sets under question the possibility of successful application of classical Kalman filters.In the present study the implementation of non linear functions in the classical (linear) Kalman algorithm made possible the simulation of such type of time series.
A wide range of different forecast times is used in order to avoid any possible forecasting time dependences.Our statistical analysis was based on the: -Bias of forecasted (filtered or not) values: where obs(i) denotes the recorded (observed) value at time i, for(i) the respective forecasted value (direct model output or improved forecast via the proposed filter) and k the size of our sample.Bias is a crucial parameter for Kalman filtering.Any type of Kalman methods aims at eliminating the standard error and, thus, at vanishing the corresponding biases.The fulfillment of this requirement is the main criterion ensuring the credibility of the filter.
-Standard deviation of the bias: ))) 2 , (12) The latter are objective estimators of Q(t i ), R(t i ) respectively since the variables η(t i ) and ε i , denoting the nonsystematic part of errors in Eq. 7, follow the normal distribution by assumption.
The time period for the estimation of variances was determined to 7 days after a series of tests which led to the conclusion that this short time interval is adequate to obtain significantly improved forecasts with the application of the filter.On the other hand, this choice allows fast adaptability to possible data alternations and, at the same time, does not create needs for extended data storage.
A part of these tests is presented in Fig. 1, where the filter has been used for a period of 1 year based on different time intervals for the estimation of variance matrices: 3, 7, 15 and 30 days for both meteorological parameters in study (temperature and wind speed).The results indicate that no other choice leads to significantly improved results.
Additionally, this is supported by the results of absolute bias presented in Tables 2 and 4 for Cases I and II, respectively.The significant reduction of the absolute bias for the Kalman filters of order of two to four ensures that the discrepancy between the two time series (observed and forecasted) has been considerably decreased despite any possible changes in the type of error.

Average
Any attempt to use higher order polynomials, even by restricting the corresponding Kalman gain, does not lead to an additional improvement of the sensitivity of the filter.On the contrary, several instabilities in the corresponding time series emerge (see Fig. 2), and the overall performance of the filter seems to deviate from its optimum values.These instabilities, resulting in remarkably increased coefficients of the models in use (i.e. a j,i of Eq. ( 6) obtain values greater than 100), are presented in Fig. 3 as a percentage of the total number of outputs.This figure shows the analysis of the 120 h forecast cycle for temperature and the 72 h forecast cycle for wind speed.It is obvious that any polynomial of degree greater than 5 leads to a large percentage of instabilities.This is further supported in Figs. 4 and 5 where the bias and absolute bias refer to all available data (covering any forecasting period) and all different types of Kalman filter.Moreover, the standard deviation of the bias is significantly increased in the cases of filters of order higher than 4 (see Fig. 6), a fact providing further support to the above remarks.
Based on these results, a third order polynomial Kalman filter is selected as the optimum one which was proven to eliminate the systematic error from both air temperature and wind speed predictions.The consistency of the performance  of this optimum filter is assessed for a longer time period of one year (2003) using the same statistical analysis.Figures 7 to 9 focus on the variation of bias, absolute bias and standard deviation of bias of the direct model output and the Kalman filtered output (of order three) with the forecasting period for both air temperature and wind speed.Apart from the obvious improvement in the final forecast, it is worth noticing that this positive influence remains invariant, or even improves, with the forecast time.
The time series illustrated in Fig. 2 further support the significant improvement in the model output due to the application of Kalman filtering based on low order polynomials for both air temperature and wind speed.It is obvious that the systematic error is eliminated regardless its type (underestimation or overestimation).On the other hand, the use of a higher order Kalman filter gives better results only in isolated cases, while it creates considerable instabilities when the type of bias is alternated.Note that the cases presented in Fig. 2 are characteristic and representative of the general case since similar results appeared in several different selections of the polynomial order.

Conclusions
A new methodology of implementing non-linear polynomial functions in the classical linear Kalman filter algorithms is proposed.Different order polynomials were examined, based on time series of two meteorological parameters with different type of behavior, namely the air temperature and the wind speed.The former has a long history of successful compatibility with Kalman filters, while the non continuous evolution in time of the latter may lead to important drawbacks.Our methodology showed high performance for both cases at low computational cost.More precisely, this study suggests that: -A low order polynomial, of second to fourth order, seems to be the optimal choice that guarantees the successful elimination of any type of standard bias (linear or not) strongly contributing, in this way, to a successful final forecast.
-Higher order polynomials do not increase the sensitivity of the Kalman filter in use, while they require increased CPU time.Moreover, they create, in several cases, intense discontinuities in the filtered forecast.
The credibility of the proposed method gives the opportunity for further use and applications.In this way, it can be implemented in the main forecast model as a post-processing method or it can be activated in certain time intervals during the integration, smoothing any possible temporary discontinuity that a rapid change in the time series could produce.It could be also exploited as a pre-processing method provided that an operational observation network is available.
On the other hand, there is a wide range of potential applications beyond the traditional meteorological use, e.g.wind energy predictions and forest-fire fighting operations as well as image processing.Topical Editor F. D'Andrea thanks two referees for his help in evaluating this paper.

Fig. 2 .Fig. 2 .Fig. 3 .
Fig. 2. Time series of the (a, c) air temperature and (b, d) wind speed observed, forecasted and Kalman filtered for different forecasting hours with four different polynomials Fig. 2. Time series of the (a), (c) air temperature and (b), (d) wind speed observed, forecasted and Kalman filtered for different forecasting hours with different polynomials.

Fig. 4 .Fig. 4 .
Fig. 4. Bias and absolute bias of (a) temperature and (b) wind speed for model and different Kalman filters outputs (Case I) Fig. 4. Bias and absolute bias of (a) temperature and (b) wind speed for model and different Kalman filters outputs (Case I).

Fig. 5 .Fig. 5 .
Fig. 5. Bias and absolute bias of (a) temperature and (b) wind speed for model and different Kalman filters outputs (Case II) Fig. 5. Bias and absolute bias of (a) temperature and (b) wind speed for model and different Kalman filters outputs (Case II).

Fig. 6 .Fig. 6 .Fig. 7 .Fig. 7 .
Fig. 6.Standard deviation of bias for temperature and wind speed for model and different Kalman filters outputs in (a) Case I and (b) Case II Fig. 6.Standard deviation of bias for temperature and wind speed for model and different Kalman filters outputs in (a) Case I and (b) Case II.

Fig. 8 .Fig. 8 .
Fig. 8. Variation of (a) temperature and (b) wind speed Absolute Bias with forecast time for the direct model output and after the application of order-3 Kalman filter, for the year 2003Fig.8. Variation of (a) temperature and (b) wind speed Absolute Bias with forecast time for the direct model output and after the application of order-3 Kalman filter, for the year 2003.

Fig. 9 .Fig. 9 .
Fig. 9. Variation of (a) temperature and (b) wind speed Standard Deviation of Bias with forecast time for the direct model output and after the application of order-3 Kalman filter, for the year 2003 Fig. 9. Variation of (a) temperature and (b) wind speed Standard Deviation of Bias with forecast time for the direct model output and after the application of order-3 Kalman filter, for the year 2003.
• , N 42.7 • , 672 m above mean sea level), for December 2003 and for Rokas, Crete (grid point used: E 26.2 • , N 35.2 • , 480 m above mean sea level) for the whole year 2003 were available.At the same time observations of temperature at 2 m and wind speed at 55 m at Alaiz and at 40 m at Rokas were provided in ANEMOS project.Kalman filtering was applied to the wind speed at 55 m and 40 m as it is calculated from the model forecasts at the nearest model levels (weighted interpolation).Hereafter, Alaiz will refer to as Case I and Rokas as Case II.

Table 1 .
Biases of direct model output and Kalman filters -Case I.

TABLE 1 .
Biases of direct model output and Kalman filters -Case I

Table 2 .
Absolute Biases of direct model output and Kalman filters -Case I.

TABLE 2 .
Absolute Biases of direct model output and Kalman filters -Case I

.72 Wind Speed Forecast period model Kal 1 Kal 2 Kal 3 Kal 4 Kal 5 Kal 6 Kal 7 Kal 8 Kal 9 Kal 10
refers to Case I for December 2003 and Table 3 contains the results for Case II for the period March to May 2003 (this 3-month period

Table 3 .
Biases of direct model output and Kalman filters -Case II.

TABLE 3 .
Biases of direct model output and Kalman filters -Case II

Table 4 .
Absolute Biases of direct model output and Kalman filters -Case II.

TABLE 4 .
Absolute Biases of direct model output and Kalman filters -Case II