Electron radiation belt data assimilation with an ensemble Kalman filter relying on the Salammbô code

In this study we implement a data assimilation tool using a 3-D radiation belt model and an ensemble Kalman filter approach. High time and space reanalysis of the electron radiation belt fluxes is obtained over the time period 5 October to 25 October 1990 by combining sparse observations with the Salammb̂ o 3-D model in an optimal way. The convergence of the ensemble Kalman filter is analyzed carefully. The risk of using a biased physical model is discussed and relative consequences are highlighted. Finally, a validation against CRRES data and major improvements compared to pure physics based model are presented.


Introduction
The natural energetic electron environment in the Earth's radiation belts is of general importance as dynamic variations in this environment can impact space hardware and contribute significantly to background signals in a range of other instruments flying in that region.The most dramatic changes in the relativistic electron populations occur during enhanced periods of geomagnetic activity (Baker et al., 1982(Baker et al., , 1990(Baker et al., , 1994(Baker et al., , 1997)).As early as 1966, Williams related periodic increases in the trapped relativistic electron populations to increases in the solar wind kinetic energy density.But Reeves et al. (2003) have found that geomagnetic storms can either increase or decrease the fluxes of relativistic electrons in the radiation belts: half of all storms increased the flux of relativistic electrons, one quarter decreased the flux and one quarter produced little or no changes in the fluxes.Nevertheless, persistent two or three orders of magnitude flux in-tensifications of electrons with energy in the range of 0.001-10 MeV occur regularly.
These observations clearly demonstrate that the relative importance of all competing physical processes involved in the radiation belt dynamics changes from storm to storm and the net result on particle distribution might then be very different.Several mechanisms and models have been proposed for the electron radiation belts dynamics (see reviews from Friedel et al., 2002;Millan and Thorne, 2007;Bortnik and Thorne, 2007;Shprits et al., 2008a, b).Radial diffusion driven by ULF waves is one important mechanism for electron acceleration (Schulz and Lanzerotti, 1974;Brautigam and Albert, 2000;Miyoshi et al., 2003) but is balanced by localized electron acceleration due to resonant interactions with whistler mode chorus waves (Horne and Thorne, 1998;Summers et al., 1998;Meredith et al., 2002;Horne et al., 2005;Thorne et al., 2005;Varotsou et al., 2005Varotsou et al., , 2008;;Shprits et al., 2006aShprits et al., , 2009;;Li et al., 2007;Albert, 2007Albert, , 2008;;Xiao et al., 2010).Even if the key physical processes that govern the dynamics of radiation belts are identified today, their temporal evolution (dynamics) and balancing is not well understood.These are the main limitations of the current pure physics-based models of radiation belt dynamics.
The most common practice is to deduce empirical formulae of physical processes amplitudes versus one or more proxies like Kp, Dst, solar wind parameters from statistical studies (e.g.Brautigam and Albert, 2000;Meredith et al., 2003bMeredith et al., , 2007Meredith et al., , 2009)).On one hand, this introduces errors in the system, which becomes even more important for high magnetic activity conditions for which statistics are usually poor.On the other hand, past observational studies use data from single or limited multiple points in space and have only been able to rigorously distinguish between possible acceleration processes in one or two cases (e.g.Chen et al., Published by Copernicus Publications on behalf of the European Geosciences Union. 2007).Indeed, these studies are limited due to poor coverage of measurements (Friedel et al., 2000).To have a good representation of what really happens during magnetically active periods, several spacecrafts have to be at the "right location" at the "right time", which is rarely the case.In the recent years, to promote progress in this area, data assimilation tools have been developed.They allow combining sparse and inaccurate space borne observations with physicsbased dynamic models in an optimal way.Data assimilation techniques range from "simple" (like direct assimilation, e.g.Bourdarie et al., 2005;Maget et al., 2007) to extremely complex with prohibitive computation times (like Kalman filters, Kalman, 1960;Kalman and Bucy, 1961).They have already been proven as a valuable tool able to guide "the best" estimate of the state of a complex system in the fields of oceanography (Evensen, 1994), ecological systems (Eknes and Evensen, 2001), meteorology (Houtekamer, 1998), ionosphere (Schunk et al., 2004) and radiation belts (Kondrashov, 2007).Naehr and Toffoletto (2005) as well as Shprits et al. ( 2007) illustrated using synthetic data how a Kalman filter may be applied successfully to radiation belt forecasts.More recently, the applicability of data assimilation techniques to the radiation belts has been proven despite the uncertainties that are difficult to assess.A validation study by comparing reanalysis from two independent data sources (Ni et al., 2009a), as well as sensitivity studies according to the magnetic field model being used (Ni et al., 2009b) and to the assumed boundary conditions (Daae et al., 2011), have been achieved.In the radiation belt domain, data assimilation tools being developed so far rely on simple physical models.They are in most cases 1-D, i.e. according to L * , where only radial diffusion is considered (Naehr and Toffoletto, 2005;Koller et al., 2007;Kondrashov et al., 2007).In Koller et al. (2007) and Shprits et al. (2007), data assimilation with a 1-D physical model has been used to study radial profiles of phase space density.In the present paper, we describe a data assimilation tool based on the 3-D Salammbô code (Varotsou et al., 2005(Varotsou et al., , 2008) ) and an ensemble Kalman filter (Evensen, 1994).
In Sect.2, we describe the physical model used here to estimate the electron radiation belt variability.Section 3 presents in-situ data being assimilated and reference data used to validate the simulations.Section 4 describes the methodology for assimilating data.Data assimilation results and associated discussion are presented in Sect. 5. We summarize our findings and conclude in Sect.6.

Physical model
In any data assimilation tool, a physics-based model is necessary to provide a temporal and spatial estimate of the state of the system under study.Salammbô is a set of codes devoted to the understanding of radiation belt creation and dynamics which occur during magnetic activity periods (Beutier and Boscher, 1995;Bourdarie at al., 1996;Varotsou et al., 2005).
The classical Fokker-Planck diffusion equation for electron radiation belts in the 3-D phase space is presented in Eq. (1).It is written in terms of the particle energy (E), sine of the equatorial pitch angle of particles (y) and L, the Roederer parameter (Schulz and Lanzerotti, 1974;Shprits et al., 2008b), so this version is a real 2-D in space, for which the results are averaged on the longitude (local time) where T (y) is a function corresponding to the bounce frequency and can be found in Schulz and Lanzerotti (1974), a is the Jacobian from J 1 and J 2 (the first two adiabatic invariants) to E and y, and f the phase space density (PSD).
In the Salammbô code, the diffusion processes acting on particles are modelled by their respective diffusion coefficients (radial, energy, pitch angle and energy-pitch angle diffusion).The non diffusive energy losses are described by friction terms.The mixed terms are ignored in this preliminary study.On one hand, Albert and Young (2005) indicated that mixed terms are negligible at higher pitch-angles (which is the majority of radiation belt electrons), but on the other hand, Subbotin et al. (2010) showed that for the long term simulations mixed terms can significantly contribute to the radiation belt dynamics even for near equatorially mirroring pitch-angle particles.Most probably, the uncertainties in current wave models introduce bigger errors than the neglect of the mixed diffusion as suggested by Subbotin et al. (2011).
At low L * values (L * < 2), scattering by the upper atmosphere, described by Mass Spectrometer Incoherent Scatter (MSIS) 86 model (Hedin, 1987), is the predominant loss process.At higher L * values, in the plasmasphere, wave-particle interactions give rise to pitch angle diffusion that also leads to particle losses.The waves considered here are Hiss, VLF transmitters and whistlers from Abel and Thorne (1998).In the outer belt, two major processes are in competition.The first one is radial diffusion driven by magnetic and electric field fluctuations in the inner magnetosphere (Schulz and Lanzerotti, 1974).Radial diffusion transports particles across magnetic fields lines, changing their energy in the same time due to the conservation of the particle's relativistic magnetic moment (i.e. the first adiabatic invariant).Their energy increases when they are transported inward and decreases when they are transported outward (the direction of the diffusion depends on the PSD gradient).The second process consists of an "internal source" of relativistic electrons outside the plasmasphere.Summers et al. (1998) and Horne and Thorne (1998) identified whistler mode chorus as a possible agent for leading to substantial energy and pitch angle diffusion for relativistic electrons, even beyond geostationary orbit.During moderate magnetic activity, the local acceleration timescale becomes faster than the radial diffusion one and electrons can be energized.Chorus wave interactions with electrons modelled by the PADIE code (Glauert and Horne, 2005) are considered here, in the same form as in Varotsou et al. (2005).The diffusion coefficients were related to magnetic activity by constructing a statistical wave model where equatorial values (−15 • < λm < 15 • ) of f pe /f ce and wave intensity B 2 wave measured by CRRES were parameterized for Kp < 2, 2 ≤ Kp ≤ 3, 3 ≤ Kp ≤ 4, 4 ≤ Kp ≤ 5, 5 ≤ Kp ≤ 6 and Kp ≥ 6 between L = 1 to 6.6, with a resolution of 0.1 L and 1 h in MLT (Meredith et al., 2003b).The coefficient values from the matrix given by the PADIE code were interpolated to energy, pitch angle and L values corresponding to the Salammbô grid and to f pe /f ce values corresponding to the ones given from the statistical wave model (CRRES data).For a given energy, L, pitch angle and Kp, the diffusion coefficients were calculated in each MLT bin according to f pe /f ce and B 2 wave .Finally, for introduction in the Salammbô code, we calculated the coefficients' drift average by summing values over all MLT and dividing by the number of MLT bins.Since electron-chorus interactions are most efficient for low f pe /f ce and high wave intensities (Meredith et al., 2003b), they were only included in the model outside the plasmasphere.Note that the absence of chorus wave at high latitude in our model may lead the Salammbô code to overestimate fluxes for high latitude mirroring electrons.Future studies will mostly concentre on improving wave models (wave amplitude and propagation angle) to reduce uncertainties on associated diffusion coefficients.
To reconstruct the radiation belt dynamics, the planetary disturbance index Kp is used to: -parameterize the intensity of radial diffusion according to Brautigam and Albert (2000) D m LL = 10 0.506Kp−9.325L 10 (day −1 ); (2) -parameterize chorus wave intensity according to the statistical study performed by Meredith et al. (2003b), the same CRRES wave amplitude database was binned according to 6 Kp index classes instead of Ae index and provided by N. Meredith; -and also locate the plasmapause using the parameterization of Carpenter and Anderson (1992) where Kp max is the maximum Kp value over the preceding 24 h.
The numerical scheme used to solve Eq. ( 1) is explicit finite difference.For stability condition the Courant-Friedrichs-Lewy (CFL) condition has been implemented in the same way as in Albert et al. (2009) (see Eq. 9).A limitation of 0.1 has been used instead of 0.5, which aims at improving the stability of the code since the CFL condition is only a necessary condition.The upper bound of the time step from the CFL condition has been computed considering all diffusion coefficients part of the Salammbô code (some of which are static, others depend on Kp and others depends on Kp and on the maximum of Kp over the last 24 h -via the plasmapause location) over the time period 1990 to 2005.Therefore from time to time a lot of combinations may exist.Figure 1 presents the statistical distribution of this upper limit then obtained over more than a solar cycle.It shows that the upper time step is always greater than 1 s.To corroborate those results, three runs are performed with 3 different time steps, 0.1 s, 1 s and 10 s, during the 10 October 1990 storm.To measure the performance of the code, the 1s run is considered as the reference one.Then the mean ratio over the full simulation grid of the PSD deduced to the "reference" PSD is computed.Figure 2 presents the evolution of this ratio for a 10 s time step simulation in blue and 0.1 s time step simulation in red.While there is no difference between a simulation performed with a 1 s or a 0.1 s time step, the deviation between the 10 s time step and 1s time step keeps on increasing.In other words, the convergence is guaranteed for a 1 s time step but is not for the 10 s one.From this analysis, the time step is set to 1 s in the Salammbô code in order to ensure the stability of the explicit finite difference scheme.

The data set
The way in situ data are pre-processed is described by Friedel et al. (2005).To obtain data that is as reliable as possible, contamination, saturation, and background issues are treated and removed from the data.GOES proton fluxes are used as references to detect when data are contaminated by other particle species (i.e.ions).Then, using CRRES measurements as the "gold" standard, since it was the last scientific mission dedicated to the analysis of the Earth's environment, the different detectors and satellites are cross-calibrated.Even if the main artefact/bad data are filtered through pre-processing, the data set must be intercalibrated to insure consistency, especially for reanalysis purpose (see Friedel et al., 2005).
In the present study we concentrate on the magnetic storm of 10 October 1990 and the simulation covers the time range from 5 October to 25 October 1990.This storm is a reference one, because (1) it occurs during CRRES satellite life, (2) Brautigam and Albert (2000) found clear evidences that the PSD peaks inside geosynchronous orbit (i.e.indicating "local acceleration" by chorus wave-particle interaction), and (3) it has been intensively studied (see for example Brautigam and Albert, 2000;Meredith et al., 2003a;Iles et al., 2006).
The electron flux measurements from the LANL-GPS NS18/BDD2 and the LANL-GEO 1989-046/SOPA are assimilated while flux measurements from CRRES/MEA and CRRES/HEEF are kept for results validation.Figure 3 and only the omnidirectional differential channels deduced from CRRES original measurements are used in the following study.

Data assimilation using an ensemble Kalman filter
Data assimilation scheme provides a framework which allows for optimal combination of model prediction and sparse data accounting for model and data errors.One of its major advantages is the estimate of the uncertainties of the nowcast at each time step.Two approaches for filtered data assimilation are available (see Bertino, 2001;Evensen, 2007, and references therein for more details): -In the sequential methods, new observations are sequentially assimilated into the model when they become available.The assimilation process is derived from the theory of statistical estimation: the error statistics are used to calculate a variance minimizing estimate whenever measurements are available (Daley, 1991).This is typically what is performed in the Kalman filter (Kalman, 1960) and all its children.
-The variational inverse methods, contrary to the sequential methods which update the model solution every time observations are available, seek an estimate in space and time where the estimate at a particular time is dependent on both past and future measurements (Evensen, 2007).These methods are derived from the theory of optimal control where one seeks the estimate by minimizing the error terms in the form of a weak constraint penalty function.The 4D-VAR method (Le Dimet and Talagrand, 1986) is the most popular formulation of these methods and is used in meteorological operational prediction centers.Similarities exist between the analysis step in the variational and sequential methods.The 3D-VAR method (Le Dimet and Talagrand, 1986) was developed prior to 4D-VAR and does not account for the temporal origin of observations.In this case, the analysis step becomes equivalent to the Kalman filter's one, both corresponding to the best linear unbiased estimator (Lorenc, 1986).
In our work, our choice has been to implement a sequential data assimilation scheme based on the Kalman filter (Kalman, 1960) because it better fits Salammbô's framework.Such an approach can be decomposed into two main phases synthesized on Fig. 4: -Forecast phase is a temporal propagation of the system performed with the physical model accounting for model uncertainties.
-Analysis phase is where data assimilation method combines numerical model predictions and sparse data available at the time of the analysis phase, in a way which minimizes mean-squared errors.The update is weighted, global to the system and optimal.
The data assimilation scheme adopted assumes that data as well as model are unbiased.Such schemes are the so-called bias-blind schemes (Dee, 2005).An overview of the Kalman filter formalism applied to radiation belts can be found in Shprits et al. (2007) and detailed descriptions in references therein.In our case, the state vector is a vector of PSD at various L * , energy and sine of equatorial pitch-angle of electron.Its size, n, is 25 3 elements.Such a Kalman filter is known to be very efficient for small linear systems.Unfortunately when the system size increases, the propagation in time of the error covariance matrix (n × n) quickly becomes expensive to compute.In 1994, Evensen (1994) introduced the idea to approximate a distribution by a Monte-Carlo sampling.Of course, the larger the number of sampling, the better the model distribution will be represented.The error of the distribution sampling decreases according to 1/ √ m where m is the number of samples.The corresponding data assimilation scheme is known as an Ensemble Kalman Filter (EnKF in the following).A schematic view of the equivalence between the Kalman filter and the EnKF is given in Fig. 5.
In the following, bold symbols refer to a matrix and "ˆ" indicates it is an approximate.

Initialisation
Instead of propagating in time the initial state vector, X 0 , and the corresponding initial error covariance matrix, P 0 , Evensen (1994) states that it is equivalent to propagate an ensemble of state vectors sampled according to an initial multinormal distribution N (X 0 , P 0 ).Assuming an initial state vector with dimension n and considering m the number of members in the ensemble (i.e. the number of samples), we define the initial ensemble matrix A 0 with dimension n × m: In the Kalman Filter In the Ensemble Kalman Filter (EnKF) where the X i 0 (i ∈ [1, m]) are the m initial state vectors representing the initial distribution of the PSD and its dispersion.The average initial state vector is then defined by and the initial error covariance matrix (here namely the ensemble covariance matrix) by Note that if m tends to infinity, then of course X 0 ensemble = X 0 and P0 = P 0 .

Prediction phase
As shown in Fig. 6, each member is propagated forward in time (the forecast) with the physics-based model (i.e.Salammbô 3-D) according to: where M is the matrix of the numerical model, superscript "f" refers to forecast and subscript "k" indicates the time step.Suppose the true state of the radiation belt is known at time step k − 1, then because the model is not perfect the forecast at time step k is no longer the true state but only the best estimate we can have.Then the true state at time step k can be expressed as: where ε M k−1 should be the true model error vector, "t" superscript refers to true state and "M" superscript refers to model.However, it can be only approximated by the known a priori model error and can be represented by a model covariance matrix, ) .In the EnKF, model uncertainties can be sampled with a Monte-Carlo run within m samples (the number of members) according to a multinormal law N(0, Q k ).Thus the model covariance matrix is never computed but is implicitly given by the members' dispersion in the ensemble.It is however possible to approximate the error covariance matrix of the forecast at time step k, P f k , according to Eq. ( 7).

Analysis phase
Let us denote O k , the vector of q observations available at time k and their associated observational error vector ε o k where superscript "o" refers to observations.The observational error can be represented by an observational error covariance matrix: In the EnKF, the observational errors can be sampled with a Monte-Carlo run within m samples (the number of members) according to a multinormal law N (O k , R k ).A perturbation matrix with dimension q × m is then obtained ).The observational error covariance matrix can be then approximated by: Note that if m tends to infinity, then of course Rk = R k .
The standard Kalman gain matrix can then be approximated by: where the observational matrix H k maps the true space on to the observed space.
From the approximated gain, the analysis step in the EnKF consists of the following updates performed on each of the model state ensemble members: where "a" superscript refers to analysis.Indeed, the analysis phase statistically determines a new set of state vectors where the dispersion is reduced as schematically represented in Fig. 6.
In practice, it is not necessary to store in full the matrices shown above when implementing the EnKF.The main difference with the Kalman filter can be summarized by the propagation of a n×m matrix instead of a n×n.The runtime is thus much faster for very large systems.For a detailed description of the EnKF algorithm see, for example, Evensen (2004).

Limitation of bias-blind data assimilation tools
One of the major hypotheses of traditional data assimilation tools assumes that model and data are not biased; these are the bias-blind data assimilation schemes (Dee, 2005).From the model point of view, this can be difficult to ensure.Under this assumption, one must make sure that all physical processes playing a major role on trapped particles are included.This suggests that a too simple model like a 1-D model in which only radial diffusion is accounted for cannot be considered as a non-biased model to reproduced trapped electron dynamics.However, even a more sophisticated model such as a 3-D model does not guarantee that it is not biased.To effectively remove those discrepancies during the data assimilation process, one requires bias-aware assimilation methods which incorporate specific assumptions about the source and nature of (some of) the biases in the system, and are specifically designed to estimate and correct those biases (Dee and Todling, 2000;Evensen, 2003;Dee, 2005).
In a Kalman filter or an EnKF, each cell of the domain is linked to all others via the covariance matrix which itself relies on the accuracy of the physical modelling.Thus, during the analysis phase the update is propagated throughout the entire domain via the covariance matrix.In Sect.5.2, one of the identical-twin experiments is performed with a biased model and consequences of the use of a bias blind assimilation scheme are then discussed on the overall estimates of the EnKF.
In our particular case, a 3-D radiation belt model, the Salammbô code, will be used for the prediction phase.The model is now mature enough (see Varotsou et al., 2005Varotsou et al., , 2008) ) to assume that the all set of dominant physical processes are accounted for.Nevertheless, the chorus wave-particle interaction coefficients from Glauert and Horne (2005) are limited to electron energies less than 3 MeV and for L * range less than 6.6.As a result the model is biased for energies greater than 3 MeV at all L * and at all energies for L * values greater than 6.6.Another source of bias comes from the absence of chorus wave in our model for magnetic latitude below −15 • and above 15 • .As a result the model is biased for electron with low pitch-angle.Because of the 3-D nature of the model, it is assumed that PSD are symmetric in local time.This is almost true for relativistic electrons but this hypothesis is broken at low energy and preferentially at large L shells.To find out more accurately the energy limit below which this hypothesis is no longer true, a statistical study combining all LANL-GEO/MPA and LANL-GEO/SOPA data over more than two solar cycles has been performed in a similar way as Korth et al. (1999).Maps in an energy versus local time for two magnetic activity index, Kp = 1 and Kp = 6, are provided in Figs.7 and 8.It is clear that for low or high magnetic activity, the particle distributions are local time dependent for energies below 50 keV and peaks in the midnight-dawn sector due to storm-substorm particle injections.At Kp = 6, this asymmetry is still pronounced at 100 keV and is very weak for energies above 300 keV.Note that the local time dependence seen at 300 keV and above is due to day-night asymmetry of the magnetic field and cannot be attributed to night side particle injections.From this statistical study, it turns   out that the 3-D physical model, Salammbô 3-D, is biased for energies below 300 keV.
Finally, in the current study a low resolution grid (25 × 25×25) is used.As a result the steep gradients of the PSD observed close to the loss cone cannot be accurately predicted by the model.To speed up the resolution scheme the interpolation had to be set to linear in log(f ) − log(E) space to apply radial diffusion at constant M, K in the E, y space.

Settings for the EnKF
In the standard formulation of the Kalman filter, the model and observational error covariances matrices Q and R are assumed to be known.This rarely happens in practice and usually some simple approximations are made.In the current study, it is assumed that: -The error distribution of logarithm of observations is Gaussian with a standard deviation of 0.3.Friedel et al. (2005) found that comparing different satellite measurements of the same parameter against one another indicates discrepancies on the order of 2 in most cases.This value corresponds to the 1 sigma of the data error distribution adopted in the current study.
-The error distribution of logarithm of the initial state is Gaussian with a standard deviation of 0.3.The initial state is extracted from the run described in Maget et al. (2007) on 5 October 1990 at 00:00 UT where direct data insertion was performed.It is then natural to assume the same error distribution for the initial state than in the data being used (see above).Bourdarie et al. (2005) found that performance of direct data insertion in the Salammbô code was within a factor 2-3 Table 1.Time average of the root mean square deviation of the logarithm of phase space densities averaged over the entire simulation domain (E, α eq , L * ) versus the size of the ensemble.
Ensemble size 50 80 100 200 Mean RMS 0.65 0.12 0.12 0.115 when comparing to test data sets.Again, a factor 2 corresponds to the 1 sigma of the initial state error distribution adopted in the current study.
-Because in the Salammbô code the time dependent physical processes (radial diffusion and chorus wave particle interactions) are driven by the magnetic activity index Kp, the model errors can be modelled in a first approximation by assuming a Gaussian distribution of Kp with an average equal to the measured Kp and with a standard deviation of 0.5.Maget et al. (2007) showed that introducing mathematical error to radial diffusion coefficients and chorus wave particle diffusion coefficients separately was equivalent as a first approximation to add white noise to Kp index.
-the boundary condition at L * = L max is kept free.At start time it is set to a kappa distribution (Christon et al., 1991) given by the formula: where we take A = 10 35 MeV −3 s −3 , defined by examining a long period of LANL geosynchronous measurements, E 0 = 2 keV (plasma sheet characteristic energy), defined by average LANL geosynchronous MPA (Magnetospheric Plasma Analyzer) data (Joseph Borovsky, private communication, 2007) and k = 5, on the basis of the work of Christon et al. (1988Christon et al. ( , 1991)).Then the boundary condition is updated by the EnKF during the analysis phase.Between two updates the boundary condition is kept constant.
In the analysis phase, predicted fluxes corresponding to each available measurement are first computed from PSD provided by each member of the ensemble.Then, the innovation is calculated according to the logarithm of predicted and measured fluxes.

Convergence of the ensemble Kalman filter
The convergence of the ensemble Kalman filter is theoretically guaranteed if the number of members in the ensemble is very large.In the literature one finds regularly that the filter converges from 100 members (Evensen, 2004).To verify if this generally adopted rule can be applied to radiation belts science, the same simulation (see Sect. 3) has been performed three times while the number of members has been changed from 50 to 200.From 5 to 25 October 1990, GPS-NS18/BDD2 and LANL-GEO-1989-046/SOPA data are used to compute the innovation at each analysis phase, which is performed every two minutes of the simulation.The root mean square deviation (RMS in the following) of the ensemble (i.e. the standard deviation of the logarithm of the phase space densities of the ensemble members) has been averaged over the entire domain (E, α eq , L * ) versus time for each run and is plotted in Fig. 9.This RMS is a good measure of the robustness and convergence of the EnKF (Hamill et al., 2001).At first glance the EnKF seems to behave in a similar way if it is composed of 80, 100 or 200 members.
By cons, the one with 50 members differs from other simulations as soon as the magnetic storm develops (when Kp index increases up to 6).Clearly, after the magnetic storm, the EnKF with 50 members does not converge adequately.To quantify the performance of the filter, the time-average over the duration of the simulation of the RMS, defined above, is computed (Table 1).It is clear this time that the overall performance of the filter is significantly improved when the number of members is greater than or equal to about 80.
In the following the number of members is set to 200.

Observing system assessment with identical-twin experiments
One of the most widely used and reliable methods of assessing a data assimilation scheme is that of the twin experiments.The identical-twin experiments consist in a numerical procedure where synthetic data can be generated by the model to which data assimilation is applied, subject to a specified stochastic forcing term.The data with assigned errors are then evaluated for their effectiveness in obtaining optimal state estimates.The convergence of the unassimilated model fields from the second run towards those of the first run ("true" state) can then be measured.Two scenarios are implemented for a time period covering the 10 October 1990 storm: (1) synthetic integral omnidirectional fluxes every two minutes along GPS ns18 orbit and differential omnidirectional fluxes every 4 min along LANL-GEO-1989-046 orbit are deduced from the "true" state; and (2) synthetic integral omnidirectional fluxes every two minutes along GPS ns18 orbit, differential omnidirectional fluxes every 4 min along LANL-GEO-1989-046 orbit and differential omnidirectional fluxes every 2 min along CRRES orbit are deduced from the "true" state.In both cases, synthetic data channels are chosen to be identical to the ones of the corresponding instrument.The settings of the EnKF are those described in Sect.4.5.
To measure the efficiency of the data assimilation scheme adopted, the mean ratio over the full simulation grid of the mean predicted PSD by the ensemble to the "true" PSD is computed every 2 min (Fig. 10).First of all, in both cases the data assimilation scheme converge and the mean prediction by the ensemble is closed to the "true" state; in scenario (1) the deviation in within 16 % before the storm onset and on the order of 5 % after, while in scenario (2) deviation is within 12 % before the storm and on the order of 1 % after.
As expected, when more information is provided (i.e. when synthetic data along CRRES orbit are added), the results are even better.
Next, an assimilation experiment where a part of the domain suffers from a bias is performed.Synthetic integral omnidirectional fluxes every two minutes along GPS ns18 orbit, differential omnidirectional fluxes every 4 min along LANL-GEO-1989-046 orbit and differential omnidirectional fluxes every 2 min along CRRES orbit are deduced from the "true" state.The model then used for assimilation is biased, i.e. the wave-particle diffusion coefficients due to chorus waves are divided by a factor 10 for L * > 5.5.The mean ratio over the full simulation grid of the mean predicted PSD by the ensemble to the "true" PSD is computed every 2 min (Fig. 11).The results are clear: large deviations are recorded.The first two days the deviations are very large and then diminish down to 145 % with occasional increases up to 345 %.Clearly, bias in a part of the domain being simulated may affect the overall performance of the data assimilation tool.
The identical-twin configuration shows how, in an idealistic framework, the ensemble Kalman filter assimilation performs for the particular model and data being considered.Moreover, twin-experiments can performed to consider model bias, model errors, missing physics, etc.In the real world, of course, no models are perfect, and the results of identical-twin experiments, in the case of bias blind assimilation scheme, necessarily err on the optimistic side (Daley, 1991, pp. 340).In this case, a more objective comparison, but complementary to twin experiments, is then proposed in the following section.

Reanalysis for the 10 October 1990 storm
Because data assimilation is limited to the outer radiation belts (along GPS and GEO orbits), the simulation results (the reanalysis) will be analysed for L * > 4. GPS data are being assimilated every 2 min but only when the spacecraft can measure electrons with equatorial pitch-angle greater than 35 • .This last condition prevents ingestion of data where the model is biased due to the low resolution grid being used so far.LANL-GEO data are being assimilated every 4 min.Note that a lower cadence has been set for GEO data because the L * value where the measurements are available does not change much from time to time.
Since the updates are propagated via the implicit covariance matrix of the ensemble to the entire domain, there is no guarantee to retrieve the assimilated data out of the reanalysis (thought there is good chance anyway).A preliminary validation of the EnKF performance consists in a basic check in which the average predicted data are compared with the assimilated ones.Figure 12 provides such a comparison along LANL-GEO-1989-046 versus time for two channels, 0.315-0.5MeV and 1.1-1.5 MeV.First of all, it can be seen that the pure Salammbô code always overestimates the observations.The deviation can be as high as a decade or more for 0.315-0.5MeV channel.The mean predictions over all members of the ensemble match pretty well the observations: in the 0.315-0.5 channel 85 % of predictions are within a factor 2, 71.5 % are within a factor 1.5 and 48 % are within a factor 1.25; while in the 1.1-1.5 channel 97.5 % of mean predictions are within a factor of 2, 86.8 % are within a factor 1.5 and 63.5 % are within factor 1.25.The gain of using data assimilation tool accounting for model and data errors is clear: since in data assimilation, modelling uncertainties are accounted for, the EnKF is able to find an optimized "path" within the dispersion of diffusion process intensities (from modelling uncertainties by Monte-Carlo sampling) to get the best estimate.The pure Salammbô code is much more restrictive and does not have such a freedom.
To fully validate the EnKF performances, an independent data set is used.Reanalysis results are then compared to CR-RES/MEA and HEEF measurements for L * > 4. Error distribution (log (observations/mean predictions)), the errors (observations/mean predictions) versus observation values and errors (observations/mean predictions) versus L * are plotted on Fig. 13 for two electron energies 0.33 MeV and 1.58 MeV.Results obtained from 5 October to 25 October are melted in these plots.The top panels show evidence that the pure Salammbô code systematically overestimates fluxes by about a factor of 10 or more.The corresponding distribution is centered to about −1 in decimal log, whereas the one obtained with mean prediction from the ensemble Kalman filter is centered to about 0.Moreover, the distribution of errors in decimal log is larger when predictions from a pure Salammbô code are taken.This shows clearly the added value of the filtered data assimilation tool.On the middle panels of Fig. 13, one can see that the deviation between observations and predictions by the ensemble Kalman filter increases when measured fluxes decrease.This is obvious for 1.58 MeV electrons.Finally, it turns out that this deviation is large (and can be very large) for L shells greater than 6 (see bottom pan-  els of Fig. 13).Again for 1.58 MeV electrons it is obvious.This last result indicates a systematic bias of the model beyond L * = 6.This bias is known: there are no chorus waveparticles interactions included in the model beyond L * = 6.Thus, such results were expected.For L * below 6, i.e. a domain where chorus wave are included in the model, some predictions (a small fraction) can deviate by a factor 10 to CRRES observations.Although the chorus waves are taken into account, bias are suspected due to non inclusion of chorus wave at high latitudes, which is to say particle scattering to the atmosphere due to wave-particle interactions are underestimated in the model as suggested by Shprits et al. (2006b).It also highlights the importance of using a nonbiased model to obtain very good performances of the data assimilation tool at any grid cells.

Conclusions
A first data assimilation tool based on a 3-D radiation belt model, the Salammbô code, has been set up.The EnKF overcomes one major problem associated with the traditional Kalman filter: in a Kalman filter an error covariance matrix for the model state needs to be stored and propagated in time, making the method computationally infeasible for models with high-dimensional state vectors.
In this study we show how data assimilation using an EnKF may be applied to perform high fidelity electron radiation belt reanalysis from a sparse data set (3 channels along GPS and 6 channels along GEO orbits only were ingested).We also point out the danger of using a biased model in such applications while using a bias-blind assimilation scheme.
While simple approximations were made in this study to set data and model errors, a detailed assessment of each physical processes uncertainties, namely diffusion due to waveparticle interactions and radial diffusion, must be performed.Wave particle interactions must be known at all grid cells to reduce or even better eliminate bias reported in this study, and wave intensity distributions have to be assessed based on available wave intensity measurements in the inner magnetosphere.Radial diffusion model is also far from perfect and must be improved in the future.From the data point of view, realistic error distributions which may be different from instrument to instrument have to be assessed.Nevertheless, this study highlights the capabilities of such a data assimilation framework applied to radiation belts science.The reanalysis of data from multiple spacecrafts results in a significant reduction of the error compared to the ones obtained with a pure physical model.
Topical Editor R. Nakamura thanks two anonymous referees for their help in evaluating this paper.

Fig. 1 .
Fig. 1.Distribution of the optimal time step that can be used in the Salammbô code.The statistic has been performed over more than one solar cycle, from 1990 to 2005.

Fig. 2 .
Fig. 2. Mean ratio over the full simulation grid of the PSD deduced at 0.1 or 10 s time step to the "reference" PSD performed at 1s time step.
indicates the three spacecraft orbits.The dark blue envelope indicates the spatial outer boundary of the simulation domain and the light blue indicates the magnetic equator in an

Fig. 3 .
Fig. 3. Orbits along with data sets are considered for the simulation as of 5 October 1990.Data assimilated are taken along LANL GPS NS18 (red) and LANL GEO 1989 046 (yellow).Test data set is taken along CRRES spacecraft (green).

Fig. 4 .
Fig. 4. A schematic view of a sequential data assimilation approach where IPODE is the Ionising Particle ONERA DatabasE and Salammbô 3-D the radiation belt physical model.

Fig. 5 .Fig. 6 .
Fig. 5.A schematic view of the equivalence of the Kalman filter and the ensemble Kalman filter.

Fig. 7 .
Fig. 7. Average electron flux at geostationary orbit in term of energy and local time map for two magnetic activity index (left, Kp = 1 and right, Kp = 6).

Fig. 9 .Fig. 10 .
Fig. 9. Root mean square deviation of the logarithm of phase space densities averaged over the entire simulation domain (E, α eq , L * ) for various size of ensemble (top panel), and magnetic activity index Kp (bottom panel).

Fig. 11 .
Fig. 11.Mean ratio over the full simulation grid of the mean predicted PSD by the ensemble to the "true" PSD versus time in an identical-twin configuration.Comparison when a non biased or biased assimilating model is used.

Fig. 12 .
Fig. 12.Comparison of mean prediction of the ensemble Kalman filter where LANL-GEO and GPS ns18 data are assimilated, with measurements along GEO orbit, for 0.315-0.5MeV (top panel) and 1.1-1.5 MeV (middle panel) electrons.Black curve is the observation, green curve is the mean ensemble Kalman filter prediction and red curve is the pure Salammbô prediction.The magnetic activity index Kp is plotted in bottom panel.

[Fig. 13 .
Fig. 13.Error distribution (log (observations/mean predictions)) (top panels), the errors (observations/mean predictions) versus observation values (middle panels) and errors (observations/mean predictions) versus L * (bottom panels), with on the left column electrons with 0.33 MeV energy and on the right column electrons with 1.58 MeV energy along CRRES orbit.