Stochastic Maximum Likelihood (SML) parametric estimation of overlapped Doppler echoes

This paper investigates the area of overlapped echo data processing. In such cases, classical methods, such as Fourier-like techniques or pulse pair methods, fail to esti- mate the first three spectral moments of the echoes because of their lack of resolution. A promising method, based on a modelization of the covariance matrix of the time series and on a Stochastic Maximum Likelihood (SML) estimation of the parameters of interest, has been recently introduced in literature. This method has been tested on simulations and on few spectra from actual data but no exhaustive investigation of the SML algorithm has been conducted on actual data: this paper fills this gap. The radar data came from the thunder- storm campaign that took place at the National Astronomy and Ionospheric Center (NAIC) in Arecibo, Puerto Rico, in 1998.


Introduction
Wind-profiler radars are an instrumentation of interest because they could provide clear air and hydrometeor information.Generally, VHF radars are more adapted to detect clear air echo, whereas UHF radars are more sensitive, when they are present, to hydrometeor (Rayleigh scattering) than to clear air (Bragg scattering).Nevertheless, both clear air and hydrometeor echoes may be present in the spectrum leading to very complex spectra with many numbers overlapped echoes.In order to take advantage of all the information included in the backscattered signal, new algorithms have to be tested.
In the case of complex spectra, periodogram-based techniques provide poor results, since the frequency resolution is not enough to separate the different contributions.As a Correspondence to: E. Boyer (boyer@satie.ens-cachan.fr) consequence, even with a very sophisticated algorithm devoted to peak identification in a spectrum, it may be difficult to separate overlapped echoes, especially when their amplitudes are very different.To overcome these difficulties, we therefore proposed in a recent paper (Boyer et al., 2001) the use of model-based spectral methods for the parametric estimation of different main frequency components present in the signal.In case of echoes with a small standard deviation, we obtained good results with the MUSIC algorithm (Bienvenue and Kopp, 1979;Schmidt, 1986), based on a parameterization of the autocorrelation function of time series.However, this method doesn't provide the two other spectral moments of the echoes (the zeroth moment, related to the turbulence intensity or hydrometeor characteristics and the second moment, related to the velocity dispersion) and is no more efficient in case of echoes with a large standard deviation.In order to solve this problem, we recently proposed (Boyer et al., 2003) the use of two methods based on a parametric modelization of the covariance matrix of the time series: the Stochastic Maximum Likelihood (SML) algorithm and the Weighted Pseudo-Subspace Fitting (WPSF) algorithm.We underscored in this paper the statistical superiority of the SML algorithm versus WPSF.Nevertheless, an exhaustive investigation of the SML algorithm on actual data has to be conducted in order to validate this method: this paper fills this gap.Furthermore, the validity of the results has to be tested.
The paper is organized as follows.In Sect.2, we briefly present the SML method and the underlying model.We also present some simulation results that show the great potential of the method.In Sect.3, implementation aspects of the method are discussed and the SML algorithm is exhaustively tested on actual data.The radar data came from the thunderstorm campaign that took place at the National Astronomy and Ionospheric Center (NAIC) in Arecibo, Puerto Rico, in 1998 (Petitdidier et al., 2000).The algorithm is successively applied on UHF and VHF time series, and then the results compared.We propose in Sect. 4 an interpretation of the results and conclude on the interest of the method in Sect. 5.

Signal model
The SML method is based on the following hypothesis: -We make the assumption that the N echoes contained in the power spectrum P S (f ) of the time series x(t) are Gaussian.We discussed the validity of this hypothesis in Boyer et al., 2003.We then have where σ 2 n is the power of an additive white Gaussian circular noise, β is a Gaussian random variable with unit standard deviation, and where P i , f i and σ i are the three unknown spectral moments of the ith Gaussian echo.
-Denoting x(k) the time series vectors of dimension m×1 obtained from independent recording data, x(k) satisfies where y is a Gaussian centred stochastic process, and where n is a temporally white complex Gaussian process independent of y with power σ 2 n .Consequently, x is a Gaussian stochastic vector with mean zero and covariance matrix Based on these two hypotheses, R x can be written: where where j 2 =−1, I is the identity matrix, T S the pulse repetition time of the radar, (.) * denotes the conjugate transpose and where µ is parameter of dimension 3N +1 to be estimated:

The SML algorithm
The Maximum Likelihood (ML) estimates μ of µ are calculated as the values of µ that maximize the likelihood function, i.e. selecting the set of parameters that makes the observed data most probable (Sharf, 1991;Haykin et al., 1993), or equivalently to minimize the negative log-likelihood function L (µ), with (Bang, 1971) and where the notations log (.), |.| and Tr(.) denote the natural logarithm, the matrix determinant and the trace operator.
Rx is the sample covariance matrix The SML algorithm is initialized at different values µ 0 p (1≤p≤p max ) of the parameter vector space and is optimized with a second-order steepest descent method (Bard, 1974) given by where ∇ and H are the gradient and the Hessian, whose expressions are given Eqs.( 15) and ( 16) in Boyer et al. (2003), and ε k <1, ε k being a step length.The diagram of the algorithm is represented in Fig. 1.At each iteration, L μk is computed.The calculation of L μk requires the inversion of matrix R x μk and H μk which is very time consuming.

Results of simulations
The performance study of the algorithm has been developed in Boyer et al., 2003.We especially showed in this paper the statistical superiority of the SML estimator versus WPSF.We also presented some estimations of spectra composed of one or two overlapped Gaussian echoes.In order to show the good behaviour of the SML estimator in cases of multiple overlapped echoes, Fig. 2 investigates the simulated case of 3 strongly overlapped Gaussian echoes (P 1 =10W , f 1 =−0.08,σ 1 =0.05,P 2 =14W , f 2 =0, σ 2 =0.01,P 3 =5W , f 3 =0.06,σ 3 =0.02).The reconstructed SML spectrum, which is in very good adequacy with the FFT spectrum, corresponds to the following estimates: P1 =11.9W , f1 =−0.057, σ1 =0.056, P2 =12.8W , f2 =−0.003, σ2 =0.008, P3 =2.8W , f3 =0.056 and σ3 =0.022.at UHF (430 MHz) and VHF (46.8 MHz) with a 2 µs pulse, the inter pulse period (IPP) being 1ms.The first 50 gates are consecutive and their sampling starts at 40 µs; the last ten gates correspond to the calibration; the sampling of this group starting at 900 µs.The in-phase and in-quadrature data are recorded without any coherent integration.The UHF and VHF radars have collinear beams.They sampled simultaneously quasi the same volume.The UHF radar operated in the near-field while the far-field of the VHF radar started at 12.5 km.The three-decibel beam radius was 0.085 • UHF and 0.95 • at VHF.We hoped that the simultaneous observation at 46.8 MHz and 430 MHz would offer some clues to the hydrometeor characteristics.

ˆˆˆˆ with min
Several other instruments, a disdrometer and an electric field mill, were deployed on the same site or located at another place on the island, as well as a Doppler weather radar, NexRad, and a radio sounding.
In order to test the method used to retrieve echo characteristics from the signal, the observations taken on 8 October 1998 were chosen for the following arguments.There were only seven thunderstorms passing over the Observatory with noticeable rainfall.Out of the 7 cases, during only one event, both UHF and VHF radars were working at the time when the thunderstorm was passing over them.It is also obviously the best day during the campaign.The drawback is that there were no data from the Doppler Weather radar.As a consequence, it is not possible to determine the cloud spatial characteristics.

Implementation of the method
In order to compare power spectra (UHF or VHF spectra, estimated with FFT or SML algorithms), it is necessary to take the same number of samples N spectrum =N ff t .N coh .N inc per spectrum for all cases.N ff t is the number of FFT samples, N coh and N inc are, respectively, the number of coherent and incoherent integrations.For UHF data, we chose N ff t =256, N coh =4 and N inc =16.For VHF data, we chose N ff t =256, N coh =32 and N inc =2.
For the SML algorithm, we successively tested on UHF and VHF data the two following assumptions: N =1 (spectrum supposed to contain one Gaussian echo) and N=2 (supposed to contain two Gaussian echoes).As explained before, the SML algorithm is initialized at different values of the parameter vector space.In order to minimize the time cost of the method, we minimized the number of initializations by a good choice of parameter values µ 0 p .
-We noticed that the initialization is not sensitive to the choice of the power of the noise or of the echoes.For that reason, the noise power σ 2 n is initialized with an estimate σ 2 n given, for example, by the segment method (Petitdidier et al., 1997).For both cases (N=1 or N=2), the total power of sources Ptot is estimated from the autocorrelation function.
-Frequencies f i are initialized at values in adequacy with the FFT spectrum.
We then have the two following initializations: For N =1, the parameter vector µ is initialized with the values µ= Ptot fmax σ 2 1 σ 2 n , where fmax corresponds to the frequency of the maximum of the averaged FFT power spectrum (low pass filtered power spectrum).σ 1 is initialized with the different following values [0.1 : 04 : 4 6 8] (m/s).
We also took the following values: k max =15, k max =15 and m=16 (R x of dimension m×m ).

Data
Among the observations, we select the time period corresponding to a thunderstorm passage.The day of 8 October was composed of several cells of rain with the principal activity occurring between approximately 16:20-17:15 AST.For UHF and VHF data, only the data recorded between 16:58 and 17:15 AST have been treated (91 profiles, a profile corresponding to a recording of 16.384 s): it corresponds to the most interesting part of the thunderstorm passage.During the campaign, the antenna was pointed vertically.
Bragg scattering from variations in the atmospheric index of refraction is the primary scattering that allows the Strato-Tropospheric radars to measure winds in the clear air.Nevertheless, such radars are also sensitive to Rayleigh scattering from hydrometeors.At UHF, while it is cloudy the main echo is due to the backscattering by the hydrometeors (Larsen and Röttger, 1987;Gossard, 1988).At VHF, the main echo is due to backscattering by turbulent inhomogeneities; nevertheless, hydrometeor echoes are observed in case of strong rain (Wakasugi et al., 1987).Another important contribution to the VHF signal is the lightning radiation that is not so important at UHF.These results corroborate many other experimental results (Pierce, 1977).
In order not to saturate the receiver, the attenuation used during lightning events may be important, as a consequence the VHF atmospheric signal shall be strongly attenuated and not detected because its amplitude becomes under the detection threshold.
Atmospheric contribution is deduced from the VHF data in the zone where the lightning contribution does not hide the clear-air contribution.As a matter of fact, the lightning radiation may bias the spectrum without providing more information, especially at high altitudes where the backscattered signal is weak.A data reduction method has been developed, which permits one to decrease the contribution of the lightning radiation.The signal of lightning radiation is characterized by a large amplitude and variance.Most of the time the signal is saturated while the actual atmospheric contribution is weak relatively to the lightning radiation contribution.The number of coherent integration, such as 32 at VHF, decreases the effect of the lightning contribution but too many spikes still exit and bias the spectrum.The method used consists of several steps.During the first step the zones affected by lightning radiation signal are located, the criteria being the amplitude and power variance values.Then the following treatment is applied.
Standard deviation and mean amplitude are computed over non-overlapping blocks of k terms of the real and inquadrature time series, with k being equal to 10 in this case.The threshold criteria, to determine if there is contamination by lightning radiation, are applied on the standard deviation and mean amplitude.The temporal extent of the polluted zone is determined by comparison of consecutive values above the threshold.Once the limit of each polluted zone is determined the mean amplitude variation in this zone is constrained to a limited range of values, in order to carry out the first decimation of the lightning contribution.A spline function is then used to interpolate data into the polluted zone.The last step consists of superimposing on the interpolated time series the additive noise taken from the nearest not "polluted" time series.The signal coming from the ionised channel backscattering is not withdrawn.The vertical velocity is therefore obtained with a minimum bias.Figure 3 presents the signal amplitude of the real part of a time series sample polluted by lightning before and after the treatment.
The SML algorithm was applied all over the data file on both UHF and corrected-VHF time series.We successively tested for both UHF and VHF data the two hypotheses N =1 echo and N =2 echoes, whose different results are presented in Figs. 4 to 6.We discuss in the following section the results obtained in each case.

Air motion and hydrometer separation
The observations analyzed concern a thunderstorm passing over the radars at UHF and VHF.If we assume N =1 echo, in cloudy conditions at UHF, the retrieved echo characteristics correspond to the backscattering by hydrometeors; at VHF, the echo characteristics correspond to backscattering by turbulent inhomogeneities; nevertheless, hydrometeor echoes are observed in case of strong rain.The estimated vertical velocities are presented in Fig. 4.
If we assume N =2 echoes, the two estimated echoes (in case of the convergence of the SML estimator) have in a second step to be correctly attributed to air turbulence or echo hydrometeors.
Two sets of plots are then given, one corresponding to the vertical velocities of hydrometeors and the second one to the air vertical velocities.
Several methods have been implemented to separate the air motion and hydrometeor contributions in vertical incident profiler observations (Williams et al., 2000;Ralph et al., 1996).They were applied mainly in stratiform conditions with rain below the melting layer, snow or ice crystals above, and clear air turbulence at upper altitudes.In these cases, velocity thresholds, altitude dependent, are used to separate air and hydrometeor motion.Better results are obtained with the cluster method.The separation between the two scattering processes is done by using two parameters, reflectivity and Doppler velocity.In the case of convection the reflectivity and variance are large due to the turbulence (Ralph et al., 1996).
The observations, presented and taken above the freezing level, concern a thunderstorm with large updraft and downdraft (cf.Fig. 4).The turbulence is important in such clouds since then the backscattering power and the spectral width attain large values.Chilson et al. (1993) reported similar observations at Arecibo.The authors made an attempt to characterize the hydrometeors from their fall velocity.They consider that there are graupels or super-cooled water or both.Meteorological considerations prioritize one kind with regard to the other.
The criteria used by Williams et al. (2000) or Ralph et al. (1996) cannot be used.The range of velocities is similar for air and hydrometeor motions and a large value of their reflectivity and spectral width are observed.Then the criteria proposed by Williams et al. (2000) or Ralph et al. (1996) failed due to the difference in meteorological conditions and then in echo characteristics.A simple criterion based on the vertical velocity is used to separate at UHF, as well as at VHF data, the wind echoes from the hydrometeor echoes; the hydrometeor velocity being always less than the air motion.

Cloud description
Figure 4 provides the variation of the vertical radial velocity retrieved from VHF and UHF signals as a function of time if one echo is assumed at each frequency.The range resolution is 300 m, the time resolution is 16.384 s.The positive values of the velocity correspond to upward motion.
At VHF a strong bent upward air motion is observed and after the profile 60 downward motions appear below 9 km, which is typical of thunderstorms.At UHF upward hydrometeor motion (air velocity + fall speed) is observed with lower values and the downward motion is more important due to the hydrometeor fall speed.As there were no measurements below 6 km the feeding of the cloud, its base and the rainy part cannot be described.
On Figs. 5 and 6, the results obtained for UHF and VHF radars are presented, assuming in the backscattering signal 2 echoes.Figures 5a and 6a represent the velocity attributed to the wind vertical velocity, the lower plots (Figs.5b and 6b) to the ones attributed to the hydrometeor velocity.First of all, on all the plots the cloud structure is fairly the same: large updraft with downdraft below the anvil.Nevertheless, many differences exist.In Figs.5a and b, the downward motions are larger in air and hydrometeor.In Fig. 6a, the updraft has higher values than in Figs.4a or 5a. Figure 6b presents many spurious values, i.e. very different from the neighbours.
At VHF assuming two echoes in the signal above 9 km the algorithm does not converge; it does converge if one echo is assumed.That explains the limitation in altitude of Figs.6a  and b.As noted in Sect.4.1, it is reasonable in the way that in most meteorological conditions at VHF the only contribution comes from the Bragg scattering by turbulent inhomogeneities; its efficiency being very large with regard to the efficiency of the Rayleigh scattering by hydrometeors.Below the freezing level, two echoes occur for heavy rain, above it two echoes occur where the contribution of snow, graupel, super cooled water or a mixture of them become very large.

Results analysis
From the air and hydrometeor vertical velocities the fall speed of the hydrometeor is deduced.The fall speed permits one to characterize the hydrometeor faced with.Then a good accuracy of the fall speed implies a good accuracy of the air and hydrometeor velocities.
This part is therefore an attempt to quantify and understand the differences observed on one hand among the three estimates of the air vertical velocities and on the other hand among the three estimates of the hydrometeor vertical velocities.
The first step is to identify the zones where the disagreement between two estimates of the same parameter is important.Figure 7a plots the difference between VHF and UHF air velocities and Fig. 7b plots the difference between UHF and VHF hydrometeor's velocities, assuming N =2 echoes.
The first remark is that there is a big difference (around 8 m/s) between the estimation of VHF and UHF air velocities for profiles 30-90.There is also a big difference (around −8 m/s) between the estimation of VHF and UHF hydrometeor's velocities for profiles 55-90.
For the estimation of hydrometeor's velocities at VHF, the reason is that after profile no.55, the hypothesis N =2 for VHF spectra is not always satisfied, as illustrated by Fig. 8 where different UHF and VHF spectra are presented together with the estimation of the echoes (the algorithm has not yet finished the selection between the air and hydrometeor echoes).On VHF spectra the presence of an additional VHF hydrometeor echo is not obvious.Except for profile no.71, where we can notice the presence of a symmetric echo for the VHF profile no.71, the same phenomena are observed on UHF spectra.Furthermore on UHF spectra, spurious contributions are also observed.Now let us consider the zone of the strong updraft.On Fig. 9 two echoes are clearly observed at VHF (the algorithm has not yet finished the selection between the air and hydrometeor echoes).Indeed, the presence of two echoes in that case is obvious and the SML estimation of the echoes, seems to be done correctly.At UHF the situation is more complex due to the presence of strong and wide atmospheric echoes, of symmetric attenuated echoes of the atmospheric echoes as observed on Fig. 8, and spurious symmetric echoes around 30-35 m/s.Since the SML estimation is retrieved from the time series, the bias observed on the UHF spectrum and due to the FFT computation does not exist.
Nevertheless, the contribution of the different frequencies is present in the signal.In the initialisation step two echoes are assumed to be present in the signal, and the frequency of the spurious echoes are excluded due to unrealistic vertical velocities.The algorithm provides a relatively good fit of the spectrum but cannot indicate that more echoes are present.In order to improve the estimations of both echoes the SML algorithm is initialized with a thinner grid of the parameter vector space.The SML algorithm converges to the same values.This case, several close wide echoes, points out one limit of the SML algorithm due to an a priori assumption about the number of echoes present in the signals.As a consequence in this updraft part, at VHF both echoes are well estimated and at UHF only the hydrometeor velocity is estimated due to the complexity of the signal.Even if the number of echoes is increased in order to take into account symmetric echoes with attenuation, we cannot assert that the air contribution is detectable.

Conclusion
In this paper, after a brief presentation of the SML algorithm, we applied this method to UHF and VHF data, in order to estimate both wind and hydrometeor contributions in each frequency data set.The case chosen to test the reliability of the algorithm is a severe one, a tropical thunderstorm.
We reached the following conclusions: -The method used to localize and retrieve, or at least decrease substantially the lightning contribution, is easy to implement and efficient.-The SML algorithm is efficient to identify 2 echoes in complex spectra as soon as the hypothesis of the number N of echoes is correct.Since N is unknown, we successively tested the two hypotheses, which allowed us to have a better understanding of whether the two echoes are present or not.
-When there are too many wide echoes that overlapped with two echoes or two symmetrical echoes, a relatively good fit of the spectrum is obtained but one of the echoes has no physical meaning.In order to eliminate it a posteriori, some criteria have to be applied.
-The estimation might be significantly upgraded by taking into account the presence of attenuated symmetric echoes.
-In the case studied in this paper, hydrometeor echoes were quite well approximated by a Gaussian shape due to turbulence.This is not always the case, so we intend to test the robustness of the algorithm in the case of non Gaussian echoes.
-However, despite the complexity of the present case, we provide a consistent description of the thunderstorm.The final air and hydrometeor maps will be obtained by applying criteria to eliminate the non physical echo and then selecting in each data set the most probable echo value.
As a conclusion, in order to improve the estimation of the UHF and VHF echoes, we should adapt the algorithm so that it is not dependent on the number of echoes present in spectra.This is an ongoing work.

3
Application on a case study 3.1 Introduction A campaign took place at NAIC in Arecibo (Puerto Rico) to study thunderstorms passing over the site from 15 September until 15 October 1998.The main instruments were the Strato-Tropospheric UHF and VHF radars.During this experiment, the measurements were carried out simultaneously Initialization of p max parameter vectors

Fig. 8 .
Fig. 8. SML estimation of air and hydrometeor echoes for a few profiles located in the strong downdraft area (the algorithm as not yet affected the type of the echoes).

Fig. 9 .
Fig. 9. SML estimation of air and hydrometeor echoes for a few profiles located in the strong updraft area (the algorithm as not yet affected the type of the echoes).