Raindrop size distribution variability estimated using ensemble statistics

Before radar estimates of the raindrop size distribution (DSD) can be assimilated into numerical weather prediction models, the DSD estimate must also include an uncertainty estimate. Ensemble statistics are based on using the same observations as inputs into several different models with the spread in the outputs providing an uncertainty estimate. In this study, Doppler velocity spectra from collocated vertically pointing profiling radars operating at 50 and 920 MHz were the input data for 42 different DSD retrieval models. The DSD retrieval models were perturbations of seven different DSD models (including exponential and gamma functions), two different inverse modeling methodologies (convolution or deconvolution), and three different cost functions (two spectral and one moment cost functions). Two rain events near Darwin, Australia, were analyzed in this study producing 26 725 independent ensembles of massweighted mean raindrop diameter Dm and rain rateR. The mean and the standard deviation (indicated by the symbols 〈x〉 andσ {x}) of Dm andR were estimated for each ensemble. For small ranges of 〈Dm〉 or 〈R〉, histograms ofσ {Dm} and σ {R} were found to be asymmetric, which prevented Gaussian statistics from being used to describe the uncertainties. Therefore, 10, 50, and 90 percentiles of σ {Dm} and σ {R} were used to describe the uncertainties for small intervals of 〈Dm〉 or 〈R〉. The smallest Dm uncertainty occurred for 〈Dm〉 between 0.8 and 1.8 mm with the 90th and 50th percentiles being less than 0.15 and 0.11 mm, which correspond to relative errors of less than 20% and 15%, respectively. The uncertainty increased for smaller and larger 〈Dm〉 values. The uncertainty of R increased with〈R〉. While the 90th percentile uncertainty approached 0.6 mm h −1 for a 2 mm h−1 rain rate (30% relative error), the median uncertainty was less than 0.15 mm h −1 at the same rain rate (less Correspondence to: C. R. Williams (christopher.williams@colorado.edu) than 8% relative error). This study addresses retrieval error and does not attempt to quantify absolute or representativeness errors.


Introduction
The assimilation of radar precipitation estimates into numerical weather prediction models is a very difficult task because the numerical models require both the precipitation estimate as well as the uncertainty of that estimate in order to blend the observations with the model.Quantifying the precipitation uncertainty from radar observations is also difficult because the uncertainty results from four types of errors: measurement, model, representativeness, and sampling (Bringi and Chandrasekar, 2001).Measurement errors are due to the precision of the instrument.Model errors result from representing observations with idealized mathematical expressions.Representativeness errors are due to time evolving changes and spatial inhomogeneity of precipitation within the sample volume during the observation dwell time.Sampling errors result from changes in precipitation between successive observations.
This study focuses on quantifying the model errors of precipitation estimates retrieved from vertically pointing profiling radars by using the concept of ensemble statistics.The underlying principle of ensemble statistics is that the same radar observations are the inputs into multiple models and the range of output solutions determines the uncertainty of the model estimate.When different observations are used as inputs to the same models (for example, observations from two different profiling radars), differences in the model estimates will be due in part to the representativeness of the Published by Copernicus Publications on behalf of the European Geosciences Union.input observations.By using the same inputs in every model, differences in the output are due to assumptions about the model precipitation physics and due to the numerical code used in the retrieval process.
Vertically pointing profiling radars have been used for over 20 years to estimate the number and size of raindrops falling directly overhead (Wakasugi et al., 1986).Three major modeling factors determine how the radar observations are converted into precipitation estimates.The first major factor is the mathematical functional shape of the raindrop size distribution (DSD).Typically, there are more small raindrops within a given volume than large raindrops, which leads to the assumption that the shape of the DSD follows an exponential (Waldvogel, 1974) or a gamma function (Ulbrich, 1983).By assuming a particular shape of the DSD, errors are added to the retrieved DSD because the unknown true distribution of raindrops may not follow the assumed shape.This study uses seven different DSD shape models previously discussed in the literature (Waldvogel 1974;Ulbrich, 1983;Marshall and Palmer, 1948;Illingworth and Blackman, 2002;Zhang et al., 2003;Feingold and Levin, 1986).
The second major factor that contributes to the model error of precipitation estimates retrieved from vertically pointing profiling radars is the numerical inverse methodology that converts the radar observations into raindrop size distribution estimates.If the DSD were known a priori, then the radar observations can be uniquely determined using radar scattering theory.This forward modeling maps the DSD into the radar domain.However, converting radar observations into the DSD domain is an inverse modeling problem and there is not a unique mapping from a given radar observation into a unique DSD.This study uses both the convolution and deconvolution numerical inverse modeling methodologies to estimate the DSD given a set of radar observations (Schafer et al., 2002;Lucas et al., 2004).
The third major factor contributing to model error is the cost function that objectively determines the "best" solution when comparing the model with the observed radar observation.The most commonly used cost function involves the sum of the squared difference between the model and observation.This cost function is also related to the chi-squared (χ 2 ) statistic.Another cost function involves the absolute difference between the model and observation and is a better cost function to remove the influence of outliers.This study uses these two cost functions plus a third that compares the first three moments of the modeled and observed radar data and is a more efficient calculation than the first two cost functions.
Using seven DSD models, two numerical inverse modeling methods, and three cost functions yields 42 DSD estimates for each radar observation.These 42 DSD estimates constitute one ensemble.The profiling radar observations from collocated 50-and 920-MHz profilers near Darwin, Australia, during two rain events are used in this study to estimate the mean mass-weighted raindrop diameter and the rain rate.The statistics of 26 725 independent ensembles are analyzed to provide uncertainty estimates that can be applied to each precipitation estimate.
This paper has the following format.The seven different DSD models are discussed in Sect. 2. The forward model of estimating the radar reflectivity-weighted Doppler velocity spectra when given a raindrop size distribution is discussed in Sect.3. The inverse model methodologies are presented in Sect.4, followed by the discussion of the cost functions in Sect. 5.The profiling radar observations are discussed in Sect.6.The ensemble statistics and conclusions are presented in Sects.7 and 8.

DSD models
The number and size of raindrops within a unit volume is described by the number concentration, N (D) [number m −3 mm −1 ], also called the raindrop size distribution (DSD), where D is the spherical equivalent diameter of each raindrop [mm].Given the number concentration, several quantities describing the precipitation can be estimated, including the radar equivalent reflectivity factor, z [mm 6 m −3 ].Assuming Rayleigh scattering, z is estimated using (Doviak and Zrnic, 1993) (1) The reflectivity factor can be expressed in log units [dBZ] using Z = 10 log 10 (z). (2) The rain rate, R [mm h −1 ], is estimated using where v(D) is the terminal fall speed of the raindrop expressed in m s −1 , and leading constants scale the rain rate so that it is expressed in mm h −1 .Another useful parameter used to describe the DSD is the mass-weighted mean drop diameter which is estimated using As can be seen from Eqs. (1) through (4), the DSD can be described in detail using N (D) or described in general using Z, D m , and R. While it would be useful to have N (D) estimates for every raindrop diameter size, noise and measurement uncertainty prevents N (D) from being estimated at each raindrop size.Therefore, N (D) is described using mathematical expressions that are functions of diameter, and the following subsections describe the seven DSD models used in previous work and in this study.

Gamma distribution
The work by Ulbrich (1983) described the DSD using a modified Gamma function of the form where µ [unitless] is the shape parameter, and [mm −1 ] is the slope parameter which is related to the mean diameter using =(4+µ)/D m .While µ does influence the slope of the distribution at large diameters, µ has a large influence on the curvature of the distribution at small diameters.When µ has negative values, the number concentration increases as the diameter decreases, and mathematically (and nonphysically) has infinite number of drops with zero diameters.Conversely, when µ has positive values the number concentration decreases as the diameter decreases, causing a downward curvature of the number concentration at small raindrop sizes.

Exponential distribution
The exponential distribution has been used in many studies to describe the DSD before Ulbrich (1983) introduced the Gamma distribution DSD model (Waldvogel, 1974).The exponential DSD is a special case of the Gamma distribution DSD when µ=0 and is expressed as

Marshall-Palmer distribution
In the seminal work by Marshall and Palmer (1948), the DSD was described by the set of equations where N 0 MP is the Marshall-Palmer scale parameter and MP is the Marshall-Palmer slope parameter which is a function of rain rate.The Marshall-Palmer (MP) distribution is a special set of the exponential distribution DSDs constrained to have a fixed scale parameter (N 0 MP =8000) and a slope parameter dependent on the rain rate.The MP distribution was developed using mid-latitude stratiform rain and it will be shown in Sect.7 that the MP distribution is not well suited to describe the tropical rainfall data set used in this study.

Constant µ gamma distribution
As previously discussed, the value of µ in the gamma distribution has a large influence on the shape of the DSD at small raindrop sizes.The work by Illingworth and Blackman (2002) suggests that radars that observe the raindrops within the Rayleigh scattering regime can not resolve the small raindrops, and a fixed value of the shape parameter is appropriate for describing the DSD when using weather radars.In this study, retrievals are performed using the gamma distribution (Eq.5) with µ set to constant values of 2.5 and 5.A constant µ reduces the Gamma function DSD (Eq.5) to two unknowns.

Constrained gamma distribution
The gamma distribution expressed in Eq. ( 5) consists of 3 different variables, N 0 , µ, and D m .The work by Zhang et al. (2003) and Brandes et al. (2003) suggests that a mathematical relationship exists between µ and in the gamma distribution DSD model.While the particular µ− relationship may be precipitation regime-dependent and more work is needed to validate these relationships, this study uses the Zhang et al. (2003) This µ− relationship was converted into a µ−D m relationship using =(4+µ)/D m yielding

Log-normal distribution
The raindrop size distribution has been described by Feingold and Levin (1986) using a log-normal distribution of the form where N t is the total number of drops per unit volume [count m −3 ] and σ describes the width of the distribution.
A unique feature of the log-normal distribution is that N (D) approaches zero as the raindrop diameter approaches zero.

Radar observations of DSD
Assuming a radar beam is pointing vertically and assuming the raindrop size distribution is known and uniformly distributed throughout the radar pulse volume, radar backscattering theory is used to construct two mathematical estimates of the radar observations (see Doviak and Zrnic, 1993, for details of the radar backscattering theory).The first mathematical estimate is from an ideal radar that has infinitesimal beamwidth and the raindrops are in a static atmosphere without any vertical motion and without any turbulence.While this ideal radar and atmosphere does not exist, this mathematical framework is useful to perform the coordinate transformation from the number concentration's raindrop diameter domain to the radar's raindrop fall-speed domain.The second mathematical estimate includes the finite beamwidth of a realistic radar along with the vertical motion and turbulence of a realistic atmosphere.These three factors contribute to spreading the returned power from each raindrop into several different velocity channels.Both mathematical estimates are discussed in more detail below.

Ideal radar
Assuming a perfect Doppler radar with infinitesimal beamwidth observing a uniformly distributed raindrop size distribution N (D) in a static atmosphere without any vertical air motion and without any turbulent motion, the modeled hydrometeor reflectivity-weighted Doppler spectral density, through the relation (Atlas et al., 1973) where v and dv are the velocity channels and velocity resolution of the Doppler velocity spectrum in units of m s −1 .The variables D and dD are the raindrop diameters and diameter resolutions corresponding to v and dv and have units of mm.The units of S hydro (v) are reflectivity per velocity channel (mm 6 m −3 ) (m s −1 ) −1 and S hydro (v)has non-zero values only in the velocity channels with corresponding raindrops.
While dv has the same value for each velocity channel, dD is variable and dependent on the diameter D. Through laboratory studies, the terminal fall speed of raindrops is expressed as where ρ 0 and ρ represent the air densities at the ground and the level of the observation aloft, respectively (Gunn and Kinzer, 1949;Atlas et al., 1973).

Realistic radar
While Eqs. ( 11) and ( 12) describe the reflectivity-weighted Doppler velocity spectral density for an ideal radar observing any possible raindrop size distribution N(D) in a static atmosphere, finite radar beamwidth and atmospheric vertical air motion and turbulence need to be added to the radar forward model to better represent radar observations.Both the finite radar beamwidth and atmospheric turbulence cause the observed Doppler velocity spectrum to be spread over a wider range of velocity channels.It is also important to include the shift in the Doppler velocity spectrum due to the vertical air motion that shifts the raindrop terminal fall speed to the observed Doppler velocity.The spreading and shifting of S hydro (v) is accomplished by convolving S hydro (v) by the spreading and shifting spectrum where S air (v − ω Doppler , σ air ) is Gaussian shaped (Gossard, 1994), ω Doppler [m s −1 ] is the Doppler velocity of the ambient air motion defined with motions approaching the radar as positive Doppler motions consistent with the 1842 work by Christian Doppler (White, 1982), and σ air [m s −1 ] represents the spreading of the spectrum.The leading fraction in Eq. ( 13) normalizes S air (v−ω Doppler , σ air ) to unit area when integrated over all velocities so that the spectral broadening does not modify the total reflectivity of S hydro (v).The convolution of S hydro (v) by S air (v−ω Doppler , σ air ) is expressed mathematically as (Wakasuki et al., 1986) where the symbol ⊗ represents the convolution function.
The last term in Eq. ( 14) is the random noise that is radar dependent and must be added to every Doppler velocity channel of the Doppler spectrum.Equation ( 14) defines the forward model of a realistic radar and produces a realistic reflectivity-weighted Doppler velocity spectrum when the raindrop size distribution N (D), the air motion Doppler velocity ω Doppler , and the spectral broadening σ air are used as inputs.

Numerical inverse model methodologies
While Eq. ( 14) constructs a modeled radar reflectivity Doppler velocity spectrum S model (v), the goal of DSD retrievals is to estimate the raindrop size distribution N (D) when the radar observes a Doppler velocity spectrum S obs (v).If the retrieved model spectrum S model (v) approximates S obs (v) by minimizing a cost function (described in Sect.5), then N (D) can be estimated from S model (v).One difficulty with solving this inverse problem is accounting for the convolution of S hydro (v) by the spreading and shifting spectrum S air (v−ω Doppler , σ air ).Two methods have been discussed in the meteorological literature to account for the convolution operation.The oldest method uses stable convolution calculations to estimate S model (v) and the newest method uses numerical deconvolution techniques to remove the influence of the spreading spectrum to estimate S hydro (v).In both methods, N (D) is adjusted until a cost function is minimized.Details of both methods are described below.

Convolution method
The convolution method uses the forward model described by Eq. ( 14) to estimate S model (v) and iteratively adjusts N (D) until a cost function between S model (v) and S obs (v) Ann.Geophys., 27,[555][556][557][558][559][560][561][562][563][564][565][566][567]2009 www.ann-geophys.net/27/555/2009/ is minimized.One advantage of this method is that calculating S model (v) is a stable numerical operation of the forward model using the convolution operation.While each forward model calculation is stable, it is possible without proper numerical coding for the solution to converge to a local minimum in the cost function and not the global minimum.The potential of finding a local minimum versus finding the global minimum is a trade-off between convergence speed and searching the whole solution space.In this study, the whole solution space is searched to find the global minimum to avoid the possibility of converging to a local minimum in the cost function.The convolution method has been used in many studies including Wakasugi et al. (1986Wakasugi et al. ( , 1987)); Sato et al. (1990); Currier et al. (1992); Maguire and Avery (1994); Ragopadhyaya et al. (1993,1998,1999); Schafer et al. (2002); and Williams (2002).
To illustrate the logic of the convolution method, Fig. 1 shows a flow diagram.The top portion (above the dashed line) of Fig. 1 shows the inputs into the convolution method which includes the DSD model, estimates of ω Doppler and σ air , and the observed radar Doppler velocity spectrum of the rain.Details of estimating ω Doppler and σ air from radar observations and the input rain spectrum are discussed in Sect.6.The bottom portion of Fig. 1 shows the flow diagram of the convolution method.For each retrieval, ω Doppler and σ air remain constant, so the spreading and shifting function S air (v−ω Doppler , σ air ) needs to be calculated only once using Eq. ( 13) (Fig. 1e).Starting with an initial N(D) (box d), the initial S hydro (v) is estimated using Eqs.( 11) and (12) (Fig. 1f) and then convolved by S air (v−ω Doppler , σ air ) to produce an estimate of S model (v) using Eq. ( 14) (Fig. 1g).This model spectrum is compared with the observed spectrum S obs (v) using one of three different cost functions (Eqs.16, 17, or 18) as discussed in Sect.5) (Fig. 1h).If this solution does not minimize the cost function (Fig. 1j), then N(D) is adjusted (Fig. 1i) and S hydro (v) is recalculated (Fig. 1f).The loop through Fig. 1i, f, g, h, j is repeated until the cost function is minimized.After finding the best solution, N(D) and estimates of R and D m using Eqs.( 3) and (4) (Fig. 1k) are saved for future analysis.The flow diagram is repeated using the same radar observations but different DSD functional shape (described in Sect.2) to yield 7 solutions for each cost function.

Deconvolution method
While the convolution method applies a spreading function to the ideal radar spectrum S hydro (v) to estimate a realistic radar spectrum S model (v) which is then compared with the observed spectrum S obs (v), the deconvolution method applies a "de-spreading" function to the observed spectrum S obs (v) to estimate a deconvolved spectrum S deconv (v) which is then compared with the model spectrum from an ideal radar S hydro (v).The new deconvolved spectrum is expressed as where S obs (v) is the observed Doppler velocity spectrum and the symbol ⊕ indicates the deconvolution operation.One major difficulty with numerical deconvolution operations is that the noise in the observed spectrum can be amplified, which could lead to unstable retrievals and unrealistic solutions.Studies by Lucas et al. (2004) and Schafer et al. (2002) provide two examples of performing stable deconvolution routines.
Figure 2 shows a flow diagram of the deconvolution method.The top portion (above the dashed line) of Fig. 2 shows the inputs into the deconvolution method which are the same inputs as for the convolution method, and the bottom portion shows the flow diagram of the deconvolution method.Since for each retrieval, ω Doppler , σ air , and S obs (v) remain constant, S deconv (v) needs to be estimated only once (see Fig. 2e and g).The iterative procedure starts with an initial estimate of N (D) (Fig. 2d) which is used to estimate S hydro (v) (Fig. 2f), and then compared with S deconv (v) using one of three different cost functions ( 16), (17), or (18) as discussed in Sect.5) (Fig. 2h).If this solution does not minimize the cost function (Fig. 2j), then N (D) is adjusted (Fig. 2i) and S hydro (v) is recalculated (Fig. 2f).The loop through Fig. 2i, f, h, j is repeated until the cost function is minimized.After finding the best solution, N (D) and estimates of R and D m using Eqs.(3) and (4) (Fig. 2k) are saved for future analysis.

Cost functions
In order to determine the "best" solution, cost functions are defined to subjectively compare the model spectrum with the observed spectrum.Three cost functions are used in this study.Two cost functions compare spectra at each velocity channel and one cost function compares three moments of the spectra.While the spectral cost functions produce model spectra that better represent the observed spectra, the moment cost function is computationally faster.All three cost functions are described below.

Spectral two-norm cost function
In this study, the spectral two-norm cost function is defined as the sum of the squared difference between the modeled and observed spectra at each velocity channel and is expressed as where i represents only the velocity channels with spectral values larger than the noise level.The value of J spectra is similar to a χ 2 estimate used by Sato et al. (1990) and Schafer et al. (2002).

Spectral one-norm cost function
The spectral one-norm cost function is defined as the sum of the absolute difference between the modeled and observed spectra at each velocity channel and is expressed as The numerical benefit of the one-norm cost function over the two-norm cost function is that outliers between the model and observation contribute less to the one-norm cost function.Thus, the one-norm cost function is a more robust cost function than the two-norm cost function (Aster et al., 2005).

Moment cost function
While the spectral cost functions involve every spectral point above the noise level, the moment cost function uses only the first three moments of the modeled and observed spectra and is expressed as where Z obs and Z model are the reflectivity [dBZ] (zeroth moment expressed in dBZ), V obs and V model are the reflectivity-weighted mean Doppler velocity [m s −1 ] (first moment), and σ V obs and σ V model are the reflectivity-weighted Doppler velocity standard deviation [m s −1 ] (square root of the second moment) for the observed and modeled spectra, respectively.Letting S(v) denote either the observed or modeled spectrum, the reflectivity in units mm 6 m −3 is estimated using and can be expressed in dBZ units using Eq. ( 2).The reflectivity-weighted mean Doppler velocity is estimated using (Williams, 2002) And the reflectivity-weighted Doppler velocity standard deviation is estimated using (Williams, 2002) The time-height cross sections of reflectivity for the two rain events on 22 and 23 January 2006 used in this study are shown in Fig. 3.Both rain events had radar brightband signatures near 4.5 km indicative of stratiform rain.Near 15:50 and 18:20 UTC on 23 January (Fig. 3b), the reflectivity structure did not contain a brightband as convective rain elements passed over the profiler site.
Examples of reflectivity-weighted Doppler velocity spectra observed by the two profilers while precipitation was directly over the profiler site on 23 January 2006 at 20:04 UTC are shown in Fig. 4. Figure 4a was derived from the 50-MHz profiler and Fig. 4b and c were derived from the 920-MHz profiler.The colored panels in Fig. 4a and b show the reflectivity-weighted Doppler velocity spectral density S 50−MHz (v) and S 920−MHz (v)=S obs (v) in units of 10 log 10 ((mm 6 m −3 )/(m s −1 )) at each range gate.The logarithmic scale is used to aid in visualizing data that spans six orders of magnitude.The 920-MHz profiler reflectivity is shown in Fig. 4c and has units of dBZ.
The advantage of using the 920-MHz operating frequency radar is that the radar is sensitive to backscattered energy from hard targets distributed throughout the radar pulse volume.Referring to the colored panel of Fig. 4b, the 920-MHz profiler Doppler spectra show particles with net downward motion below the freezing level, which is located around 4.5 km.The solid black lines in the colored panels indicate the air-density-corrected terminal fall speeds of raindrops with diameters of 1-, 3-, and 6-mm estimated using Eq. ( 12).The 50-and 920-MHz profilers operated with a coordinated scan strategy so that both radars were observing vertically for the first 45 s of every minute.The first valid range gate for the 50-MHz profiler was 1.5 km above the ground and each range gate was separated by 315 m.The 920-MHz profiler operated with 105 m range gate spacing, and the 25 range gates between 1.5 and 4 km were used in this study, which is high enough to have valid 50-MHz profiler vertical air motion estimates (1.5 km) and low enough to avoid the radar brightband (4.5 km).To account for the different vertical resolution of the two profilers, the 50-MHz profiler vertical air motion estimates at 315 m vertical resolution were interpolated to the 920-MHz profiler 105 m resolution.System parameters for both profilers are listed in Table 1.

Ensemble statistics
For each simultaneous 50-and 920-MHz radar observation, 42 different raindrop size distributions (DSDs) were estimated at each of the 25 range gates between 1.5 and 4 km.These 42 DSD estimates constitute one ensemble and were formed using seven different DSD models (see Sect. 2), two different numerical inverse model methodologies (see Sect. 4), and three different cost functions (see Sect. 5).In this study, all ensembles are studied independently of altitude, time, and rain regime to evaluate the statistical properties of the ensemble retrieval methodology.

Filtering outliers
Before the statistical properties of the ensemble retrieval methodology can be evaluated, outliers need to be filtered from each ensemble.The DSD estimates at each altitude were visually examined for each profile to search for consistent biases in the retrievals.A typical profile is shown in Fig. 5 with the 42 DSD estimates of reflectivity, mean raindrop diameter, and rain rate for each range gate shown in three separate panels.The six Marshall-Palmer DSD Model (MP) retrievals are shown with circles and all other DSD Models are shown with pluses.Note that for this profile, the MP mean raindrop diameter D m is consistently less than the other DSD Models, and the MP rain rate R is consistently greater than the other DSD Models.These biases occurred with nearly every profile during the two rain events and the D m bias for each DSD estimate relative to the ensemble mean is shown in Table 2.The 42 DSD estimates are shown in Table 2 with the seven rows corresponding to the DSD models and the six columns corresponding to the two numerical model methods and the three cost functions.The bias for each DSD estimate is defined using all n observations from both rain events and determined using Bias= 1 where D estimate m,i is a particular DSD estimate and D m i is the ensemble mean using all 42 DSD estimates for each i th observation.The MP DSD model D m is biased low relative to the ensemble mean for all six numerical methods by at least 0.4 mm.This D m underestimate leads to a rain rate over estimate.The D m bias indicates that the constraints of the MP DSD Model which were derived using mid-latitude stratiform rain events are not appropriate for DSD estimates for these two tropical rain events observed near Darwin, Australia.Due to the inconsistency of the MP retrievals, the six MP DSD Models were eliminated from the ensemble  database, leaving a maximum number of 36 members in each ensemble.
It is understood that with the ensemble modeling paradigm, not all models produce realistic results for every situation.Therefore, the remaining 36 mean raindrop diameter D m and 36 rain rate R estimates for each ensemble were screened for outliers using a two-step filter.First, the median and standard deviation of D m and R ( Dm , R, σ {D m }, and σ {R}) were estimated for each ensemble of 36 DSD estimates.The second step removed all DSD estimates that were either outside the bounds of Dm ±2σ {D m } or R ± 2σ {R}, or greater than R+2 R.After this two-step filter, all ensembles with less than 28 members were eliminated from the database, leaving a total of 26 725 independent ensembles each with at least 28 DSD estimates.

Statistical measures of the ensembles
After each ensemble was filtered to remove outlier DSD estimates, the mean and standard deviation of D m and R were estimated for each ensemble ( D m , σ {D m }, R , and σ {R}).The top panel of Fig. 6 shows the histogram of D m and the percent accumulation for all 26 725 ensembles.Vi-sual inspection suggests that the D m histogram is quasisymmetric and the uniform distribution of the 10, 50 and 90 percentiles with values of 0.9, 1.5, and 2.1 mm supports a quasi-symmetric histogram.The bottom panel of Fig. 6 shows the histogram and percent accumulation of σ {D m } for a sub-set of 2474 ensembles that have D m between 1.45 and 1.55 mm.While visually, the σ {D m } histogram appears asymmetric with more larger values than smaller values, quantitatively, the 10th, 50th, and 90th percentile values are nearly uniformly distributed with values of 0.05, 0.10, and 0.14 mm, suggesting a quasi-symmetric distribution.
The histogram and percent accumulation of R for all 26 725 ensembles are shown in the top panel of Fig. 7.Both visual inspection and the non-uniform spacing between the 10th, 50th, and 90th percentiles of R indicate that the rain rate histogram is asymmetrical.The bottom panel in Fig. 7 shows the histogram and percent accumulation of σ {R} for the small interval of R between 1.45 and 1.55 mm h −1 .The strong asymmetry of the σ {R} histogram is identified both visually and by the non-uniform percentiles (0.04, 0.08, and 0.32 mm h −1 ).This implies that Gaussian statistics, which include estimates of the mean and standard deviation, cannot be used to describe the distribution of σ {R} for this small interval of R .Therefore, the uncertainty of R is quantified using the 10th, 50th, and 90th percentiles of σ {R} for small intervals of R .For consistency in the analysis, uncertainties in D m will also be quantified using the 10th, 50th, and 90th percentiles of σ {D m } for each small interval of D m , even though the frequency distributions are quasi-Gaussian in shape.

Uncertainties for small ranges of D m and R
Due to the non-linear and non-Gaussian distribution of ensemble statistics discussed in the previous section, estimating the uncertainty in D m and R requires estimating the 10th, 50th, and 90th percentiles of σ {D m } and σ {R} for small intervals of D m and R .In particular, for each small interval of D m or R , the corresponding population of σ {D m } and σ {R} are sorted to estimate the 10th, 50th, and 90th percentiles of σ {D m } and σ {R}. Figure 8 shows the 10th-to-90th percentile ranges plus the 50th percentile value for D m ranging from 0.4 to 2.7 mm in 0.1 mm intervals (top panel) and for R ranging from 0.0 to 2.3 mm h −1 in 0.1 mm h −1 intervals (bottom panel).The 50th percentile is shown for each D m and R with the horizontal bar in both panels.
The smallest D m uncertainty occurs for D m between 0.8 and 1.8 mm and the 90th percentile is less than 0.15 mm and the median value is less than 0.11 mm.The uncertainty increases for small and larger D m values, which is consistent with the simulations performed by Schafer et al. (2002).The uncertainty of R increases with R .While the 90th percentile uncertainty approaches 0.6 mm h −1 for 2 mm h −1 rain rate (30% relative error), the median uncertainty is less than 0.15 mm h −1 at this rain rate (less than 8% relative error).The non-uniform spacing between the rain rate 10th, 50th and 90th percentiles highlights the non-linear rain rate error between the different retrieval methodologies.

Conclusions
Before radar estimates of the raindrop size distribution (DSD) can be assimilated into numerical weather prediction models, the retrieved DSD must include both the estimated precipitation parameter (i.e., reflectivity, mean massweighted diameter, rain rate) and an estimate of the uncertainty.The ensemble methodology enables the DSD uncertainty to be estimated by measuring the spread in DSD retrievals that use the same observations as inputs but use different retrieval methodologies to estimate the DSD.The DSD retrieval methodologies are dependent on how the DSD is modeled, how the numerical inversion method is implemented, and how the cost function is defined to determine the "best" solution.
In this study, seven different DSD models were used to mathematically describe the raindrop size distribution and included a Gamma distribution, an exponential distribution, a Marshall-Palmer distribution, constant µ=2.5 and µ=5 Gamma distributions, a Gamma distribution constrained using a µ−D m relationship, and a log-normal distribution.The convolution method and deconvolution method were the two numerical inversion methods used in this study.And, three cost functions were used to compare the observations and models and included two point-by-point cost functions and one moment cost function.For each set of radar observations, 42 different DSD estimates were generated to form one ensemble.The retrieved DSDs were parameterized by the reflectivity, mass-weighted mean diameter D m , and rain rate R.
By comparing the 42 DSD parameters of reflectivity, D m , and R at each range gate for every profile during two rain events, it was determined that the Marshall-Palmer (MP) DSD model produced DSD parameters with D m too small and R too large compared with the other DSD models.Note that the MP DSD is a special case of an exponential shaped DSD with N 0 fixed to a value and the slope dependent on the rainrate (see Eq. 7 and Marshall and Palmer, 1948).These biases were observed for nearly every profile and were independent of range gate suggesting that the MP DSD model, which was developed using mid-latitude stratiform rain events, was not appropriate for these tropical rain events.Therefore, the MP DSD model runs were eliminated from the ensemble database.After removing outliers from individual ensembles, 26 725 ensembles were used in this study with at least 28 DSD estimates in each independent ensemble.
In order to estimate the uncertainty of the D m and R estimates, small intervals of mean D m and mean R ( D m and R ) were identified, and the spread in the corresponding σ {D m } and σ {R} were studied in detail.The histograms of σ {D m } and σ {R} were not symmetric which prevented the use of Gaussian statistics (estimates of mean and standard deviation) to describe the histograms and to describe the uncertainties of D m and R. Therefore, the 10th, 50th, and 90th percentiles of σ {D m } and σ {R} were used to describe the uncertainty of D m and R for small intervals of D m and R.
The smallest D m uncertainty occurs for D m between 0.8 and 1.8 mm and the 90th percentile is less than 0.15 mm which corresponds to a relative error of less than 20%.The median value of σ {D m } was less than 0.11 mm and corre-sponds to a relative error of less than 15%.The uncertainty increases for smaller and larger D m values.The uncertainty of R increases with R .While the 90 th percentile uncertainty approaches 0.6 mm h −1 for 2 mm h −1 rain rate (30% relative error), the median uncertainty is less than 0.15 mm h −1 at this rain rate (less than 8% relative error).

Fig. 3 .
Fig. 3. Time-height cross section of radar reflectivity from the vertically pointing 920-MHz profiler for (a) 22 January and (b) 23 January 2006 during TWPICE.The lines at 1.5 and 4.0 km indicate the altitude range of DSD retrievals used in this study.

Fig. 4 .
Fig. 4. Simultaneous vertical profiles of reflectivity-weighted Doppler velocity spectral density [units of dBZ (m s −1 ) −1 ] on 23 January 2006 at 20:04 UTC for (a) the 50-MHz profiler and (b) the 920-MHz profiler.The ambient air motion is estimated from the Bragg scattering component in the 50-MHz profiler spectra (indicated in panel a).The first and second moments of the Bragg scattering component are indicated with asterisks and horizontal lines (ω Dopper ±σ air ) on the 920-MHz profiler spectra in (b).The DSD is estimated from the Rayleigh scattering component of the 920-MHz profiler spectra shown in (b).The solid black lines labeled "1 mm", "3 mm", and "6 mm" indicate the air density adjusted terminal fall speeds of raindrops with diameters of 1, 3, and 6 mm, respectively.

)6
Radar observations during TWPICEVertically pointing profiling radar observations were collected in January and February 2006 during the Tropical Warm Pool -International Cloud Experiment (TWP-ICE) around Darwin, Australia.The experiment provided both remote sensing observations and aircraft in-situ measurements within anvil clouds which are needed to verify the microphysical properties inferred by ground-based remote sensing instruments.For this study, the Doppler velocity spectra collected by the collocated 50-MHz and 920-MHz profiling radars were used to estimate the vertical air motion and the vertical profile of rain drop size distributions (DSDs).The long wavelength 50-MHz profiler observations are used to estimate the vertical Doppler motion ω Doppler and the turbulent broadening σ air as the precipitation passed directly over the profiler site.The shorter wavelength 920-MHz profiler observations provided the observed reflectivityweighted Doppler velocity spectra S obs (v) used to estimate the DSD.

Fig. 5 .
Fig. 5. Retrieved (a) reflectivity, (b) mean diameter D m , and (c) rain rate for each of the 42 DSD retrievals and at each of the 25 range gates for 22 January 2006 at 13:55 UTC.The Marshall-Palmer (MP) DSD model retrievals are shown with circles and all other DSD models are shown with pluses.The MP solutions are over-constrained for this data set and produce D m and R that are not consistent with the other retrievals.MP DSD models are not used in any of the ensemble statistics.

Fig. 6 .
Fig. 6.Panel (a) shows the histogram of D m occurrence for all 26 725 ensembles as a function of retrieved D m (solid line) and the percent accumulation from 0 to 100% (dashed line).Panel (b) shows the histogram of σ {D m } for the small range of 1.45< D m <1.55 mm.Panel (b) also shows the percent accumulation from 0 to 100% of these 2474 ensembles in this sub-set.The values of the 10th, 50th, and 90th percentiles are indicated in the two panels.

Fig. 7 .
Fig. 7. Panel (a) shows the histogram of R occurrence for all 26 725 ensembles as a function of retrieved R (solid line) and the percent accumulation from 0 to 100% (dashed line).Panel (b) shows the histogram of σ {R} for the small range of 1.45< R <1.55 mm h −1 .Panel (b) also shows the percent accumulation from 0 to 100% of these 562 ensembles in this sub-set.The values of the 10th, 50th, and 90th percentiles are indicated in the two panels.

Fig. 8 .
Fig. 8. Panel (a) shows the 10th-to-90th percentile ranges of σ {D m } for 0.1 mm intervals of D m .The median value for each D m interval is shown with a horizontal line.Panel (b) shows the 10th-to-90th percentile ranges of σ {R} for 0.1 mm h −1 intervals of R along with the median value shown with a horizontal line.

Table 2 .
Bias of D m for each DSD estimate relative to the ensemble mean (seeEq.22).Each row corresponds to a DSD model (described in Sect.2) with the model equation shown in the parentheses.Each column corresponds to a numerical inverse method (described in Sect.4) and a cost function (described in Sect.5).