Introduction

ANGEO

Annales Geophysicae

Ann. Geophys.

1432-0576

Copernicus Publications

Göttingen, Germany

10.5194/angeo-35-691-2017

Errors in wind resource and energy yield assessments based on the Weibull distribution

Jourdier

Bénédicte

Drobinski

Philippe

philippe.drobinski@lmd.polytechnique.fr
1 LMD/IPSL, École polytechnique, Université Paris Saclay, ENS, PSL Research University, Sorbonne Universités, UPMC Univ Paris 06, CNRS, Palaiseau, France
2 French Environment and Energy Management Agency (ADEME), Angers, France
3 Now at: EDF R&D – MFFE, Applied Meteorology Group, Chatou, France

Philippe Drobinski (philippe.drobinski@lmd.polytechnique.fr)

31 May 2017

Volume 35, issue 3, pages 691–700. Received: 13 February 2017; revised: 6 April 2017; accepted: 13 April 2017

This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/

This article is available from https://angeo.copernicus.org/articles/35/691/2017/angeo-35-691-2017.html

The full text article is available as a PDF file from https://angeo.copernicus.org/articles/35/691/2017/angeo-35-691-2017.pdf

The methodology used in wind resource assessments often relies on modeling the wind-speed statistics using a Weibull distribution. In spite of its common use, this distribution has been shown to not always accurately model real wind-speed distributions. Very few studies have examined the arising errors in power outputs, using either observed power productions or theoretical power curves. This article focuses on France, using surface wind measurements at 89 locations covering all regions of the country. It investigates how statistical modeling using a Weibull distribution impacts the prediction of the wind energy content and of the power output in the context of an annual energy production assessment. For this purpose it uses a plausible power curve adapted to each location. Three common methods for fitting the Weibull distribution are tested (maximum likelihood, first and third moments, and the Wind Atlas Analysis and Application Program (WAsP) method). The first two methods generate large errors in the production (mean absolute error around 5 %), especially in the southern areas where the goodness of fit of the Weibull distribution is poorer. The production is mainly overestimated except at some locations with bimodal wind distributions. With the third method, the errors are much lower at most locations (mean absolute error around 2 %). Another distribution, a mixed Rayleigh–Rice distribution, is also tested and shows better skill at assessing the wind energy yield.

Meteorology and atmospheric dynamics (mesoscale meteorology)

Introduction

France has one of the largest wind energy potentials in Europe, but only the fourth largest installed capacity behind Germany, Spain and the UK. Despite the governmental targets of 19 GW onshore and 6 GW offshore installed capacity by 2020, the total installed capacity was only 9.1 GW at the beginning of 2015. The windiest parts of France, where most of the present wind farms are located, are the northwestern and southeastern regions. The northwestern region, along the coastlines of the English Channel, is located in the storm track so that there are often strong winds coming from the Atlantic Ocean. The southeastern region is located near the Mediterranean Sea and the valleys of this very mountainous region channel the wind flows so that there are often strong and persistent winds. Today, more wind farms are installed in other, less windy areas in the northeastern and central parts of France. In these regions the mean capacity factor was below 20 % in 2014 whereas it was on average between 25 and 29 % in the windiest regions.

As the wind industry developed, it was observed that wind farms often produced less than expected, thus jeopardizing the projects' profitability and undermining the whole industry. This led to concerns about the overprediction of energy production. Aside from the errors coming from turbine performance or availability, and from the natural variability in wind, questions were raised about the methodology used to evaluate the wind energy yield. One point of the methodology, questioned in this study, is the common use of the Weibull distribution to model wind-speed statistics instead of directly using the wind-speed series measured at the investigated location. The Weibull distribution has become a widely used standard in wind energy applications due to its simplicity. It depends on two easily estimated parameters and there are simple analytic expressions for the moments. It is a reference used in wind energy software such as the Wind Atlas Analysis and Application Program (WAsP), and it is included in regulations such as the IEC 61400-12 on wind turbine power performance testing. Many other aspects of the resource assessment methodology can lead to poor estimates, such as the wind measurement accuracy, the vertical extrapolation of wind measurements, their temporal extrapolation with measure–correlate–predict techniques, the wind flow and wake modeling or the use of inaccurate power curves. Each issue is complex enough to be the subject of many dedicated research articles, but this paper is aimed at evaluating the impact of the specific step of wind-speed statistic modeling that uses a Weibull distribution.

Very few studies have compared the production of real wind farms to the estimated production using either the series of wind measurements (called the chronological method) or the wind statistics from a Weibull distribution fit to that series (probabilistic method). Despite the widespread use of the Weibull distribution, it has been shown not to model all kinds of observed distributions accurately, and many other distributions have been proposed to model the wind-speed statistics, especially mixed distributions; several reviews of these alternatives exist in the literature. However, in most references, the influence of modeling the wind-speed statistics using a Weibull or another distribution is assessed only for the wind energy content (i.e., the cubed wind speed). By contrast, it has rarely been addressed in terms of power output, even though the results are expected to be drastically different due to the nonlinearity of the power curve. To the authors' knowledge, the quantification of errors in the power output was addressed in only five articles described hereafter and always for a very limited number of locations (five, one, five and one, respectively) except for the last one (178 buoys).

One study examined five wind farms in northeastern Spain (in the Pyrenees Mountains), comparing the real and estimated monthly energy production (MEP). It found slight underestimations from the chronological approach and larger underestimations from the probabilistic approach. The errors due to the use of the Weibull distribution are large at two stations where, judging from the wind histograms at least, some of the distributions are bimodal. Where the Weibull distribution does not introduce large errors, the authors underline that this does not result from a good Weibull fit but from cancellations of under- and overestimation of the production in the lower, intermediate and upper parts of the wind distribution. A similar but less complete study covers 48 months at one site in Taiwan. It found underprediction using the chronological method, except in the low wind months, and 5 % more underestimation from the probabilistic method with the Weibull distribution. However, it gives no indication of the shape of the wind-speed statistics. There are also some theoretical studies, using wind measurements and theoretical power curves, without any real production data. One of them studied five locations and found errors ranging from -9 to +7 % in the monthly production using the Weibull distribution; its author notes that there are mostly underestimations at the two locations with the lowest wind levels. Another studied one site in Mexico, in a mountainous area where the wind distribution is bimodal. It found that the Weibull distribution underpredicted the energy production by 14 % and used instead a mixture of two Weibull distributions to better represent the wind statistics. A further study examined several statistical distributions with data from 16 locations in the Canary Islands. There are no specific results for the Weibull distribution but the interesting result is that the relative errors in power production decreased as the goodness of fit between the distributions and the observations improved. Finally, the last study examined 178 buoys located around North America, where the wind speeds were measured at either 5 or 10 m above sea level. It tested the two-parameter Weibull distribution as well as 13 other distributions. The error in power output was estimated using the theoretical power curve of a Vestas V47/660 wind turbine. For the Weibull distribution, the relative errors ranged from -10 to 7 %, while more complex distributions (four-parameter Kappa, five-parameter Wakeby) gave much better results. This is an interesting result even if the power curve is not really suited: this turbine is supposed to have a hub height of at least 40 m, whereas the measurements are made at less than 10 m. Therefore, it probably puts too much weight on the high wind speeds.

The present study investigates the errors made in evaluating the annual energy production when assuming that the wind-speed statistics follow a Weibull distribution. It compares different ways of fitting the Weibull distribution, which is essential because, as we show, the different fitting methods lead to very different results. There are many articles comparing different fitting methods for the Weibull distribution but none of them quantifies the arising errors in the power output. Moreover, we include the WAsP fitting method, which is by far the most used in the industry but almost never studied in articles. To our knowledge, this method is referred to only rarely in the literature, and for a completely different application. As a comparison to the Weibull distribution, we also consider a more complex mixed distribution, the Rayleigh–Rice distribution suggested in earlier work. It has been chosen instead of a mixture of Weibull distributions because the Rice-like distribution has been reported to describe the tails of the distributions better.

A limit of most of the studies evaluating the Weibull and other distributions for wind energy applications is that they stop at the computation of energy or use inappropriate power curves. Since the relation between the available energy in the wind and the actual production of a wind turbine is not at all linear, a good fit for the energy does not guarantee a good estimation of production. To address this issue, this paper develops a methodology to compute the production at any location using a realistic power curve even when using surface measurements.

Another limit of most of the studies cited above is that they study very small numbers of locations. To address this issue, the present study is based on a large wind dataset of 89 weather stations, covering different sub-climatic regions of France. This reveals some systematic behaviors: systematic over- or underestimations depending on the wind characteristics at the location and the fitting method. It also highlights the link between the goodness of fit of the distribution and the production estimate errors, thus complementing earlier work on this link.

The "Material and methods" section presents the data and the methodology used to fit the distributions and evaluate the errors in energy content and production arising from the statistical modeling. The "Results" section presents the resulting errors at all stations and the "Discussion" section discusses these results. Finally, the "Conclusions" section concludes the study.

Material and methods

Wind-speed data

In this article we use surface wind measurements (10 m a.g.l.) from the global NOAA ISD Lite database, already used in earlier studies. This compilation of observations from operational weather stations is the best publicly available dataset over the considered region.

We use 4 years of measurements made between 1 January 2010 and 31 December 2013. The 10 min averaged wind speeds are recorded every hour. We select the stations located in France that present a data availability greater than 97 % over the 4 years, with a minimum of 85 % for each month to ensure a good representation of the seasonal cycle. We also keep only the stations with the best precision in the measurements. We therefore keep 89 stations.

Calm winds correspond to wind speeds equal to zero in the dataset. They represent 5.2 % of the entire dataset, with disparities among the stations. The calm winds are removed before fitting the distributions since they are not taken into account in the Weibull distribution. The wind speeds are binned with intervals of about 0.514 m s$^{-1}$ because the data were originally recorded with bins of 1 knot. We add a small random noise to the wind-speed data in order to remove the effect of this sampling. The added noise follows a continuous uniform distribution between -0.5 and +0.5 knot (any resulting negative values are set to zero). We also tested a Gaussian noise with a standard deviation of 1/3 knot, which led to the same results.
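The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `dither_knot_bins`, the seed and the toy input are hypothetical, and the order of operations (calms removed first, noise added afterwards) is an assumption, since the text does not state it.

```python
import random

KNOT = 0.514444  # one knot expressed in m/s


def dither_knot_bins(speeds_ms, rng=None):
    """Remove calms, then spread 1-knot-binned speeds with +/-0.5 knot uniform noise."""
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    smoothed = []
    for w in speeds_ms:
        if w == 0.0:
            continue  # calm winds are discarded before any fit
        noisy = w + rng.uniform(-0.5, 0.5) * KNOT
        smoothed.append(max(noisy, 0.0))  # guard: negative values are set to zero
    return smoothed


# Toy series recorded on a 1-knot grid (hypothetical values).
raw = [0.0, 2 * KNOT, 2 * KNOT, 7 * KNOT, 12 * KNOT]
print(dither_knot_bins(raw))
```

With calms removed first, the smallest recorded speed is 1 knot, so the clipping to zero acts only as a safety guard in this sketch.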

Throughout the article the wind speed is denoted $w$ and the series of hourly observations is denoted $(w_i)_{i=1}^{n}$, where $n$ is the number of observations, which would be 35 064 for a complete set over the years 2010 to 2013, but is smaller due to the missing values and removed calms. In the following, the results are given for the 4 years of data but remain similar when limiting the data to 1 year.

Wind-speed statistic models

We use two distributions for the wind speed: the commonly used Weibull distribution and the Rayleigh–Rice distribution. We denote by $f$ their probability density functions (PDFs) and by $F$ their cumulative distribution functions (CDFs). We explain here the different fitting methods for each distribution.

Weibull distribution

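The Weibull distribution depends on the scale parameter $A>0$ and the shape parameter $k>0$. Its PDF and CDF expressions are

$$f_{\mathrm{wbl}}(w;A,k)=\frac{k}{A}\left(\frac{w}{A}\right)^{k-1}\exp\left[-\left(\frac{w}{A}\right)^{k}\right],$$

$$F_{\mathrm{wbl}}(w;A,k)=1-\exp\left[-\left(\frac{w}{A}\right)^{k}\right].$$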

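The Weibull distribution has simple expressions for its moments, such as the average wind $\overline{w}$ and the energy content $\overline{w^{3}}$:

$$\overline{w}=A\,\Gamma\!\left(1+\frac{1}{k}\right),$$

$$\overline{w^{3}}=A^{3}\,\Gamma\!\left(1+\frac{3}{k}\right),$$

where $\Gamma$ is the gamma function defined by $\Gamma(x)=\int_{0}^{\infty}e^{-t}\,t^{x-1}\,\mathrm{d}t$.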
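As a quick illustration of these expressions, the sketch below evaluates the Weibull PDF, CDF and the two moments with the Python standard library. The function names and the parameter values $A=6$ m s$^{-1}$, $k=2$ are illustrative choices, not values from the study.

```python
import math


def weibull_pdf(w, A, k):
    """Weibull probability density for w > 0 (scale A, shape k)."""
    return (k / A) * (w / A) ** (k - 1) * math.exp(-((w / A) ** k))


def weibull_cdf(w, A, k):
    """Weibull cumulative distribution function."""
    return 1.0 - math.exp(-((w / A) ** k))


def weibull_mean(A, k):
    """First moment: A * Gamma(1 + 1/k)."""
    return A * math.gamma(1.0 + 1.0 / k)


def weibull_cube_mean(A, k):
    """Third moment (energy content): A^3 * Gamma(1 + 3/k)."""
    return A ** 3 * math.gamma(1.0 + 3.0 / k)


A, k = 6.0, 2.0  # illustrative parameters only
print(weibull_mean(A, k), weibull_cube_mean(A, k), weibull_cdf(A, A, k))
```

For $k=2$ the mean reduces to $A\sqrt{\pi}/2$, which gives a simple check of the implementation.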

There are many ways of fitting the Weibull distribution to a set of observations, and extensive comparisons of some of the methods are available in the literature. In this article we compare the three methods that are expected to be the most used in the wind industry:

the maximum likelihood estimation (MLE), which maximizes the log likelihood function;

the method of moments using the first and third moments (M1 & M3), which solves the two moment equations for $\overline{w}$ and $\overline{w^{3}}$ given above;

the method used in WAsP using the third moment and the probability of winds above the empirical mean wind speed. In WAsP, the data are divided into several direction sectors and one distribution is fit for each sector. This is not the case here; there is no division according to the wind direction.

In each case, the first step is to iteratively solve a nonlinear equation for $k$ (the shape parameter) and, once $k$ is known, to compute the value for $A$ (the scale parameter) from a simple relation. The table below gives those equations, with the following notations: observed mean, $\widehat{w}=\frac{1}{n}\sum_{i=1}^{n}w_i$; observed energy content (third moment), $\widehat{w^{3}}=\frac{1}{n}\sum_{i=1}^{n}w_i^{3}$; observed raw moment of order $k$, $\widehat{w^{k}}=\frac{1}{n}\sum_{i=1}^{n}w_i^{k}$; observed probability of winds above the mean, $\widehat{p}=\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}_{\{w_i>\widehat{w}\}}$.

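Sets of equations used to fit the Weibull distribution for each method (see text for notations).

Method | Nonlinear equation to solve for $k$ | Equation for $A$

MLE | $\dfrac{\sum_{i=1}^{n}w_i^{k}\ln(w_i)}{\widehat{w^{k}}}-\sum_{i=1}^{n}\ln(w_i)-\dfrac{n}{k}=0$ | $A=\left(\widehat{w^{k}}\right)^{1/k}$

M1 & M3 | $\widehat{w}^{\,3}\,\Gamma\!\left(1+\dfrac{3}{k}\right)-\widehat{w^{3}}\,\Gamma^{3}\!\left(1+\dfrac{1}{k}\right)=0$ | $A=\left(\dfrac{\widehat{w^{3}}}{\Gamma\left(1+\frac{3}{k}\right)}\right)^{1/3}$

WAsP | $\ln\left(-\ln\widehat{p}\right)-k\left[\ln(\widehat{w})-\dfrac{1}{3}\ln\left(\widehat{w^{3}}\right)+\dfrac{1}{3}\ln\Gamma\!\left(1+\dfrac{3}{k}\right)\right]=0$ | same as for M1 & M3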
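The three fitting procedures in the table can be sketched with the standard library alone. This is an illustrative implementation under stated assumptions, not the authors' or the WAsP code: the nonlinear equations are solved by plain bisection on an assumed bracket $0.2\le k\le 20$, the function names are hypothetical, and the test sample is synthetic (drawn from a Weibull distribution with $A=7$, $k=2$).

```python
import math
import random

K_MIN, K_MAX = 0.2, 20.0  # assumed search bracket for the shape parameter


def bisect(func, lo, hi, tol=1e-10):
    """Root of func on [lo, hi], assuming a single sign change in the bracket."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scale_from_third_moment(w3_hat, k):
    """A = (w3_hat / Gamma(1 + 3/k))^(1/3), shared by M1 & M3 and WAsP."""
    return (w3_hat / math.gamma(1.0 + 3.0 / k)) ** (1.0 / 3.0)


def fit_mle(w):
    n = len(w)
    sum_log = sum(math.log(x) for x in w)

    def equation(k):
        powers = [x ** k for x in w]
        weighted = sum(p * math.log(x) for p, x in zip(powers, w))
        return weighted / (sum(powers) / n) - sum_log - n / k

    k = bisect(equation, K_MIN, K_MAX)
    return (sum(x ** k for x in w) / n) ** (1.0 / k), k


def fit_moments(w):
    n = len(w)
    w_hat = sum(w) / n
    w3_hat = sum(x ** 3 for x in w) / n

    def equation(k):
        return (w_hat ** 3 * math.gamma(1.0 + 3.0 / k)
                - w3_hat * math.gamma(1.0 + 1.0 / k) ** 3)

    k = bisect(equation, K_MIN, K_MAX)
    return scale_from_third_moment(w3_hat, k), k


def fit_wasp(w):
    n = len(w)
    w_hat = sum(w) / n
    w3_hat = sum(x ** 3 for x in w) / n
    p_hat = sum(1 for x in w if x > w_hat) / n  # observed P(w > mean)

    def equation(k):
        log_scale = (math.log(w3_hat) - math.lgamma(1.0 + 3.0 / k)) / 3.0
        return math.log(-math.log(p_hat)) - k * (math.log(w_hat) - log_scale)

    k = bisect(equation, K_MIN, K_MAX)
    return scale_from_third_moment(w3_hat, k), k


# Synthetic sample, used only to exercise the three fits.
rng = random.Random(42)
sample = [x for x in (rng.weibullvariate(7.0, 2.0) for _ in range(20000)) if x > 0.0]
for name, fit in [("MLE", fit_mle), ("M1 & M3", fit_moments), ("WAsP", fit_wasp)]:
    A_fit, k_fit = fit(sample)
    print(f"{name}: A = {A_fit:.2f} m/s, k = {k_fit:.2f}")
```

On data that really follow a Weibull distribution the three methods should agree closely; the differences discussed in this article arise when the observed distribution departs from the Weibull shape.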

Rayleigh–Rice distribution

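The Rayleigh–Rice distribution is a mixture of a Rayleigh distribution (parameter $\sigma_1^{2}$) and a Rice distribution (parameters $\mu\ge 0$ and $\sigma_2^{2}$) weighted by a parameter $\alpha$ ($0\le\alpha\le 1$). Its PDF expression is

$$f_{\mathrm{rr}}(w;\alpha,\sigma_1^{2},\mu,\sigma_2^{2})=\alpha\,\frac{w}{\sigma_2^{2}}\exp\left(-\frac{w^{2}+\mu^{2}}{2\sigma_2^{2}}\right)I_0\!\left(\frac{w\mu}{\sigma_2^{2}}\right)+(1-\alpha)\,\frac{w}{\sigma_1^{2}}\exp\left(-\frac{w^{2}}{2\sigma_1^{2}}\right),$$

where $I_0$ is the modified Bessel function of the first kind and zero order. There is no simple analytic expression for the CDF; therefore, $F_{\mathrm{rr}}$ is computed by numerical integration of $f_{\mathrm{rr}}$ for a given set of parameters.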
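A minimal sketch of this PDF and of a numerically integrated CDF is given below. It is an illustration under assumptions: $I_0$ is evaluated from its power series (adequate only for moderate arguments), the CDF uses a plain trapezoidal rule with an arbitrary number of steps, and the function names and parameter values are hypothetical.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I0 from its power series (moderate x only)."""
    q = 0.25 * x * x
    term, total, m = 1.0, 1.0, 0
    while term > 1e-17 * total:
        m += 1
        term *= q / (m * m)
        total += term
    return total


def rayleigh_rice_pdf(w, alpha, s1sq, mu, s2sq):
    """Mixture PDF: weight alpha on the Rice part, 1 - alpha on the Rayleigh part."""
    rice = ((w / s2sq) * math.exp(-(w * w + mu * mu) / (2.0 * s2sq))
            * bessel_i0(w * mu / s2sq))
    rayleigh = (w / s1sq) * math.exp(-w * w / (2.0 * s1sq))
    return alpha * rice + (1.0 - alpha) * rayleigh


def rayleigh_rice_cdf(w, alpha, s1sq, mu, s2sq, steps=2000):
    """CDF by trapezoidal integration of the PDF from 0 to w."""
    h = w / steps
    total = 0.5 * (rayleigh_rice_pdf(0.0, alpha, s1sq, mu, s2sq)
                   + rayleigh_rice_pdf(w, alpha, s1sq, mu, s2sq))
    for j in range(1, steps):
        total += rayleigh_rice_pdf(j * h, alpha, s1sq, mu, s2sq)
    return total * h


params = (0.4, 4.0, 8.0, 4.0)  # alpha, sigma1^2, mu, sigma2^2 (illustrative)
print(rayleigh_rice_cdf(10.0, *params))
```

With $\mu=0$ and $\sigma_1^{2}=\sigma_2^{2}$ the mixture collapses to a single Rayleigh distribution, which is a convenient sanity check.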

The Rayleigh–Rice distribution is fitted by minimizing the right-tail Anderson–Darling statistic $R_n^{2}$, calculated by

$$R_n^{2}=\frac{n}{2}-2\sum_{i=1}^{n}z_i-\frac{1}{n}\sum_{i=1}^{n}(2i-1)\ln\left(1-z_{n+1-i}\right),$$

where $z_i=F(w_i)$ for the series of observations $(w_i)_{i=1}^{n}$ sorted so that $w_1\le\dots\le w_n$.
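The statistic can be computed directly from the sorted CDF values. The sketch below is a hedged illustration: the function name is hypothetical, and the two test inputs are artificial sequences of $z_i$ (evenly spaced values for a near-perfect fit, squared values for a deliberately poor one) rather than wind data.

```python
import math


def right_tail_anderson_darling(sorted_z):
    """R_n^2 from CDF values z_i = F(w_i) of observations sorted in ascending order."""
    n = len(sorted_z)
    total = n / 2.0 - 2.0 * sum(sorted_z)
    for i in range(1, n + 1):
        # z_{n+1-i} in 1-based notation is sorted_z[n - i] in 0-based indexing
        total -= (2 * i - 1) * math.log(1.0 - sorted_z[n - i]) / n
    return total


n = 1000
good_fit = [(i - 0.5) / n for i in range(1, n + 1)]  # z_i evenly spread on (0, 1)
poor_fit = [z * z for z in good_fit]                 # CDF systematically too low
print(right_tail_anderson_darling(good_fit), right_tail_anderson_darling(poor_fit))
```

The statistic stays close to zero when the fitted CDF maps the observations evenly onto $(0,1)$ and grows with $n$ when it does not, with the logarithmic term giving extra weight to the right tail.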

The minimization of $R_n^{2}$ is carried out with a Nelder–Mead algorithm to find the best parameters. Convergence is somewhat difficult because there are four parameters, among which the $\alpha$ parameter has a nonlinear effect. To overcome this, we first fit the distribution for only three parameters with a fixed value of $\alpha$, repeat this for a series of different $\alpha$ values and choose the best of all fits. This best fit is then used as a first estimate for the four-parameter fit, which then converges rapidly.

Energy estimation

We study the available energy content E, i.e., the cube of the wind speed, and the production P, i.e., the energy yield from a wind turbine. E and P are computed for each distribution using their density function (probabilistic method) and compared to the reference value based on the series of observations without any statistical modeling (chronological method).

Power curve

In order to compute the power production, we use a power curve derived from a Vestas V90/2000 wind turbine, which has a 90 m diameter rotor and 2 MW nominal power. This model is one of the most common turbines in France as well as the rest of the world. The problem is that the hub height of such a wind turbine is typically 100 m whereas we use wind measurements at 10 m. Moreover, using a single power curve for all the stations is not possible because the stations have very different average winds; this would lead to an unrealistically large production at some stations and almost no production at others.

Therefore, we use a flexible power curve Pa(w) depending on a parameter a to adapt to the wind characteristics at each station. The initial power curve is transformed linearly so that it is equivalent to multiplying the wind speeds by the value of a. This can be seen as a vertical extrapolation of the surface wind. For example, a value a=1.35 corresponds to the coefficient that would be used in an extrapolation of the 10 m wind speeds to the altitude of 85 m using the one-seventh power law. As we normalize the power curve, it could also be seen as using a smaller wind turbine adapted to lower winds.
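The stretching of the power curve can be illustrated as follows. This is a sketch under explicit assumptions: the cut-in, rated and cut-out speeds and the cubic ramp are a hypothetical stand-in, not the actual V90/2000 curve, and the function names are illustrative. It also evaluates the one-seventh power law factor between 10 and 85 m, which comes out at about 1.36, close to the value $a=1.35$ quoted above.

```python
CUT_IN, RATED, CUT_OUT = 4.0, 12.0, 25.0  # hypothetical turbine speeds in m/s


def power_law_factor(z_from, z_to, exponent=1.0 / 7.0):
    """Wind-speed ratio between two heights under a power-law profile."""
    return (z_to / z_from) ** exponent


def base_power_curve(w):
    """Normalized power (0 to 1) of the stand-in turbine, with a cubic ramp."""
    if w < CUT_IN or w >= CUT_OUT:
        return 0.0
    if w >= RATED:
        return 1.0
    return (w ** 3 - CUT_IN ** 3) / (RATED ** 3 - CUT_IN ** 3)


def scaled_power_curve(w, a):
    """P_a(w): the base curve evaluated at the stretched wind speed a * w."""
    return base_power_curve(a * w)


print(power_law_factor(10.0, 85.0))
print(scaled_power_curve(6.0, 1.0), scaled_power_curve(6.0, 1.35))
```

Because $P_a(w)=P_1(aw)$, increasing $a$ shifts the cut-in, rated and cut-out speeds seen by the surface wind to lower values, which is what allows one curve to be adapted to stations with very different mean winds.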

The initial V90/2000 power curve is drawn in red in the power-curve figure, together with a modified one with parameter $a=1.35$ (in dashed blue). At each station, the $a$ parameter is adjusted so that the capacity factor of the turbine reaches 30 %. The real production would be a little lower since we removed the calm winds, and because we use an ideal power curve and do not account for any losses in the production process.

Example of the power curves used to compute the power production $P$. In red ($P_1$) the power curve of a Vestas V90/2000 wind turbine normalized by its rated power (2 MW) is shown. In dashed blue ($P_{1.35}$) the power curve adapted from the previous one by a linear transformation of factor $a=1.35$ is shown.

Maps of ΔE, i.e., the relative errors in wind energy content due to the statistical modeling, at the 89 stations for either the Weibull distribution fit by maximum likelihood (a) or the Rayleigh–Rice distribution (b).

Energy

The reference energy computed from the series of observations $(w_i)_{i=1}^{n}$ is

$$E_{\mathrm{ref}}=\frac{1}{n}\sum_{i=1}^{n}w_i^{3}.$$

The energy computed from $f$, the PDF of a distribution fitted to the observations, is

$$E=\int_{0}^{\infty}f(w)\,w^{3}\,\mathrm{d}w.$$

The error in energy of the probabilistic method relative to the chronological method is

$$\Delta E=\frac{E-E_{\mathrm{ref}}}{E_{\mathrm{ref}}}.$$
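These three quantities translate into a few lines of code. The sketch below is illustrative only: the integration bound of 60 m s$^{-1}$, the midpoint rule, the function names and the toy numbers are assumptions made for the example.

```python
import math


def energy_reference(speeds):
    """Chronological method: mean of the cubed observed wind speeds."""
    return sum(w ** 3 for w in speeds) / len(speeds)


def energy_from_pdf(pdf, upper=60.0, steps=6000):
    """Probabilistic method: integral of f(w) * w^3 by the midpoint rule."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        w = (j + 0.5) * h  # midpoints avoid evaluating the PDF at w = 0
        total += pdf(w) * w ** 3
    return total * h


def relative_error(value, reference):
    """Relative error of a modeled quantity with respect to its reference."""
    return (value - reference) / reference


def weibull_pdf(w, A=7.0, k=2.0):  # illustrative parameters
    return (k / A) * (w / A) ** (k - 1) * math.exp(-((w / A) ** k))


toy_series = [3.0, 5.0, 6.5, 8.0, 11.0]  # hypothetical observations in m/s
delta_e = relative_error(energy_from_pdf(weibull_pdf), energy_reference(toy_series))
print(f"Delta E = {100 * delta_e:.1f} %")
```

For a Weibull PDF the integral can be checked against the closed form $A^{3}\Gamma(1+3/k)$.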

Maps of ΔP, i.e., the relative errors in wind power production due to the statistical modeling, at the 89 stations for either the Weibull distribution fit by the maximum likelihood method (a), the first and third moments method (b), and the WAsP method (c), or the Rayleigh–Rice distribution (d).

Production

Based on the power curve $P_a$, the reference production is

$$P_{\mathrm{ref}}=\frac{1}{n}\sum_{i=1}^{n}P_a(w_i).$$

This is the mean power output in watts but since we normalized $P_a$ by its nominal power, this expression actually gives the mean capacity factor. The $a$ parameter of the power curve is adjusted so that the capacity factor reaches 30 %. This value of $a$ is then used to compute $P$ from the four distributions.

The average power computed with the probabilistic method from the PDF $f$ is

$$P=\int_{0}^{\infty}f(w)\,P_a(w)\,\mathrm{d}w.$$

The error in production of the probabilistic method relative to the chronological method is

$$\Delta P=\frac{P-P_{\mathrm{ref}}}{P_{\mathrm{ref}}}.$$
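The calibration of $a$ and the resulting production error can be sketched as below. Everything specific here is an assumption made for illustration: the stand-in power curve (cubic ramp, hypothetical cut-in, rated and cut-out speeds), the bisection bracket for $a$, the synthetic Weibull sample and the function names. The sketch is not the authors' implementation.

```python
import math
import random

CUT_IN, RATED, CUT_OUT = 4.0, 12.0, 25.0  # hypothetical turbine speeds in m/s


def base_curve(w):
    """Normalized stand-in power curve with a cubic ramp."""
    if w < CUT_IN or w >= CUT_OUT:
        return 0.0
    if w >= RATED:
        return 1.0
    return (w ** 3 - CUT_IN ** 3) / (RATED ** 3 - CUT_IN ** 3)


def capacity_factor(speeds, a):
    """P_ref for the scaled curve P_a(w) = P_1(a * w)."""
    return sum(base_curve(a * w) for w in speeds) / len(speeds)


def calibrate_a(speeds, target=0.30, lo=0.1, hi=3.0, tol=1e-9):
    """Bisection on a, assuming the target lies between the bracket values."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if capacity_factor(speeds, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def production_from_pdf(pdf, a, upper=60.0, steps=6000):
    """Probabilistic method: integral of f(w) * P_a(w) by the midpoint rule."""
    h = upper / steps
    return sum(pdf((j + 0.5) * h) * base_curve(a * (j + 0.5) * h)
               for j in range(steps)) * h


def weibull_pdf(w, A=5.0, k=2.0):  # same parameters as the synthetic sample
    return (k / A) * (w / A) ** (k - 1) * math.exp(-((w / A) ** k))


rng = random.Random(1)
speeds = [rng.weibullvariate(5.0, 2.0) for _ in range(20000)]  # synthetic data
a = calibrate_a(speeds)
p_ref = capacity_factor(speeds, a)
delta_p = (production_from_pdf(weibull_pdf, a) - p_ref) / p_ref
print(f"a = {a:.3f}, P_ref = {p_ref:.3f}, Delta P = {100 * delta_p:.2f} %")
```

Here the PDF is the true distribution of the synthetic sample, so the residual $\Delta P$ only reflects sampling noise; with real observations it would measure the error introduced by the statistical model.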

Results

The errors in the energy assessment are computed for four distributions: the Weibull distribution fitted by the three different methods and the Rayleigh–Rice distribution.

Energy

In terms of energy content $E$, both the method of moments and the WAsP method for fitting the Weibull distribution make no error since the energy content is the third moment $\overline{w^{3}}$, which is fixed to the observed energy content when solving for the $A$ and $k$ parameters. For the other distributions, the results are shown in the maps of $\Delta E$. With the Weibull distribution fitted by maximum likelihood (panel a), the errors are low in the northeastern region but much larger in the southern region. The absolute errors are above 5 % at 32 stations and above 10 % at 10 stations (the maximum being 29 %). The energy is almost always underestimated, except in some places in the valleys of the southeastern region. With the Rayleigh–Rice distribution (panel b), the errors are in general very low, with some exceptions. The absolute errors are below 3 % at 80 % of the stations; only 10 stations are above 5 % (including 2 above 10 %).

Representation of the energy and production calculations at Melun station (2010–2013). (a) Histogram of the observed wind-speed series and probability density function $f(w)$ of the fitted distributions (Weibull fit by MLE, moments, or WAsP methods and Rayleigh–Rice fit). (b) Wind energy content as a function of wind speed $w$ (i.e., $f(w)\,w^{3}$) for each distribution and associated histogram for the observations. (c) Power curve $P_a$ adapted to the station in order to have a capacity factor of 30 %. (d) Wind power output as a function of wind speed (i.e., $f(w)\,P_a(w)$) for each distribution and associated histogram for the observations.

Production

When it comes to the power output $P$, errors arise for all four cases, even when there was no error in the energy content. Indeed, the fact that $\Delta E=0$ does not mean that the observed distribution is well fitted by the Weibull distribution. There may be positive and negative errors balancing one another when integrating over the whole distribution. Since the power curve is not a linear function of $w^{3}$, these errors no longer balance in the power calculations and this may lead to large values of $\Delta P$.
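The argument can be written compactly as follows. The symbol $f_{\mathrm{obs}}$ is a notation introduced here, not in the original text, for the empirical density of the observed series, so that averages over the observations become integrals against $f_{\mathrm{obs}}$.

```latex
% f_obs: empirical density of the observations (notation assumed for this sketch)
\Delta E \, E_{\mathrm{ref}}
  = \int_{0}^{\infty} \bigl[ f(w) - f_{\mathrm{obs}}(w) \bigr] \, w^{3} \, \mathrm{d}w ,
\qquad
\Delta P \, P_{\mathrm{ref}}
  = \int_{0}^{\infty} \bigl[ f(w) - f_{\mathrm{obs}}(w) \bigr] \, P_a(w) \, \mathrm{d}w .
```

The same misfit $f-f_{\mathrm{obs}}$ is weighted by $w^{3}$ in the first integral and by $P_a(w)$ in the second. Positive and negative parts that cancel under the $w^{3}$ weight need not cancel under $P_a$, which is zero below cut-in and constant above the rated wind speed.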

For the Weibull distribution fitted by maximum likelihood (panel a of the maps of $\Delta P$), $\Delta P$ is of the same order of magnitude as $\Delta E$ (panel a of the maps of $\Delta E$) but of opposite sign. The mean absolute error (MAE) is 5.2 %, two-thirds of the stations have an absolute error above 3 % and the maximum error is 32 %. With the method of moments (panel b), the errors are similar to those for MLE; the spatial pattern is the same and the values are just slightly lower (MAE of 4.3 %, maximum error of 17 %). By contrast, with the WAsP method (panel c) the errors are small: the MAE is 1.7 %, only 13 stations have an absolute error above 3 % and the maximum error is 9 %. Finally, with the Rayleigh–Rice distribution (panel d), we find small errors everywhere, with a slight bias towards overestimation: the errors range from 0.2 to 3.1 %, with an average of 1.4 %.

Sensitivity of the results

The figures above are given for the whole dataset (years 2010 to 2013). The results are similar when limiting the wind series to only 1 year, with only small differences due to the interannual wind variability. For the computation of the production, the power curve has an important role. We tested other shapes, among which were simpler power curves with a ramp between cut-in and rated wind speed as a linear function of either $w$ or $w^{3}$. They all led to similar results. Choosing a different capacity factor, such as 25 or 35 %, to adjust the $a$ parameter of the power curve also has a small impact. The errors tend to be smaller when using a larger capacity factor, except in the case of the MLE method where the errors tend to increase. When the capacity factors are much lower, the errors of the Weibull MLE switch to negative values. This could represent what happens in seasons with low wind. It is consistent with earlier findings in the literature, where underprediction of the production mainly appears in the least windy locations and months.

We also tested the sensitivity of the results to the sampling of the wind data. We added a small random noise to the wind data because the speeds were binned with a 1-knot interval. When fitting the distributions to the raw sampled data instead of the smoothed data, the results are very similar except for the WAsP method. In that case, fitting the Weibull to the raw data leads to large negative or positive errors (MAE around 5 %), without any spatial coherence. It is unclear why.

Discussion

Examples at two stations

To better understand these results, we first focus on two example stations: Melun, which is located in northern central France, near Paris (48.6° N, 2.7° E), and Orange, which is located in the Rhône River valley between the Alps and the Massif Central Mountains, close to the Mediterranean Sea (44.1° N, 4.8° E). The wind-speed histograms at these two stations are shown in panel (a) of the Melun and Orange figures, respectively, as well as the PDF of the Weibull distributions (fitted by the three different methods) and of the Rayleigh–Rice distribution. At Melun the histogram is more peaked than can be modeled by a Weibull distribution. This is a common behavior at most stations, which is very pronounced at some locations in southern France. At Orange, the distribution is bimodal; there are two peaks and this cannot be modeled accurately by the unimodal Weibull distribution, but can be modeled by the more flexible Rayleigh–Rice distribution. This type of bimodal distribution is found at several locations in the southern valleys of France.

Representation of the energy and production calculations at Orange, representative of a bimodal case. Same description as for the Melun figure.

The computation of energy is very sensitive to the adjustment of the right tail of the distribution since the very high winds, once cubed, have an important weight despite their low frequency. The Weibull distribution, especially the maximum likelihood fit, tends to underestimate the frequency of these very high winds and therefore underestimate $E$. This phenomenon is visible in panel (b) of the Melun figure in the range 9–15 m s$^{-1}$. Most stations, especially in the southern part of France, present such an underestimation of the very high winds by the Weibull distribution, with different magnitudes.

At some other locations, such as at Orange, the wind-speed distribution has two peaks. In that case, the Weibull distribution, which cannot model two peaks, passes through both: it underestimates the wind frequency at the two peaks but overestimates the winds in between the two peaks and the very high winds beyond the second peak. At Orange, we can see that the Weibull distribution, whatever the fitting method, overestimates the probability of winds above 15 m s$^{-1}$ (panel a of the Orange figure) and therefore their contribution to the energy (panel b). With the moments and WAsP methods, this overestimation is smaller and balanced by the underestimation in the range 8–15 m s$^{-1}$ (corresponding to the second peak). In the case of the MLE method, it is not completely balanced and it leads to an overestimation of the energy by more than 10 %.

When it comes to the estimation of the energy yield from a wind turbine, the very high winds are not so important since the power output is constant between the rated wind speed and the cutout wind speed of the wind turbine. The power curves used at Melun and Orange are drawn in panel (c) of the respective figures, and the power output is shown in panel (d). Nevertheless, an underestimation of the very high winds is associated with an overestimation of winds around the rated wind speed, which have the largest contribution to the production. This is why we get opposite errors in $E$ and $P$ with the Weibull fit by MLE. With the WAsP method, the Weibull fit is better adjusted to the high winds: we can see clearly in the energy-content panel (b) that there is less underestimation for the winds above 8 m s$^{-1}$ and less overestimation for the winds below 8 m s$^{-1}$ than with the two other methods. Therefore, the errors in $P$ are much reduced in most cases.

In the bimodal cases, the overestimation of the very high winds is associated with an underestimation of middle-high winds and therefore an underestimation of the energy yield. This is particularly critical at Orange, where the winds with the largest weight in the production are exactly around the second peak, largely underestimated by the Weibull distribution. Finding an underestimation of the production in a bimodal case is consistent with the literature, such as the Mexican case and at least some of the largest underestimations found for the Spanish wind farms mentioned in the introduction (where the wind histograms are only shown for some cases).

Importance of the goodness of fit between the observed and theoretical distributions

The fact that modeling the wind-speed statistics using the Weibull distribution introduces errors in the energy estimate can be related to the poor fit of this distribution to the tail of the observations, i.e., the high wind speeds, which contribute a lot to the energy. It has been shown previously that the Weibull distribution does not fit the tail well, whereas the Rayleigh–Rice distribution agrees well with the observations in the tail. Indeed, here the few locations where we find large errors $\Delta E$ with the Rayleigh–Rice distribution (see panel b of the maps of $\Delta E$) are the stations where the very high winds are not well fitted by the Rayleigh–Rice distribution.

The errors in the production are also related to the goodness of fit between the distributions and the observations, which can be measured for example by the right-tail Anderson–Darling statistic $R_n^{2}$. The corresponding scatter plot shows $\Delta P$ (in absolute value) as a function of $R_n^{2}$. With the Weibull distribution fitted by maximum likelihood (squares) and by the method of moments (circles), $|\Delta P|$ values are highly correlated with $R_n^{2}$ (Pearson correlation coefficient of 0.9). This is in line with earlier findings that the error in the production decreases when the quality of fit of the distributions increases.

With the WAsP method (diamonds), the relation is weaker (correlation of 0.36) because this method does not necessarily give a very good fit to the whole distribution (thus $R_n^{2}$ values may be higher than for the MLE method) but favors a better fit to the range of winds that are important for the energy production (thus $\Delta P$ values are lowered).

Relative error in the wind power production, $\Delta P$ (in absolute value), as a function of the right-tail Anderson–Darling score, $R_n^{2}$, used as a goodness-of-fit estimate for the Weibull distribution. The distribution is fit by the maximum likelihood (squares), moments (circles) and WAsP (diamonds) methods at each station. Logarithmic scales.

Conclusions

In this article we investigated the errors in the wind resource assessment that could result from the use of a statistical model, especially with the commonly used Weibull distribution. We showed the importance of evaluating the errors in the production instead of the energy (i.e., the cubed wind speed), as is mostly done in the literature. Indeed, a perfect fit to the energy does not guarantee that the distribution really fits the observed data. It may come from the cancellation of opposite errors and may lead to large errors in the production due to the nonlinear effect of the power curve. Furthermore, the energy content is not a good indicator of the goodness of fit of a distribution because it puts too much weight on the tail of the wind-speed distribution, which actually contributes very little or not at all to the energy production due to the shape of the power curves.

We found large errors in the production (MAE around 4 or 5 %) when modeling the wind-speed statistics using a Weibull distribution fit with either maximum likelihood estimation or the first and third moments method. We found lower errors with the WAsP method, which is reassuring for the wind industry since WAsP is among the most commonly used software programs in wind resource assessment. Still, even this method may lead to large errors at some locations, so we advise against the use of the Weibull distribution. Apart from the WAsP method, the Weibull fits lead to an overestimation of the production at most locations in France. This bias could have contributed to the observed overestimation of the production. We also found larger errors in the areas closer to high topography, where the Weibull distribution is less suitable, and a tendency towards larger errors when the capacity factors are lower. As these two conditions correspond to the new areas targeted by the wind industry in France, they are further reasons not to use the Weibull distribution in the future.
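For reference, the maximum likelihood fit of the two-parameter Weibull distribution can be sketched as follows. This is a minimal illustration of the standard likelihood equations for complete samples, not the implementation used in the study. It assumes strictly positive wind speeds (calms would need separate handling), and the function name, bracketing interval and synthetic sample are hypothetical.

```python
import math
import random


def fit_weibull_mle(speeds, k_lo=0.1, k_hi=20.0, iterations=100):
    """Maximum likelihood estimate of the Weibull shape k and scale c.

    The shape solves  sum(x^k ln x) / sum(x^k) - 1/k - mean(ln x) = 0,
    whose left-hand side is increasing in k, so bisection is sufficient.
    Requires strictly positive speeds.
    """
    logs = [math.log(v) for v in speeds]
    n = len(logs)
    mean_log = sum(logs) / n
    max_log = max(logs)

    def score(k):
        # Weights x^k rescaled by the largest value to avoid overflow.
        w = [math.exp(k * (lg - max_log)) for lg in logs]
        return sum(wi * lg for wi, lg in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = k_lo, k_hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    c = (sum(v ** k for v in speeds) / n) ** (1.0 / k)
    return k, c


# Synthetic check: recover the parameters of a known Weibull law.
rng = random.Random(1)
sample = [rng.weibullvariate(7.0, 2.0) for _ in range(5000)]
k_hat, c_hat = fit_weibull_mle(sample)
print(f"k = {k_hat:.3f}, c = {c_hat:.3f} m/s")
```

Unlike the first and third moments method, this estimator does not constrain the mean cubed wind speed, so the fitted energy content generally differs from the observed one.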

The Rayleigh–Rice distribution shows very good skill at predicting the energy production at all locations. The fit is always very close to the observations over the whole distribution. Therefore, the errors are always very small, whichever part of the wind distribution contributes to the production. There is a slight overestimation, but it is not problematic since this bias is systematic and could be anticipated. However, even with such a good distribution, the very use of a statistical model is questionable. The Weibull distribution was very useful in the early days of the wind industry because it simplified the computations considerably. Today computers can handle very long wind series without any problem, limiting the need for any modeling when there are actual measurements. Indeed, the measurements are much more precise and contain much more information than any two- or four-parameter model.

This study benefits from the use of a large dataset, covering all regions of France. The drawback of this dataset is that it consists of surface measurements, often taken in areas not actually suited to wind project development. The results apply to all studies using surface measurements, and could be followed by more detailed studies at specific locations, using real wind project measurements at higher levels above ground. An open question is whether the wind distribution varies substantially with height and whether its shape is closer to the Weibull distribution higher up.

Wind data and information on the Integrated Surface Database are available at http://www.ncdc.noaa.gov/isd.

The authors declare that they have no conflict of interest.

Acknowledgements

Bénédicte Jourdier was funded by the French Environment and Energy Management Agency (ADEME) and GDF Suez. This research has received funding from ADEME through the MODEOL project (contract 1205C01467). The topical editor, V. Kotroni, thanks one anonymous referee for help in evaluating this paper.

References

Baïle, R., Muzy, J.-F., and Poggi, P.: An M-Rice wind speed frequency distribution, Wind Energy, 14, 735–748, https://doi.org/10.1002/we.454, 2011.

Carta, J. A., Ramírez, P., and Velázquez, S.: Influence of the level of fit of a density probability function to wind-speed data on the WECS mean power output estimation, Energ. Convers. Manage., 49, 2647–2655, https://doi.org/10.1016/j.enconman.2008.04.012, 2008.

Carta, J. A., Ramírez, P., and Velázquez, S.: A review of wind speed probability distributions used in wind energy analysis: Case studies in the Canary Islands, Renewable and Sustainable Energy Reviews, 13, 933–955, https://doi.org/10.1016/j.rser.2008.05.005, 2009.

Celik, A. N.: Energy output estimation for small-scale wind power generators using Weibull-representative wind data, J. Wind Eng. Ind. Aerod., 91, 693–707, https://doi.org/10.1016/S0167-6105(02)00471-3, 2003.

Chang, T.-J. and Tu, Y.-L.: Evaluation of monthly capacity factor of WECS using chronological and probabilistic wind speed data: A case study of Taiwan, Renew. Energ., 32, 1999–2010, https://doi.org/10.1016/j.renene.2006.10.010, 2007.

Chang, T. P.: Performance comparison of six numerical methods in estimating Weibull parameters for wind energy application, Appl. Energ., 88, 272–282, https://doi.org/10.1016/j.apenergy.2010.06.018, 2011.

Cohen, A. C.: Maximum Likelihood Estimation in the Weibull Distribution Based On Complete and On Censored Samples, Technometrics, 7, 579–588, https://doi.org/10.1080/00401706.1965.10490300, 1965.

Drobinski, P., Coulais, C., and Jourdier, B.: Surface wind-speed statistics modelling: alternatives to the Weibull distribution and performance evaluation, Bound.-Lay. Meteorol., https://doi.org/10.1007/s10546-015-0035-7, 2015.

EEA: Europe's onshore and offshore wind energy potential, Tech. Rep. 6/2009, European Environment Agency, 2009.

García-Bustamante, E., González-Rouco, J. F., Jiménez, P. A., Navarro, J., and Montávez, J. P.: The influence of the Weibull assumption in monthly wind energy estimation, Wind Energy, 11, 483–502, https://doi.org/10.1002/we.270, 2008.

Jaramillo, O. A. and Borja, M. A.: Wind speed analysis in La Ventosa, Mexico: a bimodal probability distribution case, Renew. Energ., 29, 1613–1630, https://doi.org/10.1016/j.renene.2004.02.001, 2004.

Luceño, A.: Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators, Computat. Stat. Data An., 51, 904–917, https://doi.org/10.1016/j.csda.2005.09.011, 2006.

Morgan, E. C., Lackner, M., Vogel, R. M., and Baise, L. G.: Probability distributions for offshore wind speeds, Energ. Convers. Manage., 52, 15–26, https://doi.org/10.1016/j.enconman.2010.06.015, 2011.

Mortensen, N. G., Landberg, L., Troen, I., and Lundtang Petersen, E.: Wind Analysis and Application Program (WASP), User's Guide – Report Risø-I-666 (EN), Risø National Laboratory, Roskilde, 1993.

Pryor, S. C., Nielsen, M., Barthelmie, R. J., and Mann, J.: Can Satellite Sampling of Offshore Wind Speeds Realistically Represent Wind Speed Distributions? Part II: Quantifying Uncertainties Associated with Distribution Fitting Methods, J. Appl. Meteorol., 43, 739–750, https://doi.org/10.1175/2096.1, 2004.

RTE, SER, ERDF, and ADEeF: Panorama de l'électricité renouvelable 2014, http://www.rte-france.com/sites/default/files/panorama_des_energies_renouvelables_2014.pdf (last access: 29 May 2017), 2015.

Sinclair, C., Spurr, B., and Ahmad, M.: Modified Anderson-Darling test, Commun. Stat. A-Theor., 19, 3677–3686, https://doi.org/10.1080/03610929008830405, 1990.

Smith, A., Lott, N., and Vose, R.: The Integrated Surface Database: Recent Developments and Partnerships, B. Am. Meteorol. Soc., 92, 704–708, https://doi.org/10.1175/2011BAMS3015.1, 2011.

Vautard, R., Cattiaux, J., Yiou, P., Thépaut, J.-N., and Ciais, P.: Northern Hemisphere atmospheric stilling partly attributed to an increase in surface roughness, Nat. Geosci., 3, 756–761, https://doi.org/10.1038/ngeo979, 2010.
