The methodology used in wind resource assessments often relies on modeling the wind-speed statistics using a Weibull distribution. In spite of its common use, this distribution has been shown not to model real wind-speed distributions accurately in all cases. Very few studies have examined the resulting errors in power outputs, using either observed power productions or theoretical power curves. This article focuses on France, using surface wind measurements at 89 locations covering all regions of the country. It investigates how statistical modeling using a Weibull distribution impacts the prediction of the wind energy content and of the power output in the context of an annual energy production assessment. For this purpose, it uses a plausible power curve adapted to each location. Three common methods for fitting the Weibull distribution are tested (maximum likelihood, first and third moments, and the Wind Atlas Analysis and Application Program (WAsP) method). The first two methods generate large errors in the production (mean absolute error around 5 %), especially in the southern areas where the goodness of fit of the Weibull distribution is poorer. The production is mainly overestimated, except at some locations with bimodal wind distributions. With the third method, the errors are much lower at most locations (mean absolute error around 2 %). Another distribution, a mixed Rayleigh–Rice distribution, is also tested and shows better skill at assessing the wind energy yield.

France has one of the largest wind energy potentials in Europe

As the wind industry developed, it was observed that wind farms often produced less than expected, thus jeopardizing the projects' profitability and undermining the whole industry. This led to concerns about the overprediction of energy production. Aside from the errors coming from turbine performance or availability, and from the natural variability in wind, questions were raised about the methodology used to evaluate the wind energy yield. One point of the methodology, questioned in this study, is the common use of the Weibull distribution to model wind-speed statistics instead of directly using the wind-speed series measured at the investigated location. The Weibull distribution has become a widely used standard in wind energy applications due to its simplicity. It depends on two easily estimated parameters and there are simple analytic expressions for the moments. It is a reference used in wind energy software such as the Wind Atlas Analysis and Application Program (WAsP), and it is included in regulations such as the IEC 61400-12 on wind turbine power performance testing. Many other aspects of the resource assessment methodology can lead to poor estimates, such as the wind measurement accuracy, the vertical extrapolation of wind measurements, their temporal extrapolation with measure–correlate–predict techniques, the wind flow and wake modeling, or the use of inaccurate power curves. Each issue is complex enough to be the subject of many dedicated research articles, but this paper aims to evaluate the impact of one specific step: modeling the wind-speed statistics with a Weibull distribution.

Very few studies have compared the production of real wind farms to the estimated
production using either the series of wind measurements (called the
chronological method) or the wind statistics from a Weibull distribution fit
to that series (probabilistic method). Despite the widespread use of the Weibull
distribution, it has been shown to not accurately model all kinds of observed
distributions

The present study investigates the errors made in evaluating the annual
energy production when assuming that the wind-speed statistics follow a
Weibull distribution. It compares different ways of fitting the Weibull
distribution, which is essential because, as we show, the different
fitting methods lead to very different results. Many articles
compare fitting methods for the Weibull distribution, but none of
them quantifies the resulting errors in the power output. Moreover, we include the
WAsP fitting method, which is by far the most used in the industry but
almost never studied in articles. To our knowledge, this method is only
referred to in

A limitation of most studies evaluating the Weibull and other distributions for wind energy applications is that they stop at the computation of energy or use inappropriate power curves. Since the relation between the available energy in the wind and the actual production of a wind turbine is strongly nonlinear, a good fit for the energy does not guarantee a good estimate of the production. To address this issue, this paper develops a methodology to compute the production at any location using a realistic power curve, even when using surface measurements.

Another limitation of most of the studies cited above is that they cover very
small numbers of locations. To address this issue, the present study is based
on a large wind dataset of 89 weather stations, covering different
sub-climatic regions of France. This reveals some systematic
behaviors: systematic over- or underestimations depending on the wind
characteristics at the location and the fitting method. It also
highlights the link between the goodness of fit of the distribution and the
production estimate errors

Section

In this article we use surface wind measurements (10 m a.g.l.) from the
global NOAA ISD Lite database

We use 4 years of measurements made between 1 January 2010 and
31 December 2013. The 10 min averaged wind speeds are recorded every hour.
We select the stations located in France that present a data availability
greater than

Calm winds correspond to wind speeds equal to zero in the dataset. They
represent

Throughout the article the wind speed is noted as

We use two distributions for the wind speed: the commonly used Weibull
distribution and the Rayleigh–Rice distribution, defined in

The Weibull distribution depends on the scale parameter

The Weibull distribution has simple expressions for its moments, such as the
average wind

There are many ways of fitting the Weibull distribution to a set of
observations; see for example

the maximum likelihood estimation (MLE), which maximizes the log likelihood function

the method of moments using the first and third moments (

the method used in WAsP

Sets of equations used to fit the Weibull distribution for each method (see text for notations).
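The three fitting approaches listed above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the bisection bounds and the synthetic sample are hypothetical. The WAsP-style criterion coded here (matching the third moment and the frequency of wind speeds above the observed mean) is an assumption based on the commonly described form of that method and should be checked against the equations of the table.

```python
import math
import random


def bisect(f, lo, hi, iters=60):
    """Bisection root search; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_mle(speeds):
    """Maximum likelihood estimate of (A, k); needs strictly positive speeds."""
    n = len(speeds)
    logs = [math.log(u) for u in speeds]
    mean_log = sum(logs) / n
    u_max = max(speeds)

    def score(k):
        # Weights are rescaled by u_max to avoid overflow for large k.
        w = [(u / u_max) ** k for u in speeds]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    k = bisect(score, 0.2, 20.0)
    A = (sum(u ** k for u in speeds) / n) ** (1.0 / k)
    return A, k


def fit_moments(speeds):
    """Match the first and third moments of the sample."""
    n = len(speeds)
    m1 = sum(speeds) / n
    m3 = sum(u ** 3 for u in speeds) / n
    target = m3 / m1 ** 3

    def gap(k):
        return math.gamma(1 + 3 / k) / math.gamma(1 + 1 / k) ** 3 - target

    k = bisect(gap, 0.2, 20.0)
    return m1 / math.gamma(1 + 1 / k), k


def fit_wasp(speeds):
    """WAsP-style fit (assumed form): match the third moment and the
    observed frequency of speeds above the sample mean."""
    n = len(speeds)
    m1 = sum(speeds) / n
    m3 = sum(u ** 3 for u in speeds) / n
    frac_above = sum(1 for u in speeds if u > m1) / n

    def scale(k):
        return (m3 / math.gamma(1 + 3 / k)) ** (1.0 / 3.0)

    def gap(k):
        return math.exp(-((m1 / scale(k)) ** k)) - frac_above

    k = bisect(gap, 0.2, 20.0)
    return scale(k), k


if __name__ == "__main__":
    random.seed(1)
    # Synthetic Weibull sample (scale 7 m/s, shape 2), for illustration only.
    sample = [random.weibullvariate(7.0, 2.0) for _ in range(5000)]
    for fit in (fit_mle, fit_moments, fit_wasp):
        print(fit.__name__, fit(sample))
```

On a sample that really follows a Weibull distribution, the three functions should return similar parameters. The differences discussed in this article arise when the observed distribution departs from the Weibull shape.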

The Rayleigh–Rice distribution is a mixture of a Rayleigh distribution
(parameter

The Rayleigh–Rice distribution is fitted as in

The minimization of

We study the available energy content

In order to compute the power production, we use a power curve derived from a Vestas V90/2000 wind turbine, which has a 90 m diameter rotor and 2 MW nominal power. This model is one of the most common turbines in France as well as in the rest of the world. However, the hub height of such a wind turbine is typically 100 m, whereas we use wind measurements at 10 m. Moreover, using a single power curve for all the stations is not possible because the stations have very different average winds; this would lead to an unrealistically large production at some stations and almost no production at others.

Therefore, we use a flexible power curve

The initial V90/2000 power curve is drawn in Fig.

Example of the power curves used to compute the power production
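As an illustration only, the idea of adapting one turbine power curve to sites with very different mean winds can be sketched as follows. This is not the scaling used in this study and it does not use the V90/2000 data. The cut-in, rated and cut-out speeds, the cubic ramp, the reference mean wind `ref_mean` and both function names are hypothetical choices made for the example.

```python
def ramp_power_curve(u, cut_in=4.0, rated=12.0, cut_out=25.0, p_rated=2000.0):
    """Idealized power curve in kW (assumed shape, not manufacturer data).

    Zero below cut-in and from cut-out upward, cubic ramp between cut-in
    and rated speed, constant rated power between rated and cut-out.
    """
    if u < cut_in or u >= cut_out:
        return 0.0
    if u >= rated:
        return p_rated
    return p_rated * (u ** 3 - cut_in ** 3) / (rated ** 3 - cut_in ** 3)


def site_power_curve(u, site_mean, ref_mean=8.0):
    """Stretch the wind-speed axis so that the site mean wind plays the
    role of the (assumed) reference mean wind of the original curve."""
    return ramp_power_curve(u * ref_mean / site_mean)


if __name__ == "__main__":
    # A 3 m/s wind at a site with a 2 m/s mean maps to 12 m/s on the
    # reference curve, i.e. rated power.
    print(site_power_curve(3.0, site_mean=2.0))
```

With such a stretch, a low-wind station and a high-wind station both sample the ramp and the plateau of the curve, which avoids the unrealistic productions that a single fixed curve would give.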

Maps of

The reference energy computed from the series of observations

The energy computed from

The error in energy of the probabilistic method (Eq.

Maps of

Based on the power curve

The average power computed from the probabilistic method from the PDF

The error in production of the probabilistic method relative to the
chronological method is

The errors in the energy assessment are computed for four distributions: the Weibull distribution fitted by the three different methods and the Rayleigh–Rice distribution.
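The comparison between the chronological and the probabilistic estimates can be sketched in Python as follows. This is a hedged illustration, not the computation used for the figures: the air density value, the idealized power curve, the integration step and the function names are assumptions. Energy content is taken here as the mean power density, proportional to the mean cubed wind speed, so the air density cancels in the relative error.

```python
import math
import random

RHO = 1.225  # air density in kg m^-3 (assumed; cancels in relative errors)


def weibull_pdf(u, A, k):
    """Weibull probability density for u > 0 (scale A, shape k)."""
    return (k / A) * (u / A) ** (k - 1) * math.exp(-((u / A) ** k))


def power_curve(u, cut_in=4.0, rated=12.0, cut_out=25.0, p_rated=2000.0):
    """Idealized power curve in kW (hypothetical, not the V90/2000 curve)."""
    if u < cut_in or u >= cut_out:
        return 0.0
    if u >= rated:
        return p_rated
    return p_rated * (u ** 3 - cut_in ** 3) / (rated ** 3 - cut_in ** 3)


def relative_errors(speeds, A, k, du=0.01, u_max=40.0):
    """Relative errors of the probabilistic method (Weibull with A, k)
    with respect to the chronological method (observed series)."""
    n = len(speeds)
    # Chronological estimates: averages over the observed series.
    e_obs = 0.5 * RHO * sum(u ** 3 for u in speeds) / n
    p_obs = sum(power_curve(u) for u in speeds) / n
    # Probabilistic energy: analytic third moment of the Weibull.
    e_fit = 0.5 * RHO * A ** 3 * math.gamma(1 + 3 / k)
    # Probabilistic production: midpoint-rule integral of curve times PDF.
    steps = int(u_max / du)
    p_fit = du * sum(
        power_curve((i + 0.5) * du) * weibull_pdf((i + 0.5) * du, A, k)
        for i in range(steps)
    )
    return (e_fit - e_obs) / e_obs, (p_fit - p_obs) / p_obs


if __name__ == "__main__":
    random.seed(0)
    series = [random.weibullvariate(7.0, 2.0) for _ in range(20000)]
    err_energy, err_production = relative_errors(series, A=7.0, k=2.0)
    print(f"energy error {err_energy:+.3%}, production error {err_production:+.3%}")
```

For a synthetic series drawn from the same Weibull distribution, both errors reflect sampling noise only. With real series, the two errors can differ in size and sign because the power curve weights the wind speeds very differently from the cube.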

In terms of energy content

Representation of the energy and production calculations at Melun
station (2010–2013).

When it comes to the power output

For the Weibull distribution fitted by maximum likelihood
(Fig.

The figures above are given for the whole dataset (years 2010 to 2013). The
results are similar when limiting the wind series to only 1 year, with only
small differences due to the interannual wind variability. For the
computation of the production, the power curve has an important role. We
tested other shapes, among which were simpler power curves with a ramp between
cut-in and rated wind speed as a linear function of either

We also tested the sensitivity of the results to the sampling of the wind
data. We added a small random noise to the wind data because the speeds were
binned with a 1-knot interval. When fitting the distributions to the raw
sampled data instead of the smoothed data, the results are very similar
except for the WAsP method. In that case, fitting the Weibull to the raw data
leads to large negative or positive errors (MAE around
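The smoothing of the binned data can be illustrated with a short Python sketch. The exact noise used in this study is not specified here, so the uniform noise of half a bin on each side, the unchanged treatment of calms and the function name are all assumptions made for the example.

```python
import random

KNOT = 1852.0 / 3600.0  # one knot expressed in m/s


def dither_binned_speeds(speeds_ms, bin_width=KNOT, rng=None):
    """Spread speeds recorded on a regular grid over their bins.

    Adds uniform noise in [-bin_width/2, +bin_width/2] to each non-zero
    speed (assumed noise model). Calms (zeros) are left unchanged.
    """
    rng = rng or random.Random(0)
    smoothed = []
    for u in speeds_ms:
        if u <= 0.0:
            smoothed.append(0.0)
        else:
            noisy = u + rng.uniform(-0.5, 0.5) * bin_width
            smoothed.append(max(noisy, 0.0))
    return smoothed


if __name__ == "__main__":
    raw = [0.0, 1 * KNOT, 2 * KNOT, 10 * KNOT]
    print(dither_binned_speeds(raw))
```

A fitting criterion that depends on counting observations above a threshold, such as the frequency of winds above the mean, is the kind of quantity that can jump when many values sit exactly on bin centers, which is one plausible reason for the sensitivity noted above.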

To better understand these results, we first focus on two example stations:
Melun, which is located in northern central France, near Paris
(48.6

Representation of the energy and production calculations at Orange,
representative of a bimodal case. Same description as Fig.

The computation of energy is very sensitive to the adjustment of the right
tail of the distribution since the very high winds, once cubed, have an
important weight despite their low frequency. The Weibull distribution,
especially the maximum likelihood fit, tends to underestimate the frequency
of these very high winds and therefore underestimate

At some other locations, such as at Orange, the wind-speed distribution has
two peaks. In that case, the Weibull distribution, which cannot model two
peaks, passes through both: it underestimates the wind frequency at the two
peaks but overestimates the winds in between the two peaks and the very
high winds beyond the second peak. At Orange, we can see that the Weibull
distribution, whatever the fitting method, overestimates the probability of
winds above 15 m s

When it comes to the estimation of the energy yield from a wind turbine, the
very high winds are not so important since the power output is constant
between the rated wind speed and the cutout wind speed of the wind turbine.
The power curves used at Melun and Orange are drawn in Figs.

In the bimodal cases, the overestimation of the very high winds is
associated with an underestimation of middle-high winds and therefore an
underestimation of the energy yield. This is particularly critical at
Orange, where the winds with the largest weight in the production are
exactly around the second peak, largely underestimated by the Weibull
distribution. Finding an underestimation of the production in a bimodal case
is consistent with the literature, such as the Mexican case of

The fact that modeling the wind-speed statistics using the Weibull distribution
introduces errors in the energy estimate can be related to the poor fit of
this distribution to the tail of the observations, i.e., the high wind speeds,
which contribute a lot to the energy. It was shown in

The errors in the production are also related to the goodness of fit between
the distributions and the observations, which can be measured for example by
the right-tail Anderson–Darling statistics (

With the WAsP method (diamonds), the relation is weaker (correlation of

Relative error in the wind power production,

In this article we investigated the errors in the wind resource assessment that could result from the use of a statistical model, especially the commonly used Weibull distribution. We showed the importance of evaluating the errors in the production instead of the energy (i.e., the cubed wind speed), as is mostly done in the literature. Indeed, a perfect fit to the energy does not guarantee that the distribution really fits the observed data. It may come from the cancellation of opposite errors and may lead to large errors in the production due to the nonlinear effect of the power curve. Furthermore, the energy content is not a good indicator of the goodness of fit of a distribution because it puts too much weight on the tail of the wind-speed distribution, which actually contributes very little or not at all to the energy production due to the shape of the power curves.

We found large errors in the production (MAE around

The Rayleigh–Rice distribution shows very good skill at predicting the energy production at all locations. The fit is always very close to the observations over the whole distribution. Therefore, the errors remain very small whichever part of the wind distribution contributes to the production. There is a slight overestimation, but it is not problematic since this bias is systematic and could be anticipated. However, even with such a good distribution, the very use of a statistical model is questionable. The Weibull distribution was very useful in the early days of the wind industry because it simplified the computations considerably. Today, computers can handle very long wind series without any problem, limiting the need for any modeling when there are actual measurements. Indeed, the measurements are much more precise and contain much more information than any two- or four-parameter model.

This study benefits from the use of a large dataset, covering all regions of France. The drawback of this dataset is that it consists of surface measurements, often made in areas not actually suited to wind project development. The results apply to all studies using surface measurements, and could be followed by more detailed studies at specific locations, using real wind project measurements at higher levels above ground. An open question is whether the wind distribution varies substantially with height and whether its shape is closer to the Weibull distribution higher up.

Wind data and
information on the Integrated Surface Database are available at

The authors declare that they have no conflict of interest.

Bénédicte Jourdier was funded by the French Environment and Energy Management Agency (ADEME) and GDF Suez. This research has received funding from ADEME through the MODEOL project (contract 1205C01467). The topical editor, V. Kotroni, thanks the one anonymous referee for help in evaluating this paper.