Deterministic nature of the underlying dynamics of surface wind fluctuations

Modelling the fluctuations of the Earth’s surface wind plays a significant role in understanding the dynamics of the atmosphere, besides its impact on various fields ranging from agriculture to structural engineering. Most of the studies on the modelling and prediction of wind speed and power reported in the literature are based on statistical methods or the probabilistic distribution of the wind speed data. In this paper we investigate the suitability of a deterministic model to represent the wind speed fluctuations by employing tools of nonlinear dynamics. We have carried out a detailed nonlinear time series analysis of the daily mean wind speed data measured at Thiruvananthapuram (8.483° N, 76.950° E) from 2000 to 2010. The results of the analysis strongly suggest that the underlying dynamics is deterministic, low-dimensional and chaotic, suggesting the possibility of accurate short-term prediction. As most of the chaotic systems are confined to laboratories, this is another example of a naturally occurring time series showing chaotic behaviour.


Introduction
Surface wind plays a crucial role in the climate and weather system of the Earth. It has a significant impact on agriculture, navigation, structural engineering calculations and the reduction of atmospheric pollution, as well as on the economy of a region as an alternate energy source (Martín et al., 1999; Elliott, 2004; Bantaa et al., 2011; Finzi et al., 1984). The recent surge of interest in research related to wind power is due to its potential as an alternate source of energy, given the fast depletion of the natural resources of the Earth. Presently, there is extensive literature on various areas related to wind energy acquisition and utilisation, such as wind speed modelling and prediction, wind power production and wind resource quantification (e.g. Finzi et al., 1984; Martín et al., 1999; Elliott, 2004; Celik, 2004; von Bremen, 2007; Mabel and Fernandez, 2009; Kavasseri and Seetharaman, 2009; Bantaa et al., 2011).
Wind speed modelling and forecasting is an important aspect of wind power generation (yet one of the most difficult, owing to the myriad factors affecting it), and over the years many tools have been developed for this purpose. A good number of such tools rely on statistical methods, either moving average models such as ARMA and ARIMA fitted to the time series of wind speed (Kamal and Jafri, 1997; Torresa et al., 2005; Cadenas and Rivera, 2007; Kavasseri and Seetharaman, 2009) or models based on the probability distribution of wind speed (Hennessey, 1977; Celik, 2004; Mathew et al., 2011). Models based on artificial neural networks have also been developed by many authors for making short-term predictions of wind speed and generated wind power (Mohandes et al., 1998; Cadenas and Rivera, 2007; Bilgili et al., 2007; Monfared et al., 2009).
As a matter of fact, none of these forecasting methods, based on time series analysis or meteorological models, is capable of significantly reducing the prediction error compared to the elementary method of persistence (Sfetsos, 2002), and this is usually attributed to the high fluctuations and variability in wind speed. Exploring the source of these random fluctuations in wind speed, especially whether it is stochastic or deterministic, is therefore important both for understanding the nature of the dynamics and for improving the tools used for prediction. A few attempts have been made in this direction, although the focus was not on determining the nature of the dynamics but rather on comparing stochastic versus deterministic models constructed from time series of scalar measurements of wind speed. Palmer et al. (1995) have analysed several time series of wind components and X-band Doppler radar signals gathered over an area of ocean surface and have found the presence of a low-dimensional dynamical attractor in the case of time series of the horizontal wind speed as well as the vertically polarized radar reflectivity. They were also able to achieve better short-term predictions from the deterministic models than from statistical models. However, the analysis carried out on daily mean wind speed (DMWS) data by Ragwitz et al. (2000) suggests that, on average, no reduction of the prediction error can be achieved by using a nonlinear model instead of a linear stochastic model, although they could predict intermittent gusts with significantly higher accuracy. These studies are, however, limited by short time series (Hirata et al., 2008). Martín et al. (1999) have analysed the wind speed data by splitting them into deterministic and stochastic components. Their analysis shows that the deterministic component has 1-year, 24-h and 12-h periods. These cycles have also been observed by other authors in surface wind studies (Brett and Tuller, 1991; Gavaldá et al., 1992). The 1-year and 24-h periods are the natural Earth cycles. The 12-h period for wind speed series is well-defined and corresponds to the daytime and nighttime maxima due to the full development of the land-sea breezes. These periodicities present in the wind speed time series clearly show the presence of determinism in the data. However, it is not clear whether the apparent stochastic component is strictly stochastic or arises out of chaotic underlying dynamics.
In general, the predictability and degree of determinism of atmospheric parameters depend on the time scale considered (Palmer, 1993), and this applies in particular to the case of wind speed predictions. The various models for wind speed predictions reviewed above are mostly in time scales ranging from hours to a few days. What follows from this discussion is that wind is believed to have both deterministic and stochastic components in the time scales considered, but the manner in which these components interact is still elusive and a matter of debate. The rotation of Earth and solar heat radiation are two major causes of the surface wind in addition to the local topography. The Earth's revolution is clearly deterministic. However, many authors have argued that the solar radiation is stochastic in nature and hence that the underlying dynamics of the surface wind should be governed by both deterministic and stochastic factors. On the other hand, our previous analysis of the data of total electron content (TEC), which is strongly influenced by the solar radiation, shows strong evidence of a deterministic low-dimensional character of the underlying dynamics (George et al., 2002; Kumar et al., 2004). The surface wind speed is an atmospheric parameter similarly subject to solar influence, but its dynamics is further complicated by local conditions such as topography. Hence, it is worth investigating whether a stochastic or a deterministic model is most suitable for the underlying dynamics of surface wind fluctuations. In this work we carry out a detailed systematic analysis of the time series of daily mean wind speed (DMWS) measured at Thiruvananthapuram, Kerala, India (8.483° N, 76.950° E; elevation: 64 m) using tools of nonlinear dynamics for the period from year 2000 to 2010. Note that the length of the time series is about the length of a solar cycle. The data were obtained from the National Climatic Data Centre (http://www.ncdc.noaa.gov). We demonstrate, using the DMWS-data, that the dynamics of wind speed is essentially deterministic with a low-dimensional chaotic character. The chaotic behaviour is what makes the long-term predictions of wind speed erroneous, but it should be possible to obtain better short-term predictions using the deterministic model than would otherwise be made with the statistical methods. It is reported that short-term predictions of one to six hours ahead at intervals of 10 min are important in power dispatching systems (Mabel and Fernandez, 2009).
We assume that there is also a stochastic component in the data arising mainly from measurement and averaging errors. The averaging errors are a result of considering the mean wind speeds and not the actual wind speeds equidistant in time, as should be the case for a time series. The effects of these errors are assumed to contribute an additive noise to the data which is independent of the true deterministic dynamics of the system. Hence, the first step in our analysis is to remove the effect of this noise process using a suitable noise reduction technique to reveal the true dynamics behind the data. The denoised data still contain irregular persistent fluctuations, which upon analysis using tools of non-linear dynamics reveal many attributes of a chaotic system with a low-dimensional attractor. Since some of these attributes may also be found in linear stochastic processes, we further subject the denoised data to a detailed surrogate analysis to confirm that the underlying dynamics is indeed deterministic and could not be described by a linear Gaussian stochastic model. Most of these analyses were carried out using tools implemented in the TISEAN package (Hegger et al., 1999).

Time delay coordinates and attractor reconstruction
The time series of DMWS is plotted in Fig. 1, which shows that the wind speed exhibits persistent temporal fluctuations. The underlying mechanism giving rise to these irregular fluctuations could be either a stochastic system or a deterministic system exhibiting chaotic behaviour. Prior to the advent of chaos theory through the pioneering work of Lorenz (1963), it was believed that random-like fluctuations such as the one in Fig. 1 could only originate from a stochastic system and not from a deterministic system. However, chaos theory has demonstrated that deterministic systems can also lead to behaviour that is quite complex and, like stochastic systems affected by noise, unpredictable in the long term. Deterministic dynamical systems, which evolve continuously over time, are described by a state vector x(t) and an equation of motion:
$$\dot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t)). \qquad (1)$$
Such systems are usually characterised by an attractor, which is a bounded subset of the phase space reached asymptotically by a set of trajectories over an open set of initial conditions as time t → ∞.
A striking feature of some dynamical systems is that the trajectories on the attractor may exhibit sensitive dependence on initial conditions. This means that trajectories starting from neighbouring initial conditions may separate from each other at an exponential rate, evolving independently of each other and in an apparently uncorrelated manner after a sufficiently long period of time, and yet remain confined to a bounded subset of the phase space. Chaos is the bounded aperiodic behaviour in a deterministic system that shows sensitive dependence on initial conditions (Alligood et al., 1997). A detailed illustration of the sensitivity to initial conditions of a chaotic system, particularly in the setting of atmospheric prediction, has been presented by Palmer (1993). The term chaos is reminiscent of the intricate dynamics experienced by the trajectories on the attractor; the exponential divergence stretches out the trajectories as they evolve in time, and they are then folded back to remain confined to a finite region of the phase space. The attractor is the result of these sequences of stretching and folding repeated indefinitely.
Exponential divergence of trajectories on the attractor makes long-term predictions in a chaotic system difficult, while the stretching and folding of trajectories cause measurements of quantities that depend on the state space to look random. Together, these attributes often make a chaotic system indistinguishable from a truly stochastic system. In fact, many systems that were earlier dubbed as stochastic were later shown to be chaotic (Alligood et al., 1997; Ott, 1993).
Most of the time the state vector x(t) is not measured directly but indirectly at discrete time intervals τ using a scalar measurement function y(t) = h(x(t)), leading to a time series $y_i = y(i\tau)$. The central idea in time series analysis is that the dynamics of the state vector x(t) on the attractor can be recaptured from the time series y(t) using a technique called attractor reconstruction, first suggested by Packard et al. (1980) and successfully used by many others. This technique is based on the fact that the dynamics of the n-dimensional state vector x(t) on the attractor is topologically identical to that of the m-dimensional delay vector
$$\mathbf{y}(t) = \big(y(t),\, y(t+\tau),\, \ldots,\, y(t+(m-1)\tau)\big), \qquad (2)$$
which is constructed from samples of y(t) taken at regular time intervals τ (also called the "delay"), under the mapping $\mathbf{x}(t) \to \mathbf{y}(t)$, which is an embedding under rather general conditions. The embedding theorem of Takens (1981) and its extensions (Sauer et al., 1991; Sauer and Yorke, 1993) furnish the mathematical theory behind reconstruction and assert that the embedding is valid for almost all values of the time delay τ and all smooth measurement functions h as long as m > 2D, where D is the box-counting dimension of the attractor. This means that the dynamical and geometrical characteristics of the original system, in particular the geometrical invariants such as the fractal dimension, Lyapunov exponents and entropies, are preserved in the reconstructed space and can be computed from the flow defined by the delay vectors (Kantz and Schreiber, 1997; Ott et al., 1994). The analysis of the DMWS-data, presented in the next section, relies on attractor reconstruction for computing many such characteristics.
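As an illustration of the reconstruction step, the following minimal sketch builds delay vectors from a scalar series. It assumes forward delays (components y_i, y_{i+τ}, ...), which is only one of the equivalent conventions; the function name `delay_embed` is hypothetical and is not taken from TISEAN.

```python
def delay_embed(series, m, tau):
    """Return the m-dimensional delay vectors of `series` for delay `tau`.

    Assumption: forward delays, i.e. vector i is
    (y[i], y[i + tau], ..., y[i + (m - 1) * tau]).
    """
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this choice of m and tau")
    return [tuple(series[i + k * tau] for k in range(m))
            for i in range(n_vectors)]


# Toy input: six samples, embedded in three dimensions with delay 2.
vectors = delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2)
print(vectors)  # [(0, 2, 4), (1, 3, 5)]
```

A series of length N yields N − (m − 1)τ delay vectors, so large values of mτ shorten the usable data set.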

Analysis of the denoised data
For the purpose of our analysis of the DMWS-data, we heuristically assume that the dynamics underlying wind can be modelled by a deterministic system with state vector x(t).
The daily mean wind speed can be regarded as observations of a measurement function y(t) = h(x(t)) made at regular intervals. However, since we are considering the mean wind speeds and not the actual wind speeds at regular daily intervals, regarding them as observations made at equal intervals of time contributes an averaging error. We assume that these averaging errors along with other measurement errors can together be modelled as an additive noise process with zero average and delta correlation. It is therefore important to reduce the effect of this noise before analysing the data. To reduce noise we have applied the method of Schreiber (1993), which employs a locally constant approximation of the dynamics. Despite the apparent random oscillations, recurring annual variations are evident in the time series plot (Fig. 1) of the DMWS-data. To confirm this we have drawn the space-time separation plot of the data (Fig. 2), which helps in identifying temporal correlations inside the time series (Provenzale et al., 1992). Each point in the plot represents a pair of points on the trajectory with their relative separation in time along the horizontal axis and separation in space along the vertical. Marked variations are observed in the graph at multiples of around 365 days, indicating annual variations. The modulation effects due to these annual variations were reduced in subsequent analysis by applying epoch analysis on the data, that is, by deducting from each of the data points which are 365 days apart their average value (Kumar et al., 2004). The resulting time series showed prominent variations, in periods of 28 days, arising from lunar influence. Hence, epoch analysis was repeated for these 28-day variations as well. The plot of the resultant denoised and detrended time series is shown in Fig. 3 and its space-time separation plot in Fig. 4, which clearly show considerable reduction in the effect of the annual and lunar variations. The autocorrelation function (Eq. 3, discussed later) of the observed time series is plotted in Fig. 5a and that of the detrended time series in Fig. 5b, which also show that the temporal correlation due to the annual variation and lunar influence has been significantly reduced by the epoch analysis. As is clear from Fig. 3, the denoised detrended data still show persistent temporal fluctuations. As a first step in the analysis of the denoised data, we determine the embedding parameters (the delay τ and the embedding dimension m) for the proper reconstruction of the attractor using the method discussed in the previous section. The embedding theorems do not place any restriction whatsoever on the delay τ and the embedding dimension m, but their choice can nonetheless affect the inferences deduced from reconstruction significantly, especially when the data come from experiment. Small delays, for example, result in highly correlated vectors y(t), leading to unduly larger values for the correlation dimension, while large delays yield vectors with fairly uncorrelated components, resulting in data randomly distributed in the embedding space (Kantz and Schreiber, 1997). Proper choice of the time delay is, therefore, important, and a first guess of a suitable delay may be obtained from the autocorrelation function of the sample data $y_i$ given by
$$c(\tau) = \frac{\sum_{i} (y_{i+\tau} - \bar{y})(y_i - \bar{y})}{\sum_{i} (y_i - \bar{y})^2}, \qquad (3)$$
where ȳ is the sample mean. The value of τ at which the autocorrelation attains its first zero or its first local minimum is usually an optimal choice for the delay (Kantz and Schreiber, 1997).
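The autocorrelation criterion for a first guess of the delay can be sketched as follows. This is a minimal illustration and not the TISEAN implementation; the normalisation by the full-sample variance and the helper names `autocorrelation` and `first_zero_crossing` are assumptions, and the sine wave is hypothetical toy data.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((y - mean) ** 2 for y in series)
    covariance = sum((series[i] - mean) * (series[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance


def first_zero_crossing(series, max_lag):
    """Smallest lag in 1..max_lag where the autocorrelation is <= 0, else None."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= 0.0:
            return lag
    return None


# Toy input: a sine wave with a period of 20 samples.
wave = [math.sin(2 * math.pi * i / 20) for i in range(200)]
print(first_zero_crossing(wave, max_lag=15))
```

For a pure sine wave the first zero falls near a quarter of the period, which is the delay at which successive coordinates are least linearly correlated.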
Another tool to determine an optimal delay, which also takes into account non-linear correlations, is the method of time-delayed mutual information suggested by Fraser and Swinney (1986). In this method, a quantity called average mutual information is computed for various delays as a measure of the predictability of y(t + τ) given y(t). The mutual information I(τ) for a given delay τ is calculated by regarding the sequences $(y_i)$ and $(y_{i+\tau})$ as values of random variables X and Y and using the formula
$$I(\tau) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}, \qquad (4)$$
where p(x, y) is the joint probability mass function of X and Y, and p(x) and p(y) are the marginals. The probabilities are calculated by constructing a histogram of the data points.
A good choice for time delay is then the value of τ at which the graph of mutual information exhibits a marked minimum.
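A histogram-based estimate of the time-delayed mutual information might look like the sketch below. The number of bins (8), the equal-width binning, the natural logarithm and the function name `mutual_information` are all assumptions made for illustration; the uniform random numbers are hypothetical test data.

```python
import math
import random


def mutual_information(series, lag, bins=8):
    """Histogram estimate of the mutual information between y_i and y_{i+lag}."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_index(value):
        return min(int((value - lo) / width), bins - 1)

    pairs = [(bin_index(series[i]), bin_index(series[i + lag]))
             for i in range(len(series) - lag)]
    total = len(pairs)
    joint, marg_x, marg_y = {}, {}, {}
    for a, b in pairs:
        joint[(a, b)] = joint.get((a, b), 0) + 1
        marg_x[a] = marg_x.get(a, 0) + 1
        marg_y[b] = marg_y.get(b, 0) + 1
    # p(x, y) / (p(x) p(y)) simplifies to count * total / (count_x * count_y).
    return sum((c / total) * math.log(c * total / (marg_x[a] * marg_y[b]))
               for (a, b), c in joint.items())


# Toy input: independent uniform noise, for which I(lag >= 1) should be near zero.
rng = random.Random(0)
noise = [rng.random() for _ in range(2000)]
print(mutual_information(noise, 0), mutual_information(noise, 1))
```

In practice one would evaluate this for a range of lags and look for the first marked minimum; the estimate carries a small positive bias that grows with the number of bins.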
For the DMWS-data, the plots of autocorrelation (Fig. 5b) and mutual information (Fig. 6) suggest a value around τ = 1 as an optimal choice for the delay. Our preliminary analysis with the value τ = 2 also gave identical results for the choice of embedding dimension.
As for the choice of the embedding dimension m, it should be large enough for the attractor to fully unfold in the embedding space, but choosing too large an m may cause the various algorithms to underperform (Kantz and Schreiber, 1997). A practical method for choosing the right embedding dimension, proposed by Kennel et al. (1992), is to find the fraction of false neighbours as a function of the embedding dimension. False neighbours arise when the current dimension is not large enough for the attractor to unfold its true geometry, leading to crossing of trajectories due to projection onto a smaller dimension. The method checks the neighbours in progressively higher dimensions until it finds only a negligible number of false neighbours in passing from dimension m to m + 1. The first time the fraction of false neighbours attains a minimum indicates a suitable value for the embedding dimension. For the present data, Fig. 7 plots the fraction of false neighbours as a function of embedding dimension, and it can be seen that an optimal choice of m must be higher than 13, since for m ≥ 13 the fraction of false neighbours becomes negligibly small. We have chosen m = 14 for the further analysis. It may be noted that, for most practical purposes, the important embedding parameter is the product mτ of the embedding dimension and the delay time, because mτ is the time span represented by an embedding vector. Only a precise knowledge of m is required to exploit the determinism of the underlying dynamics with minimal computational effort (Kantz and Schreiber, 1997). The delay representation of the denoised detrended time series with m = 14 and τ = 1 is shown in Fig. 8. The definite structure in Fig. 8 indicates the deterministic nature of the data.
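The false-neighbour test can be illustrated with the simplified sketch below. It is a brute-force variant of the idea of Kennel et al. (1992), not their exact criterion: the maximum norm, the threshold `ratio=10.0` and the names `false_neighbour_fraction` and `henon_series` are assumptions, and the Hénon map serves only as hypothetical test data with a known two-dimensional structure.

```python
def false_neighbour_fraction(series, m, tau=1, ratio=10.0):
    """Fraction of nearest neighbours in dimension m that are false in m + 1."""
    n = len(series) - m * tau  # one extra coordinate is needed for dimension m + 1
    vecs = [[series[i + k * tau] for k in range(m)] for i in range(n)]
    false_count = 0
    counted = 0
    for i in range(n):
        best_j, best_d = None, float("inf")
        for j in range(n):
            if j == i:
                continue
            d = max(abs(a - b) for a, b in zip(vecs[i], vecs[j]))
            if d < best_d:
                best_d, best_j = d, j
        if best_j is None or best_d == 0.0:
            continue  # no usable neighbour for this point
        counted += 1
        # Distance contributed by the additional (m + 1)-th coordinate.
        extra = abs(series[i + m * tau] - series[best_j + m * tau])
        if extra / best_d > ratio:
            false_count += 1
    return false_count / counted if counted else 0.0


def henon_series(n, discard=100):
    """x-component of the Henon map (a = 1.4, b = 0.3) after a transient."""
    x_prev, x = 0.0, 0.0
    out = []
    for step in range(n + discard):
        x_prev, x = x, 1.0 - 1.4 * x * x + 0.3 * x_prev
        if step >= discard:
            out.append(x)
    return out


data = henon_series(400)
for m in (1, 2, 3):
    print(m, false_neighbour_fraction(data, m))
```

For this toy map the fraction drops sharply between m = 1 and m = 2, which is the kind of behaviour one looks for when reading a plot such as Fig. 7.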
A quantitative measure of the structure and self-similarity of the attractor is provided by various dimension estimates, such as the box-counting dimension, the Hausdorff dimension etc., all of which generalise the Euclidean definition of dimension to take care of the self-similar structure of chaotic attractors at arbitrarily fine scales. As it turns out, chaotic attractors commonly have non-integer dimensions. Of the various dimension estimates, the easiest to compute from a given time series, and the one which has now become the standard, is the correlation dimension introduced by Grassberger and Procaccia (1983). The correlation dimension $D_2$ is defined in terms of the correlation integral $C(\epsilon)$, which is defined as the probability that a pair of points chosen randomly on the attractor is separated by a distance less than $\epsilon$. On the attractor, the correlation integral is empirically found to scale like $C(\epsilon) \propto \epsilon^{D_2}$ as $\epsilon \to 0$, so that the correlation dimension may be estimated as the slope of the curve of $\ln C(\epsilon)$ versus $\ln \epsilon$ given by
$$D_2 = \lim_{\epsilon \to 0} \frac{d \ln C(\epsilon)}{d \ln \epsilon}. \qquad (5)$$
In practical computations involving a single time series and N data points of m-dimensional delay vectors $\mathbf{y}_i$, the correlation integral $C(\epsilon)$ is approximated by the correlation sum $C(\epsilon, m)$ given by (Kantz and Schreiber, 1997)
$$C(\epsilon, m) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta\big(\epsilon - \|\mathbf{y}_i - \mathbf{y}_j\|\big) \qquad (6)$$
for sufficiently large N, where $\Theta(a) = 1$ if a > 0 and $\Theta(a) = 0$ if a ≤ 0. The scaling exponent in Eq. (5), when calculated using the correlation sums $C(\epsilon, m)$, typically increases with m and saturates to a final value for sufficiently large m, which is then taken as an estimate for $D_2$. In practice, one computes the local slopes with the following equation:
$$D_2(\epsilon, m) = \frac{\partial \ln C(\epsilon, m)}{\partial \ln \epsilon}, \qquad (7)$$
and plots them as a function of $\epsilon$ for various m; the value corresponding to a plateau in the curves is identified as an approximation to $D_2$. There are, however, some subtleties to be taken care of in the computation of the correlation dimension. While only the spatial closeness of points should be accounted for in Eq. (7), the actual computations may be affected by the temporal closeness of points as well. To guard against this, points that are closer in time than a Theiler window ω (which is approximately equal to the product of the time delay and the embedding dimension) are excluded while calculating the correlation sum (Theiler, 1986). Hegger et al. (1999) have suggested that the value of ω should be chosen generously. Figure 9 plots the local slopes $D_2(\epsilon, m)$ for the DMWS-data with the previous choice of delay and for embedding dimensions ranging from 14 to 16, using 25 as the value for the Theiler window. The curves exhibit convergence for larger m, an indication of low dimensionality of the attractor, and suggest a value of $D_2$ = 3.7967 ± 0.0116. The convergence of the plateau for higher dimensions is also evident in Fig. 11a, indicating evidence of low dimensionality. This shows that, while the original system may be affected by a multitude of factors, the eventual behaviour can be characterised by a low-dimensional attractor.
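The correlation sum with a Theiler window, and a finite-difference estimate of its local slope, can be sketched as follows. This is a brute-force illustration under stated assumptions (maximum norm, normalisation by the number of pairs actually compared, hypothetical names `correlation_sum` and `local_slope`); the evenly spaced points on a line are toy data for which a slope close to 1 is expected.

```python
import math


def correlation_sum(vectors, eps, theiler=0):
    """Fraction of pairs closer than eps, skipping pairs with |i - j| <= theiler."""
    n = len(vectors)
    close = 0
    pairs = 0
    for i in range(n):
        for j in range(i + theiler + 1, n):
            pairs += 1
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < eps:
                close += 1
    return close / pairs


def local_slope(vectors, eps_small, eps_large, theiler=0):
    """Finite-difference slope of ln C(eps) against ln eps between two scales."""
    c_small = correlation_sum(vectors, eps_small, theiler)
    c_large = correlation_sum(vectors, eps_large, theiler)
    return (math.log(c_large) - math.log(c_small)) / math.log(eps_large / eps_small)


# Toy input: 500 evenly spaced points on a line segment embedded in the plane.
line = [(i / 500, i / 500) for i in range(500)]
print(local_slope(line, 0.011, 0.021))
```

With `theiler=10` every pair that is close in space on this toy line is also close in its index and is therefore discarded, which shows how the window removes purely temporal neighbours.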
As mentioned previously, chaotic systems are characterised by their sensitive dependence on initial conditions, meaning that trajectories that start from neighbouring initial conditions may diverge exponentially over time. Let $x_0$ be any point in the basin of attraction, and consider an infinitesimal sphere of perturbed initial conditions. This sphere distorts into an ellipsoid as the system evolves in time (Alligood et al., 1997). Let $\epsilon_k(t)$, k = 1, 2, ..., n, denote the length of the k-th principal axis of the ellipsoid. In general we can write $\epsilon_k(t) = \epsilon_k(0)\, e^{\lambda_k t}$, where the $\lambda_k$ may be positive, zero, or negative, and are called the Lyapunov exponents. The Lyapunov exponents quantify the average rate of divergence or convergence of nearby orbits, and the existence of a positive Lyapunov exponent is one of the most striking signatures of chaos (Ott, 1993). In such a system the growth of the separation δ(t) between two neighbouring trajectories will eventually be dominated by the maximum Lyapunov exponent λ, so that $|\delta(t)| = |\delta(0)|\, e^{\lambda t}$, and hence
$$\lambda = \frac{1}{t} \ln \frac{|\delta(t)|}{|\delta(0)|}. \qquad (8)$$
In practice one computes λ by plotting ln δ(t) versus t, which should fall nearly on a straight line, the slope of which then gives an estimate of λ. Lyapunov exponents are invariant under smooth transformations of the attractor; hence, they are preserved under delay reconstruction and may be estimated from a time series. There are many algorithms for estimating the maximal Lyapunov exponent from time series, all of which apply the above ideas to delay vectors in the embedding space. Most popular among them is the Kantz algorithm (Kantz, 1994; Kantz and Schreiber, 1997), which proceeds by computing the sum
$$S(\Delta n) = \frac{1}{N} \sum_{n_0 = 1}^{N} \ln \left( \frac{1}{|U(\mathbf{y}_{n_0})|} \sum_{\mathbf{y}_n \in U(\mathbf{y}_{n_0})} \big| y_{n_0 + \Delta n} - y_{n + \Delta n} \big| \right) \qquad (9)$$
for a point $\mathbf{y}_{n_0}$ of the time series in the embedded space and over a neighbourhood $U(\mathbf{y}_{n_0})$ of $\mathbf{y}_{n_0}$ with diameter $\epsilon$. If the plots of $S(\Delta n)$ against $\Delta n$ are linear over small $\Delta n$ and for a reasonable range of $\epsilon$, and all have identical slope for sufficiently large values of the embedding dimension m, then that slope can be taken as an estimate of the maximum Lyapunov exponent (Kantz, 1994; Kantz and Schreiber, 1997).
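The stretching curve S(Δn) can be sketched in a few lines. The code below is a simplified brute-force variant written for illustration, not the TISEAN routine: it assumes the maximum norm for the neighbourhood, measures distances in the last coordinate of the delay vector, and skips reference points without neighbours. The name `stretching_curve` and the logistic-map test series (whose exponent is known to be ln 2 ≈ 0.69) are assumptions introduced here.

```python
import math


def stretching_curve(series, m, tau, eps, max_dn, theiler=0):
    """Return [S(0), ..., S(max_dn)] averaged over all usable reference points."""
    span = (m - 1) * tau
    n_ref = len(series) - span - max_dn
    vecs = [[series[i + k * tau] for k in range(m)] for i in range(n_ref)]
    sums = [0.0] * (max_dn + 1)
    used = 0
    for i in range(n_ref):
        neighbours = [j for j in range(n_ref)
                      if abs(i - j) > theiler
                      and max(abs(a - b) for a, b in zip(vecs[i], vecs[j])) < eps]
        if not neighbours:
            continue
        # Mean distance to the neighbours after dn further time steps.
        mean_dist = [sum(abs(series[i + span + dn] - series[j + span + dn])
                         for j in neighbours) / len(neighbours)
                     for dn in range(max_dn + 1)]
        if min(mean_dist) <= 0.0:
            continue  # the logarithm would be undefined
        used += 1
        for dn, dist in enumerate(mean_dist):
            sums[dn] += math.log(dist)
    if used == 0:
        raise ValueError("no reference point has neighbours; increase eps")
    return [s / used for s in sums]


# Toy input: the fully chaotic logistic map x -> 4 x (1 - x).
x = 0.3
logistic = []
for _ in range(600):
    x = 4.0 * x * (1.0 - x)
    logistic.append(x)

curve = stretching_curve(logistic, m=2, tau=1, eps=0.01, max_dn=4, theiler=5)
print((curve[3] - curve[0]) / 3)  # rough slope of the initial linear part
```

The slope is read off only from the initial, roughly linear part of the curve; once the separations approach the size of the attractor the curve flattens, as described for Fig. 10.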
For our time series, Fig. 10 shows curves of $S(\Delta n)$ for m = 14, 15, 16, which increase linearly with $\Delta n$ and then settle down. An estimate for the maximum Lyapunov exponent as obtained from the figure is λ = 0.0265 ± 0.0008. The computations were repeated for various values of the embedding dimension and the diameter of the neighbourhood $U(\mathbf{y}_{n_0})$, all of which gave results identical to the above. The estimated positive value of the maximum Lyapunov exponent indicates that the underlying system is chaotic.
A colour noise time series can mimic many characteristics of a chaotic time series. In order to make a distinction between these two, we compared the DMWS time series with its phase randomized time series. The phase randomization of a chaotic signal can destroy its profile, whereas a colour noise time series retains its profile (Pavlos et al., 1992). The phase randomized time series of the DMWS data was obtained by representing it by a Fourier series and then reconstructing the time series after adding a random phase distribution. We calculated the local slopes $D_2(\epsilon, m)$ of the logarithm of the correlation sum for both the original time series and the phase randomized time series and plotted the values in Fig. 11a and b, respectively. These figures clearly show that phase randomization destroys the deterministic profile.
The estimated values of the correlation dimension and the fraction of false nearest neighbours obtained for various embedding dimensions show that the underlying dynamics of the fluctuations in the DMWS data is low-dimensional. The positive value of the maximum Lyapunov exponent indicates that the underlying system is chaotic. The comparison of the DMWS data with its phase randomized time series further confirms the chaotic nature of the underlying system.
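Phase randomization can be illustrated with the sketch below, which uses a direct O(N²) discrete Fourier transform so that it depends only on the standard library. It assumes that independent uniform random phases replace the original ones while the conjugate symmetry needed for a real-valued result is kept; the names `dft`, `idft` and `phase_randomize` and the two-tone test signal are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Direct discrete Fourier transform (adequate for short series)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def phase_randomize(series, rng):
    """Surrogate with the same Fourier amplitudes but random phases."""
    n = len(series)
    spectrum = dft(series)
    randomized = list(spectrum)
    # Index 0 (the mean) and, for even n, the Nyquist term are left unchanged.
    for k in range(1, (n + 1) // 2):
        rotation = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        randomized[k] = spectrum[k] * rotation
        randomized[n - k] = randomized[k].conjugate()  # keeps the result real
    return [value.real for value in idft(randomized)]


rng = random.Random(1)
signal = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(64)]
surrogate = phase_randomize(signal, rng)
```

The surrogate has the same power spectrum, and hence the same autocorrelation, as the input, so any statistic that differs between the two cannot be explained by linear correlations alone.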

Comparison with surrogate data
The analysis of the DMWS-data in the previous section reveals a number of features that are characteristic of time series originating from non-linear deterministic systems which are chaotic. However, many of these features could also be exhibited by stochastic systems driven by a linear Gaussian process, which may possibly be distorted by some non-linear process. So to further validate the results of the previous section, we must ascertain that the source of the complex behaviour exhibited by the DMWS-data is not stochastic. The method of surrogate data (Theiler et al., 1992) is widely used as a tool for discriminating whether the source of random fluctuations in time series data is deterministic or stochastic. It is basically a statistical test to formally reject the hypothesis that the observed data come from a linear noise process. The method proceeds by first formulating a null hypothesis, which is usually an assumption that the observed data are random, and then generating an ensemble of time series of random numbers, called surrogate data, which are consistent with the null hypothesis and are otherwise similar to the original data. In other words, these surrogate data are what independent, repeated observations of the process that generated the original data would yield if that process were consistent with the null hypothesis. Then one compares the values of some discriminating statistic, such as the correlation dimension, computed from the given data to the distribution of values obtained from the surrogates. If the values differ significantly, then the null hypothesis may be rejected. In what follows we apply the surrogate data analysis to test the null hypothesis that the observed time series is a linear Gaussian noise process. We used the algorithm of Schreiber and Schmitz (1996) to generate a set of 40 surrogates consistent with the null hypothesis. Generated by the amplitude-adjusted Fourier transform method, the surrogates preserve the amplitude distribution, power spectrum and autocorrelation of the DMWS-data, so that they can be regarded as what the realisations of the process underlying the DMWS-data would be like, if that process had the properties enjoined by the null hypothesis. The null hypothesis is tested using, as discriminating statistics, both geometrical and dynamical characteristics such as the fraction of false nearest neighbours, the local slopes of the correlation sums and the curves of $S(\Delta n)$, which are related to the maximal Lyapunov exponent. Each of the above characteristics is calculated for both the original data and the surrogate data, and the null hypothesis is accepted or rejected depending on the value of the significance of difference given by (Mitschke and Dämmig, 1993)
$$S = \frac{|\mu - \mu_{\mathrm{orig}}|}{\sigma}, \qquad (10)$$
where µ and σ are the mean and standard deviation of the characteristic computed from the surrogates and $\mu_{\mathrm{orig}}$ is the mean of the characteristic on the original data. It is estimated that we may reject the null hypothesis with 95 % confidence if S > 2, which means that the probability is 95 % or more that the observed time series is not a realisation of a Gaussian stochastic process (Pavlos et al., 1999). Figure 12a plots the mean values of the fraction of false nearest neighbours of all the surrogates and values one standard deviation away from the mean, alongside the values of the fraction of false nearest neighbours of the DMWS-data. We can observe that the curves of the fraction of false nearest neighbours versus m of all the surrogates deviate significantly from the corresponding curve of the original data for a range of the embedding dimensions. As shown in Fig. 12b, the significance of difference S for the fraction of false nearest neighbours reaches up to 9, and hence the null hypothesis can be safely rejected.
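The significance of difference is a simple ratio and can be written out as below. The sketch assumes the sample standard deviation (divisor n − 1) of the surrogate values, a choice the text does not specify; the function name `significance` and the numbers in the example are hypothetical and are not results from the DMWS-data.

```python
from statistics import mean, stdev


def significance(original_value, surrogate_values):
    """|mean(surrogates) - original| in units of the surrogate standard deviation."""
    return abs(mean(surrogate_values) - original_value) / stdev(surrogate_values)


# Toy input: three surrogate values with mean 2 and standard deviation 1.
s = significance(5.0, [1.0, 2.0, 3.0])
print(s)          # 3.0
print(s > 2.0)    # True: the null hypothesis would be rejected at this threshold
```

With 40 surrogates, as used here, the mean and standard deviation are estimated from 40 values of the chosen statistic, one per surrogate series.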
Next we compared the local slopes of the correlation sums. Figure 13a compares the local slopes $D_2(\epsilon, 14)$ of the correlation sums (Eq. 7) of the DMWS-data with the mean values of the slopes of all the surrogates, along with values one standard deviation away from the mean. It is clear that the values of the slopes of the surrogates deviate considerably from those of the original data, especially in the region of smaller $\epsilon$. As is clear from Fig. 13b, the significance of difference is large enough to reject the null hypothesis.
We further compared the original data with their surrogates using $S(\Delta n)$ of Eq. (9) as the test statistic. Figure 14a compares the curves of $S(\Delta n)$ of the surrogates with those of the original data plotted for delay τ = 1, Theiler window ω = 25 and embedding dimension m = 14. We observe strong differences between the values of $S(\Delta n)$ corresponding to the original data and the surrogates. The significance of difference S, shown in Fig. 14b, is larger than 2 for all $\Delta n$ ≤ 40. Here again, based on the values of S, we can reject the null hypothesis.
The alignment of the neighbouring segments of trajectories in a flow, which ultimately leads to a definite structure for the attractor if the dynamics is deterministic, can be used as a criterion to distinguish determinism from stochastic dynamics. A straightforward way to quantify this is the non-linear prediction error, which measures the deviations of the values predicted using past data from the actual values in the trajectory. It is reported that the non-linear prediction error is a consistently good tool for discriminating non-linearity (Schreiber and Schmitz, 1997). We calculated the prediction errors by using a locally constant approximation to predict future values (Tong, 1983; Hegger et al., 1999), and the root-mean-square prediction errors of each of the 40 surrogates and the original data were computed. The results are displayed in Fig. 15, which shows that the prediction errors are significantly lower for the original data than for all the surrogates, with S = 4.15, and hence the null hypothesis can be rejected. The time reversal asymmetry statistic $T_{\mathrm{rev}}$ is frequently used as a measure of deviations from time reversibility, which is a characteristic of linear systems. Figure 16 shows $T_{\mathrm{rev}}$ for all the surrogates and the original data, and it is seen that the time reversal asymmetry of the original data is larger than that of the surrogates, with S = 2.95; hence, we can reject the null hypothesis.
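A locally constant predictor and its root-mean-square error can be sketched as follows. This is an in-sample, leave-one-out illustration under stated assumptions (maximum norm, one-step-ahead prediction, reference points without neighbours skipped); it is not the TISEAN routine, and the name `prediction_rms_error`, the logistic-map series and its shuffled copy are hypothetical test material.

```python
import math
import random


def prediction_rms_error(series, m, tau, eps):
    """RMS error of one-step predictions from the mean future of eps-neighbours."""
    span = (m - 1) * tau
    n_vec = len(series) - span - 1
    vecs = [[series[i + k * tau] for k in range(m)] for i in range(n_vec)]
    squared_error = 0.0
    count = 0
    for i in range(n_vec):
        futures = [series[j + span + 1] for j in range(n_vec)
                   if j != i
                   and max(abs(a - b) for a, b in zip(vecs[i], vecs[j])) < eps]
        if not futures:
            continue  # no neighbours within eps, so no prediction is made
        prediction = sum(futures) / len(futures)
        squared_error += (prediction - series[i + span + 1]) ** 2
        count += 1
    if count == 0:
        raise ValueError("no point has neighbours; increase eps")
    return math.sqrt(squared_error / count)


# Toy input: a deterministic chaotic series and a shuffled copy of it.
x = 0.3
chaotic = []
for _ in range(500):
    x = 4.0 * x * (1.0 - x)
    chaotic.append(x)
shuffled = list(chaotic)
random.Random(0).shuffle(shuffled)

err_det = prediction_rms_error(chaotic, m=2, tau=1, eps=0.05)
err_shuffled = prediction_rms_error(shuffled, m=2, tau=1, eps=0.05)
print(err_det, err_shuffled)
```

Shuffling destroys all temporal structure, so the gap between the two errors illustrates, in exaggerated form, the kind of difference that Fig. 15 shows between the original data and its surrogates.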
To summarise, based on the results of this series of statistical tests comparing the DMWS-data with their surrogates, we can reject the null hypothesis at the 95 % confidence level and infer that the DMWS-data do not originate from a linear Gaussian process. This further confirms that the results reported in the previous section are not artefacts of a stochastic system but reflect a system that is indeed deterministic with a low-dimensional chaotic attractor. Although the dynamics is deterministic, the chaotic nature of the data makes long-term predictions prone to errors, but short-term predictions can be made with fairly good accuracy by carefully chosen methods adapted to the data. The average prediction errors of the DMWS-data based on a locally constant approximation are shown in Fig. 17 as a function of embedding dimension. As is clear from the figure, the prediction error becomes smaller and stabilises for embedding dimension m ≥ 14, which, besides being a further justification for our choice of m = 14 in the previous analysis, furnishes another piece of evidence for determinism in the data. However, the locally constant approximation is by no means the most suitable for all types of data, and a proper choice of prediction method requires a careful analysis of the data against the various prediction schemes. This will be addressed in a future work.
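The dependence of the prediction error on the embedding dimension can be sketched as follows. This is a simplified illustration of a locally constant predictor, not the implementation used for Fig. 17: it averages the successors of the k nearest neighbours in the maximum norm instead of all neighbours within a fixed radius, and the function name, k and the train/test split are assumptions.

```python
import heapq
import math


def locally_constant_rms_error(series, m, tau=1, k=3, test_fraction=0.2):
    """RMS one-step prediction error of a locally constant predictor.

    Delay vectors (x_i, x_{i-tau}, ..., x_{i-(m-1)tau}) from the first
    part of the series serve as the library; each test point is
    predicted by the mean successor of its k nearest library vectors.
    """
    n = len(series)
    first = (m - 1) * tau                    # first index with a full vector
    split = int(n * (1.0 - test_fraction))   # library / test boundary

    def vec(i):
        return [series[i - j * tau] for j in range(m)]

    # Library entries: (delay vector, known successor inside the library).
    library = [(vec(i), series[i + 1]) for i in range(first, split - 1)]
    test_idx = list(range(max(split, first), n - 1))
    if len(library) < k or not test_idx:
        raise ValueError("series too short for these parameters")

    squared_error = 0.0
    for i in test_idx:
        v = vec(i)
        # Maximum-norm distance from the test vector to each library vector.
        candidates = ((max(abs(a - b) for a, b in zip(v, lv)), nxt)
                      for lv, nxt in library)
        nearest = heapq.nsmallest(k, candidates, key=lambda p: p[0])
        prediction = sum(nxt for _, nxt in nearest) / k
        squared_error += (prediction - series[i + 1]) ** 2
    return math.sqrt(squared_error / len(test_idx))


# Toy series in which one coordinate does not determine the next value:
# a 0 is followed by 1 or by -1 depending on the value that preceded it.
toy = [0, 1, 0, -1] * 50
errors = {m: locally_constant_rms_error(toy, m) for m in (1, 2, 3)}
```

For this toy series the error is nonzero at m = 1 and vanishes once the delay vector is long enough to resolve the state (m ≥ 2), which mirrors qualitatively the reduction and stabilisation of the error with increasing m described above.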

Conclusions
We have carried out a detailed analysis of the daily mean wind speed measured at Thiruvananthapuram from 2000 to 2010 using tools of nonlinear time series analysis. The purpose of the study was to examine whether the persistent irregular temporal fluctuations exhibited by the data arise from deterministic or stochastic dynamics of the underlying system. The analysis reveals that the underlying dynamics of the DMWS-data is deterministic, low-dimensional and chaotic. The estimated values of the correlation dimension and the fraction of false nearest neighbours as a function of embedding dimension indicate the low dimensionality of the system, and the positive value of the maximum Lyapunov exponent shows that the system is chaotic. The reduction and stabilisation of the prediction errors with increasing embedding dimension is further evidence for determinism. A detailed surrogate data analysis, using a number of measures as discriminating statistics, shows that the characteristics of the data are not those of a stochastic system exhibiting chaos-like behaviour, and corroborates the deterministic character of the system. The analysis further shows that the chaotic profile does not arise from the pseudo-characteristics of a coloured noise time series. While most of the chaotic systems reported in the literature are confined to laboratories, this is a natural system showing chaotic behaviour.
Topical Editor P. Drobinski thanks two anonymous referees for their help in evaluating this paper.

Fig. 1. Time series of the measured daily mean wind speed (DMWS) in knots.

Fig. 5. (a) The autocorrelation function of the observed DMWS data. (b) The autocorrelation function of the detrended time series.

Fig. 7. The fraction of false nearest neighbours as a function of the embedding dimension m for the detrended time series with τ = 1, ω = 25, showing that any m ≥ 13 can be considered optimal.

Fig. 8. The delay representation of the detrended time series.

Fig. 9. The local slopes D_2(ε, m) for the detrended time series for m ranging from 14 to 16 with τ = 1, ω = 25, giving a plateau for small values of ε and an estimate of D_2 = 3.7967 ± 0.0116. The convergence of the plateau for higher dimensions is also evident in Fig. 11a, indicating low dimensionality.
Fig. 12. (a) The mean values of the fraction of false nearest neighbours of the surrogates with standard deviation. (b) Plot of the significance of difference S versus m.
Fig. 13. (a) The mean values of the local slopes of the surrogates with standard deviation. (b) Plot of the significance of difference S versus ε. Here the normalised data sets are used, and m = 14, τ = 1, and ω = 25.
Fig. 14. (a) The mean values of S(Δn) of the surrogates with standard deviation. (b) Plot of the significance of difference S versus Δn.

Fig. 15. The plot of the prediction errors for m = 14, τ = 1 for the surrogates (denoted by circles) and that of the original series (denoted by a filled square), showing the determinism in the time series. The significance of difference is S = 4.15.

Fig. 17. The plot of the prediction error versus embedding dimension m.