Dendroclimatic transfer functions revisited: Little Ice Age and Medieval Warm Period summer temperatures reconstructed using artificial neural networks and linear algorithms

Tree-rings tell of past climates. To do so, tree-ring chronologies comprising numerous climate-sensitive living-tree and subfossil time-series need to be “transferred” into palaeoclimate estimates using transfer functions. The purpose of this study is to compare different types of transfer functions, especially linear and nonlinear algorithms. Accordingly, multiple linear regression (MLR), linear scaling (LSC) and artificial neural networks (ANN, nonlinear algorithm) were compared. Transfer functions were built using a regional tree-ring chronology and instrumental temperature observations from Lapland (northern Finland and Sweden). In addition, conventional MLR was compared with a hybrid model whereby climate was reconstructed separately for short- and long-period timescales prior to combining the bands of timescales into a single hybrid model. The fidelity of the different reconstructions was validated against instrumental climate data. The reconstructions by MLR and ANN showed reliable reconstruction capabilities over the instrumental period (AD 1802–1998). LSC failed to reach reasonable verification statistics and did not qualify as a reliable reconstruction: this was due mainly to exaggeration of the low-frequency climatic variance. Over this instrumental period, the reconstructed low-frequency amplitudes of climate variability were rather similar by MLR and ANN. Notably greater differences between the models were found over the actual reconstruction period (AD 802–1801). A marked temperature decline from the Medieval Warm Period (AD 931–1180) to the Little Ice Age (AD 1601–1850) was evident in all the models. This decline was approx. 0.5 °C as reconstructed by MLR. Different ANN-based palaeotemperatures showed simultaneous cooling of 0.2 to 0.5 °C, depending on algorithm. The hybrid MLR did not seem to provide further benefit above conventional MLR in our sample.
The robustness of the conventional MLR over the calibration, verification and reconstruction periods qualified it as a reasonable transfer function for our forest-limit (i.e., timberline) dataset. ANN appears to be a potential tool for other environments and/or proxies having more complex and noisier climatic relationships.

Correspondence to: S. Helama (samuli.helama@helsinki.fi)


Introduction
Palaeoclimatology is a subdiscipline of earth sciences. As distinct from climatology and meteorology, palaeoclimatic research employs indirect estimates of past climate (proxy records), which are climate-sensitive series of geological, glaciological or palaeontological records or historical archives. These include, for example, ice-cores, borehole temperatures, farmers' diaries, mountain glacier advances and retreats, microfossil and fossil evidence, lake varves and tree-rings (Bradley, 1999). Debate on the current increase in temperatures was initiated by instrumental climate records (Jones et al., 1999; Folland et al., 2001; Jones and Moberg, 2003), which are not, however, long enough to resolve whether the temperature rise is unprecedented. Proxy series extend climatic records over the past centuries and millennia and thus strengthen our understanding of climate variability prior to any direct weather observation. Recent literature has studied the climate history of the past millennium extensively by means of tree-rings (Jacoby and D'Arrigo, 1989; D'Arrigo and Jacoby, 1993; Briffa et al., 2001, 2002a, b; Esper et al., 2002; D'Arrigo et al., 2006). Moreover, tree-rings have played an important role as constituents of multiproxy palaeoclimate reconstructions in other studies (Jones et al., 1998; Mann et al., 1999; Crowley and Lowery, 2000; Osborn and Briffa, 2006). All the aforementioned studies have aimed to reveal the past fluctuations of climate for large parts of the Northern Hemisphere. Furthermore, tree-rings have served as proxy records for regional and subhemispheric scale studies (Briffa et al., 1990, 1995; Lara and Villalba, 1993; Cook et al., 2004).

Published by Copernicus Publications on behalf of the European Geosciences Union.
Very long tree-ring chronologies are typically composites of biological and palaeontological tree-rings (Pilcher et al., 1984; Eronen et al., 2002; Grudd et al., 2002; Hantemirov and Shiyatov, 2002; Naurzbaev et al., 2002; Spurk et al., 2002). Alternatively, tree-ring chronologies can be constructed from abundant collections of fossilized wood material as temporally "floating" chronologies (e.g., Roig et al., 2001). In fact, tree-rings have several benefits compared to other proxy archives of past climates. First, temporal control of tree-ring chronologies is ensured by the usage of hundreds of individual tree-ring series that are cross-compared and averaged into one mean series in which each ring can be reliably dated to calendar years (Douglass, 1941; Schulman, 1956; Fritts, 1976). Second, tree-rings can be directly calibrated against the instrumental weather records and further transformed into estimates of palaeoclimate variables using time-series analysis. Importantly, this allows for the adjustments of autocorrelation structures between proxy and instrumental series (Fritts, 1976; Hughes et al., 1982; Cook and Kairiukstis, 1990).
Tree-ring growth is controlled by factors that are either internal or external to the growth system (Fritts, 1976; Schweingruber, 1988). External factors include climate, especially temperature and precipitation variations. In a situation where only a single climatic factor is found to control the growth markedly more than any other factor, there is a potential that the past variability of this particular climatic variable could be dendroclimatologically reconstructed. Detection of climatic variables that influence growth can be done using a spectrum of statistical techniques. Some of the methods may be relatively simple, but the increasing calculation power of microcomputers has given a rising potential for a wider array of mathematical techniques with highly sophisticated algorithms to be used. Traditionally, the methods comparing tree-rings and climate have dealt with linear models (Fritts, 1962, 1976; Fritts et al., 1971, 1986). Nonlinearity in the climate-growth relationships could thus invalidate these models.
The models that transform a tree-ring chronology (or a pool of chronologies) into estimates of climatic variability are called transfer functions (Fritts, 1976). Natural time-series can be markedly autocorrelated and tree-rings are no exception to this rule (Fritts, 1976; Cook, 1985; Guiot, 1986). Temporally operating transfer functions make use of proxy values that are concurrent with the climate value. It is noteworthy that the autocorrelation of tree-rings can be much higher than that of the climate. Leading and/or lagging tree-ring values are included in the transfer function to correct for statistical autocorrelation effects that arise from physiological processes (inherent to growth) that carry the concurrent climatic signal over a number of years (Fritts, 1976). Roughly speaking, the growth is influenced by the concurrent climate but it also results from physiological constraints owing to previous climate-growth associations that, in turn, are reasonably imprinted in the leading values of tree-rings. Similarly, the signal of concurrent climate is stored in the growth over a number of subsequent years, this influence being registered by the lagging values of tree-rings. Basically, the inclusion of leading and/or lagging tree-ring values in the transfer function modifies the autocorrelation structure of the proxy to better mimic the climatic autocorrelation structure. In practice, such an inclusion of several serially related proxy values into the transfer function as independent variables can be done using multiple linear regression (MLR) (e.g., Fritts, 1976). However, a specific caveat associated with regression-based transfer functions is that a lack of statistical fit between the proxy and climate may lead to a reduction in the reconstructed climate variance (Osborn and Briffa, 2004; von Storch et al., 2004; Esper et al., 2005). This purely statistical bias is, at least partially, alleviated by using linear scaling (LSC) to replace regression (see Esper et al., 2005).
Although linearity is a commonly accepted basis for transfer functions, their fidelity could be impugned with regard to their ability to model the true underlying natural processes that may, in fact, behave more or less nonlinearly. Accordingly, unexplained variance in the linear models could, at least tentatively, be attributed to nonlinear relationships between the proxy and climate. Numerical techniques to study nonlinear relationships are provided by artificial neural networks (ANN) (Bishop, 1996; Gorban, 1998). Compared to linear algorithms, ANN learns by mapping the input data $X = (x_1, x_2, \ldots, x_m)$ into the output of ANN $Y = (y_1, y_2, \ldots, y_n)$. To do so, the pair $(X, Y)$ is known as a training set and the map is thus $F: X \to Y$. The versatility of ANN is enhanced since there are several methods to approximate the map. The simplest approach is a linear approximator $Y \approx AX$, where $A$ is a constant matrix. Another approximator is $Y \approx \sum_i w_i \varphi_i(X)$, where $\{\varphi_i\}$ is a set of basis functions. ANN realizes an approximator as a superposition of a nonlinear standard function, $\sigma$, which is an activation function of the formal neurons:

$$F(w, X) = \sigma\Big(\sum_{i_k} w^{(k)}_{i_k}\, \sigma\Big(\cdots\, \sigma\Big(\sum_{i_1} w^{(1)}_{i_1} x_{i_1}\Big) \cdots\Big)\Big), \qquad (1)$$

where $w = \{w^{(k)}_i\}$ are the weights of neurons and every sum relates to the $k$-th layer of ANN. The first layer is an input layer, $k = 1$. This layer is followed by the hidden layers and by the final layer, which is the output. Training is a modification of the weights to minimize an error

$$\lVert Y - F(w, X) \rVert. \qquad (2)$$

Compared to MLR and LSC, a clear benefit of ANN is the potential to find not only linear but nonlinear relationships between the climate and the proxy, such as tree-rings. Previously, Keller et al. (1997) used ANN to study the influence of pollution and CO2 on tree-ring growth in south-eastern France. Zhang et al. (2000) modelled the tree-ring growth response to different climatic variables using ANN in southern Vancouver Island in Canada. Carrer and Urbinati (2001) compared linear and nonlinear relationships between climate and tree-ring growth in the Italian Alps using ANN as a nonlinear algorithm. The feasibility of ANN in tree-ring based palaeoclimatic studies has previously been demonstrated and discussed by Woodhouse (1999), D'Odorico et al. (2000) and Ni et al. (2002). Woodhouse (1999) compared linear and ANN-based reconstructions of spring precipitation for the Colorado Front Range for the past 300 years. D'Odorico et al. (2000) reconstructed the Palmer Hydrologic Drought Index for New Mexico and south-eastern Virginia. Ni et al. (2002) compared the cool-season precipitation reconstructions constructed by linear regression and ANN in Arizona and New Mexico since AD 1000. Recently, Guiot et al. (2005) used ANN in a multi-proxy reconstruction of summer temperatures in western Europe. Their work was based on several different proxies, including tree-rings, and spanned roughly the past millennium. In fact, ANN has been shown to have several benefits over traditional (i.e., linear) reconstruction techniques. There appear, however, to be some over-fitting issues with regard to ANN-based transfer functions (Woodhouse, 1999). Interestingly, dendroclimatic reconstructions based on linear and nonlinear transfer functions have not yet been compared for palaeotemperature proxies such as temperature-sensitive tree-ring chronologies of millennial length.
The main purpose of this paper is to explore the differences in linear and nonlinear transfer functions. Linear and nonlinear models are compared over the calibration and verification periods. Calibration statistics reveal the ability of the model to fit the data whereas verification statistics show how well the model predicts variability over an independent period withheld from the calibration. The aim of this comparison is to determine the advantages and disadvantages of the different transfer function algorithms and to better understand the potential discrepancies that may be attached to reconstructions of past climate variability. Due to the similarity of dendrochronological time-series to other palaeoclimatic proxy records such as sclerochronologies and fish otolith series (Strom et al., 2004; Black et al., 2005; Helama et al., 2006, 2007a), we hypothesize that our results could provide information for studies on growth increments of modern and fossilized corals, bivalves and fishes and their relationships to climate.

[...] wood from the forest-limit region of northern Finnish Lapland (within 70° to 68° N, and 30° to 20° E) (Eronen et al., 2002; Helama et al., 2004b). Measurements and cross-dating of the tree-ring series were processed using standard procedures (Holmes, 1983). Cross-dating, in which the joint temporal variability of wide and narrow tree-rings is synchronized among a number of available sample series, is a prerequisite for all dendrochronological studies. By so doing, cross-dating results in absolutely dated sample series with a dating precision of one calendar (sidereal) year for each tree-ring (Fritts, 1976).
Each ring-width series was detrended in order to remove the long-term non-climatic growth variability. Annual growth values of detrended series were averaged into one mean chronology, which was then used in comparison with climate. The Finnish Lapland tree-ring chronology extends continuously over the past 76 centuries, being one of the longest tree-ring chronologies worldwide and covering the mid and late Holocene (Eronen et al., 2002). In this study, we concentrate our analyses on the recent part of the chronology. To do so, the Finnish Lapland tree-ring chronology between AD 800 and 2000 was adopted from Helama et al. (2005b).
Previous studies have shown that the pine tree-ring variability is predominantly controlled by mid-summer (July) temperatures in this region (Hustich and Elfving, 1944; Sirén, 1961; Lindholm, 1996; Helama et al., 2004a). In the present study, the instrumental weather data from the meteorological station of Karasjok in northern Norway (69°28′ N; 25°31′ E) was used for calibration of linear and nonlinear transfer functions. The mean temperature of July (AD 1876–1998) was used as a predictand. A multi-site composite series, longer than the Karasjok series of instrumental climate records, was available from more southern sources (approx. 66° N) (Klingbjer and Moberg, 2003; Helama et al., 2004; Holopainen, 2006). This record covers the interval from AD 1802 to the present. As this record originates approx. 300 km south of our study region, its higher mean and lower standard deviation were first rescaled to correspond to the values of the northern temperature variability in Karasjok prior to proxy comparison. This record was used as verification data (AD 1802–1875) for the reconstructed temperature history.

Calibration of models
Four different types of transfer functions were utilized and compared. These include two linear and one nonlinear method: multiple linear regression (MLR), linear scaling (LSC) and artificial neural networks (ANN), respectively. In addition, a hybrid MLR, in which the different time-scales were first reconstructed separately and then combined into a single dendroclimatic model, was explored.
In MLR, transfer functions are typically based on an iterative fitting process by linear multiple regression that is evaluated by the sums of least squares of model residuals. Previously, Helama et al. (2007b, 2009) created a reconstruction of mid-summer temperatures for the region using the present tree-ring chronology. Their study made use of MLR, that is, the climate in year t was estimated using tree-ring values of previous, concurrent and forthcoming years (t−2, t−1, t, t+1, t+2). The rationale for using a model of multiple growth years is the high positive autocorrelation that typically characterizes dendrochronological data (Fritts, 1976). The use of MLR with lagging and leading tree-ring values is to statistically simplify the rather complex relationships between tree-rings and climate over a number of growth years and to stabilize the skill of the transfer function in modelling these relationships. Thus, in addition to the concurrent tree-ring value, MLR-based dendroclimatic transfer functions usually contain tree-ring values from one to three lagging and leading growth years (Briffa et al., 1988, 1990; Kalela-Brundin, 1999; Lindholm, 1996; Lindholm and Eronen, 2000; Kirchhefer, 2001, 2005). Here we adopted the aforementioned transfer function of Helama et al. (2007) as a linear analogue to nonlinear ANN, and as a multi-variable model to compare with LSC.
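As an illustration of the lag/lead regression described above, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional ring-width series `trw` and a temperature series `temp` on the same annual index; the function names `lagged_design`, `fit_mlr` and `predict_mlr` are hypothetical, and plain ordinary least squares via NumPy stands in for whatever fitting procedure was actually used.

```python
import numpy as np

def lagged_design(trw, lags=(-2, -1, 0, 1, 2)):
    """Return (X, idx): row j holds trw[idx[j] + k] for each lag k."""
    trw = np.asarray(trw, dtype=float)
    lo = max(0, -min(lags))            # years lost at the start of the series
    hi = max(0, max(lags))             # years lost at the end of the series
    idx = np.arange(lo, len(trw) - hi) # years t with a full set of predictors
    X = np.column_stack([trw[idx + k] for k in lags])
    return X, idx

def fit_mlr(trw, temp, lags=(-2, -1, 0, 1, 2)):
    """Least-squares fit of temp[t] on ring widths at t-2 ... t+2."""
    X, idx = lagged_design(trw, lags)
    A = np.column_stack([np.ones(len(idx)), X])   # intercept + predictors
    y = np.asarray(temp, dtype=float)[idx]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef                                    # [intercept, b(t-2), ..., b(t+2)]

def predict_mlr(coef, trw, lags=(-2, -1, 0, 1, 2)):
    """Apply fitted coefficients; returns the usable year indices and estimates."""
    X, idx = lagged_design(trw, lags)
    return idx, coef[0] + X @ coef[1:]
```

Note that two years are lost at each end of the series, because the predictors for year t need ring widths from t−2 to t+2.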
Another transfer function explored here was LSC. Linear scaling of the tree-ring chronology to the mean and variance of climatic time-series, in the present case mean July temperatures, is a simple arithmetic adjustment by adding and dividing each annual tree-ring value by constants that determine the relationship between the two time-series. Such a way of transforming tree-ring variability into a palaeoclimatic reconstruction has recently been introduced to palaeoclimatic studies (e.g., Esper et al., 2005). As a benefit, the variance of the reconstructed climate is not artificially reduced by LSC, a pitfall that may occur in poorly fitting regression. The apparent disadvantages of the method are the absence of statistical confidence intervals and the incapacity of LSC to modify the autocorrelation structure of the tree-ring series.
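The scaling step can be sketched as follows. This is an illustrative reading of linear scaling as matching the mean and standard deviation over the calibration period, not the authors' code; the function name `linear_scaling` and its argument names are assumptions.

```python
import numpy as np

def linear_scaling(proxy, proxy_calib, target_calib):
    """Rescale `proxy` so that, over the calibration years, its mean and
    standard deviation equal those of the observed temperatures."""
    proxy = np.asarray(proxy, dtype=float)
    p = np.asarray(proxy_calib, dtype=float)    # proxy values in calibration years
    t = np.asarray(target_calib, dtype=float)   # observed temperatures, same years
    scale = t.std(ddof=1) / p.std(ddof=1)       # ratio of standard deviations
    return (proxy - p.mean()) * scale + t.mean()
```

Because this is a single affine transformation with a positive slope, it changes the level and amplitude of the series but leaves its year-to-year pattern, and hence its autocorrelation, untouched.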
The nonlinear transfer functions explored were based on ANN. Training of the ANN was performed here using a feedforward ANN and the back-propagation algorithm with the gradient descent method without regularization (Bishop, 1996; Gorban, 1998) for the approximation of the function in which the input data included five tree-ring values (TRW): those of the two previous years (t−2, t−1), the year concurrent with climate (t), and the two following years (t+1, t+2). The output of ANN was the mid-summer temperature T during the year t. The linear approximator Y ≈ AX, where A is a constant matrix, is the theoretical analogue to MLR. That is how the function F could be seen to bridge the gap between the linear and nonlinear transfer functions, MLR and ANN, respectively.
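A feed-forward network trained by back-propagation and plain gradient descent, without regularization, can be sketched as below. The source does not state the architecture, so the single hidden layer, the tanh activation, the number of hidden neurons, the learning rate and the number of epochs are all assumptions chosen for illustration; `train_ann` and `predict_ann` are hypothetical names. Inputs are expected as an array with five columns (ring widths at t−2 to t+2), ideally standardized beforehand.

```python
import numpy as np

def train_ann(X, y, hidden=3, lr=0.05, epochs=2000, seed=0):
    """One-hidden-layer feed-forward net, batch gradient descent on
    half the mean squared error; no regularization (all sizes assumed)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n, m = X.shape
    W1 = rng.normal(scale=0.5, size=(m, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden-layer activations
        out = H @ W2 + b2                   # linear output neuron
        err = out - y                       # residuals of the current fit
        gW2 = H.T @ err / n                 # gradient for output weights
        gb2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1.0 - H ** 2)  # back-propagate through tanh
        gW1 = X.T @ dH / n                  # gradient for hidden weights
        gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1      # gradient descent step
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2

def predict_ann(params, X):
    """Forward pass F(w, X) for the trained weights."""
    W1, b1, W2, b2 = params
    return (np.tanh(np.asarray(X, dtype=float) @ W1 + b1) @ W2 + b2).ravel()
```

With no regularization and no early stopping, such a network can fit the calibration data increasingly closely, which is why comparison against withheld verification data matters for this type of model.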
One more transfer function to be explored was a hybrid MLR. Guiot (1985) already suggested improving the quality of dendroclimatic and palaeoclimatic reconstructions by calibrating the short and long time-scales separately. This would make sense due to a possibility that tree-ring variability may not be related to climate proportionally on different time-scales (Guiot, 1985). In this method, the different frequency bands are first extracted from the total variability, calibrated separately and finally combined into one palaeoclimate reconstruction. A simple way to separate the different frequencies is to apply some type of "low-pass" filter to the time-series. In so doing, the filtered (smoothed) values would act as the low-frequency component whereas the residuals from the filter would act as the high-frequency component of the total variability (Guiot, 1985; Osborn and Briffa, 2000; Rutherford et al., 2005). Here we applied a cubic smoothing spline (Cook and Peters, 1981) having a 20-year frequency response with 50 percent cut-off to the climate and tree-ring series. Thus, the spline provided a filter to isolate the different high- and low-frequency components from the total variance time series ($S_{TOTAL}$) as:

$$S_{TOTAL} = S_{LF} + S_{HF},$$

where $S_{LF}$ was the low-frequency (long-term and long-period) component of the variability in the climate and proxy series (spline curve) and $S_{HF}$ was the high-frequency (short-term and short-period) variability as residuals (by subtraction) from the low-frequency curve. Initial reconstruction of the low-frequency variability utilized the concurrent (t) tree-ring value as a solitary predictor. The reconstruction of the high-frequency variability was done as a function of tree-rings of lagging (t−2, t−1), concurrent (t) and leading (t+1, t+2) growth years, similarly to conventional MLR.

Ann. Geophys., 27, 1097–1111, 2009, www.ann-geophys.net/27/1097/2009/

All the types of transfer functions were calibrated (trained) over the late meteorological period AD 1876–1998. The quality of each calibration model was tested using Pearson correlation (R) and the root mean square error (RMSE). The latter is a measure of the difference (model error) between the observed and modelled values and is defined by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(O_t - P_t\right)^2}, \qquad (5)$$

where $O_t$ is the observed temperature in year t, $P_t$ is the reconstructed temperature in year t and N is the number of years in the period that was used to calculate the statistic.
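The two calibration statistics can be computed as in the following short sketch, which assumes aligned arrays of observed and reconstructed temperatures for the same years; the function names are illustrative.

```python
import numpy as np

def pearson_r(obs, pred):
    """Pearson correlation between observed and modelled values."""
    return float(np.corrcoef(obs, pred)[0, 1])

def rmse(obs, pred):
    """Root mean square error: square root of the mean squared difference."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))
```

The two measures are complementary: correlation is insensitive to errors in mean level and amplitude, whereas RMSE penalizes both.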

Verification of models
Subsequent to calibration of the model, its veracity was tested using the independent data that was not used in the training process. Verification of the model using withheld data is of special importance as the comparison between the calibration and verification statistics may reveal potential over-fit (that is, depressed verification statistics contrasted with inflated calibration statistics). In order to verify the transfer functions, they were applied over the period AD 1802–1875 and several verification tests were used to quantify the different predictions. Verification was performed using Pearson correlation, the root mean square error, first difference sign test, reduction of error and coefficient of error statistics. We also compared the means and standard deviations of the observed and predicted temperatures over the verification period. Reduction of error (RE) and coefficient of error (CE) statistics (Fritts, 1976; Briffa et al., 1988) are defined as:

$$\mathrm{RE} = 1 - \frac{\sum_{t=1}^{N}\left(o_t - p_t\right)^2}{\sum_{t=1}^{N}\left(o_t - \bar{O}\right)^2}, \qquad (6)$$

$$\mathrm{CE} = 1 - \frac{\sum_{t=1}^{N}\left(o_t - p_t\right)^2}{\sum_{t=1}^{N}\left(o_t - \bar{o}\right)^2}, \qquad (7)$$

where $o_t$ is the observed temperature in year t, $p_t$ is the predicted (reconstructed) temperature in year t and $\bar{O}$ and $\bar{o}$ are the means of the observed temperatures over the calibration and verification periods, respectively. N is the number of years in the verification period. Both statistics (RE and CE) have a maximum value of 1.00, indicating a perfect fit between the observed and predicted time-series. Any positive value of RE and CE indicates that the reconstruction has some predictive skill.
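A minimal sketch of the two skill scores follows, using the standard formulation in which the squared prediction errors over the verification period are compared with the squared departures of the observations from the calibration-period mean (RE) or the verification-period mean (CE). The function names are illustrative.

```python
import numpy as np

def reduction_of_error(obs_ver, pred_ver, calib_mean):
    """RE: skill relative to always predicting the calibration-period mean."""
    o = np.asarray(obs_ver, dtype=float)
    p = np.asarray(pred_ver, dtype=float)
    return float(1.0 - np.sum((o - p) ** 2) / np.sum((o - calib_mean) ** 2))

def coefficient_of_error(obs_ver, pred_ver):
    """CE: skill relative to always predicting the verification-period mean."""
    o = np.asarray(obs_ver, dtype=float)
    p = np.asarray(pred_ver, dtype=float)
    return float(1.0 - np.sum((o - p) ** 2) / np.sum((o - o.mean()) ** 2))
```

CE is the stricter of the two: the verification-period mean minimizes the denominator, so CE can never exceed RE, and a reconstruction whose mean level is offset from the observations is penalized heavily by both.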
The first difference sign test (FDST) is a verification statistic that gives the ratio of correct and incorrect estimates of temperature change from one year to the next (Fritts, 1976). This statistic can thus be considered as a special measure of goodness of fit between the observed and reconstructed values for high-frequency variation.
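The counting behind the sign test can be sketched as follows. The handling of years with no change is an assumption made here (a zero difference counts as correct only when both series show zero change); the source does not specify it, and the function name is illustrative.

```python
import numpy as np

def first_difference_sign_test(obs, pred):
    """Count year-to-year changes whose direction is reproduced (C) or not (I)."""
    d_obs = np.sign(np.diff(np.asarray(obs, dtype=float)))
    d_pred = np.sign(np.diff(np.asarray(pred, dtype=float)))
    correct = int(np.sum(d_obs == d_pred))   # same direction of change
    incorrect = int(len(d_obs) - correct)    # opposite or mismatched direction
    return correct, incorrect
```

For example, observations `[1, 2, 1, 3]` against predictions `[0, 1, 2, 3]` give two correct and one incorrect sign, since only the middle decrease is missed.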

Comparisons over the pre-instrumental period
Subsequent to calibration and verification of the transfer functions, the palaeotemperature variability was reconstructed for the region since AD 800 using those models that passed the verification tests. The importance of this time period lies in its climatic characteristics due especially to the multi-centurial warm and cool intervals often called the Medieval Warm Period (MWP) (Lamb, 1965; Crowley and Lowery, 2000; Bradley et al., 2003) and the Little Ice Age (LIA) (Robock, 1979; Grove, 1988; Bradley and Jones, 1993; Matthews and Briffa, 2005), respectively. A previous study that made use of the identical set of tree-rings reconstructed MWP and LIA summer temperatures in the region using MLR only (Helama et al., 2007b, 2009). The predetermined intervals for the MWP and the LIA were AD 930–1180 and AD 1601–1850, respectively, during which the intensification of these climatic reversals could be dated in the region (Helama et al., 2007b, 2009).

Fig. 1. Autocorrelation functions from first to tenth order for tree-rings, observed temperatures (mean July temperatures), and reconstructed temperatures using multiple linear regression (MLR), artificial neural networks (ANN1 in Table 1) and linear scaling (LSC). Autocorrelations were computed using the common interval 1876–1998.
The year AD 1601 is known as one of the most severe volcanic signature years of the past centuries, characterized by a cooling over the Northern Hemisphere due to the eruption of Huaynaputina (Peru) that had occurred in the previous year (Briffa et al., 1998; de Silva and Zielinski, 1998). This signature year was previously found in the tree-rings of the study region (Lindholm and Eronen, 2000; Helama et al., 2002, 2005a). We compared the skill of different transfer function models to reconstruct the temperature change of extreme amplitude during that year. In so doing, the AD 1601 temperature anomaly was calculated relative to the mean temperature of the calibration period for each model.

Autocorrelation structures
Autocorrelation functions (ACF) were first computed for the tree-ring chronology and the instrumental temperatures from first to tenth order. It was found that the annual values of tree-rings were serially correlated (Fig. 1). The highest coefficient of autocorrelation was reached at lag one, with subsequently declining autocorrelations as a function of increasing lag. Moreover, ACFs of tree-rings and climate were clearly different. In general, the temperature record showed considerably lower autocorrelation than tree-rings. Relatively high coefficients were found for climate at lags three and seven (Fig. 1).
MLR- and ANN-based reconstructions showed ACFs that greatly mimicked the ACF of instrumental temperatures (Fig. 1). Strikingly, the MLR- and ANN-based climate reconstructions showed high coefficients at lags three and seven. Overall, ACFs of these reconstructions showed positive and negative coefficients with similar lags compared to the ACF of instrumental temperatures. By contrast, LSC did not introduce a similar beneficial change into ACF (Fig. 1). Actually, the ACF of LSC-based temperature estimates remained unaltered and was thus identical to that of the original tree-ring chronology (Fig. 1).
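The autocorrelation comparison can be reproduced in outline with the common sample estimator below. This is a sketch under the assumption of a gap-free annual series; the function name `acf` is illustrative and the exact estimator used in the study is not stated.

```python
import numpy as np

def acf(x, max_lag=10):
    """Sample autocorrelation coefficients for lags 1 .. max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean
    denom = np.sum(x ** 2)                # lag-0 sum of squares
    return np.array([np.sum(x[k:] * x[:-k]) / denom
                     for k in range(1, max_lag + 1)])
```

The estimator is unchanged by adding a constant to the series or multiplying it by a positive constant, which is why a linearly scaled chronology necessarily keeps the ACF of the original tree-ring chronology, whereas a regression on several lagged and leading values can reshape it.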

Calibration and verification of the models
Visual comparison of the instrumental and proxy-based climate variability did not provide any particular reason to directly disallow any of the reconstruction models (Fig. 2a). Moreover, all the different reconstructions correlated markedly well (Table 1a). Transfer functions (MLR, ANN and LSC) were further examined by comparing the similarity of reconstructed and observed temperatures over the dependent and independent time intervals by several statistics (Table 2). Judged by simple correlations and RMSE, the reconstruction by LSC had a notably lower skill of calibration than the reconstructions by MLR and ANN. Conversely, both MLR and ANN provided reduced amplitudes of reconstructed temperature variability compared to LSC, which preserved the temperature variance reasonably with identical amplitudes for total variability over the calibration period (see SD_total for calibration period in Table 2). This feature could be seen as a particular benefit of using LSC.

Table 1. Correlations (AD 802–1998) between the reconstructions by multiple linear regression (MLR), different algorithms of artificial neural network (ANN), linear scaling (LSC), and hybrid MLR. Correlations were calculated separately for total range of variability (a) and at decadal and multi-decadal time-scales (b). The latter time series were produced by smoothing them using 15-year cubic splines (with 50 percent cut-off). Mean inter-series correlations were 0.92 (a) and 0.90 (b).

(a) MLR ANN1 ANN2 ANN3 ANN4 ANN5 ANN6 ANN7 ANN8
Over the verification period, the reconstructions by MLR and ANN passed all the statistical tests and were thus validated as reasonable models of reliable reconstruction skill. Verification correlations were, in general, slightly higher for ANN than MLR, but the sign test showed improved skill by MLR. Meanwhile, the reconstruction by LSC failed when evaluated by RE and CE. This result was understandable since the LSC-based temperature level was approx. 1.5 °C lower than the instrumental mean temperatures over the verification period (Table 2).

Temperature variations at multi-decadal timescales
Further analyses were performed in order to reveal the skill of the different transfer function models in reconstructing the temperature variability over long time-scales, that is, at low frequencies. To do so, the records of observed and reconstructed temperatures (Fig. 2a) were smoothed using 15-year cubic spline functions to portray the climate variability at time-scales of decades and longer. The "low-frequency" filtered reconstructions produced this way showed high correlations (Table 1b). However, notable differences were observed in the amplitudes of the reconstructed temperature fluctuations over the instrumental period (Fig. 2b). MLR- and ANN-based reconstructions generated a low-frequency variability with reduced amplitudes compared to observed temperatures. This was true both for calibration and verification periods (judged by SD_low-f in Table 2). Conversely, the low-frequency temperature amplitudes were overestimated by LSC over both calibration and verification periods. In fact, the inflated temperature amplitudes (too low temperatures during the 19th century in particular) caused the failure of the LSC-based reconstruction as evidenced by verification statistics.

Table 2. Calibration, verification and reconstruction statistics for the three types of transfer functions, linear regression (MLR), different algorithms of artificial neural networks (ANN) and linear scaling (LSC) as well as observed temperatures. Statistics for calibration include Pearson correlation (denoted as R over calibration period and r for verification period), root mean square error (RMSE; Eq. 5) and mean (Mean) and standard deviation (SD_total) of the temperature variability. SD was further calculated from the data filtered by 15-year cubic splines to portray the low-frequency variability (SD_low-f). Additional statistics for verification include reduction of error (RE; Eq. 6), coefficient of error (CE; Eq. 7) and first difference sign test (FDST) with correct (C) and incorrect (I) differences. In addition, the temperature anomaly (relative to calibration mean) during the volcanic year AD 1601 as well as the mean temperatures for the Medieval Warm Period (MWP) and the Little Ice Age (LIA) and the reconstructed temperature difference from the former to the latter, were computed.

Table 3a (fragment). Warmest 25-year interval: 1918–1942, 1921–1945, 1921–1945, 1921–1945, 1921–1945, 1921–1945, 1921–1945, 1921–1945, 1921–1945, 1920–1944, 1921–1945. Coolest 25-year interval: 1601–1625, 1601–1625, 1601–1625, 1601–1625, 1767–1791, 1601–1625, 1695–1719, 1601–1625, 1601–1625, 1601–1625.

Palaeotemperature reconstructions since AD 800
All exploited models demonstrated the overarching warmness of the 1920s and 1930s in the region with the warmest reconstructed 25-year mean temperatures during these decades. The coolest 25-year intervals were reconstructed for AD 1601–1625 by nine of the eleven models (Table 3a).
Further examination was carried out by examining the reconstructed temperature amplitudes on centurial time-scales. Most of the models demonstrated the long-term warmness of the multi-centurial period from the 930s through the 1180s. Likewise, the coolest 250-year intervals were reconstructed for AD 1601–1850 or 1600–1849 by ten of the eleven models (Table 3b). That is, the comparisons between the warmest and the coolest 250-year periods and their reconstructed temperature amplitudes revealed the pre-anthropogenic climatic change from the Medieval Warm Period (MWP) towards the Little Ice Age (LIA) (Fig. 3). The majority of the models exhibited a multi-centurial temperature decline of 0.4 to 0.5 °C from MWP to LIA, albeit a few of the reconstructions by ANN appeared to underestimate this temperature change (Table 2). LSC provided a much larger amplitude of change (as high as 1.5 °C), a clear overestimation, but this was, in all likelihood, due to similarly inflated temperature amplitudes by this particular model evident already in the calibration-verification exercise (see Sect. 3.3). The volcanic signature year AD 1601 was reconstructed as a strongly negative temperature departure by all of the models (Table 2). The deviation of 4.4 °C from the mean of the calibration period by MLR could be compared with the average of 4.2 °C by the different models of ANN. The greatest temperature departure was produced by LSC but since the model did not pass the verification tests, the value of this estimate appears questionable.
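Locating the warmest and coolest intervals of a fixed length amounts to scanning running means over the reconstruction, as in the sketch below. It assumes a continuous annual series with matching `years` and `temps` arrays; the function name `extreme_windows` is hypothetical, and `width` would be set to 25 or 250 for the two comparisons described above.

```python
import numpy as np

def extreme_windows(years, temps, width=25):
    """Return ((start, end) of warmest window, (start, end) of coolest window)."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    # running means over every block of `width` consecutive years
    means = np.convolve(temps, np.ones(width) / width, mode="valid")
    i_warm = int(np.argmax(means))
    i_cool = int(np.argmin(means))

    def span(i):
        return int(years[i]), int(years[i + width - 1])

    return span(i_warm), span(i_cool)
```

If several windows tie for the extreme mean, `argmax` and `argmin` return the earliest one, so near-identical intervals such as AD 1600–1849 and AD 1601–1850 can be selected by small differences between models.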

Conventional and hybrid regression models
Finally, the hybrid MLR model was compared with the conventional MLR (Table 2). The hybrid model performed reasonably well over the calibration and verification periods and, importantly, passed all the verification statistics. Moreover, the hybrid MLR reconstructed the centurial cooling from MWP to LIA with an amplitude comparable to that of conventional MLR. Furthermore, the reconstructed amplitude was similar to or larger than the long-term temperature change by the different ANN algorithms. Compared to conventional MLR, the benefits of using hybrid MLR appeared negligible.

Tree-rings as proxy for climate
A number of previous papers have examined some of the problems attached to the process of transforming proxy variability into climatic reconstructions. These studies have especially dealt with the palaeotemperature reconstructions and the climate history of the past millennium. Methodologically, the comparisons have been made between regression and other linear methods (Osborn and Briffa, 2004; von Storch et al., 2004; Esper et al., 2005; Rutherford et al., 2005). While these works have considered the problems associated with hemispheric palaeoclimate reconstructions, their general conclusions could play an important role in interpreting proxy-climate relationships on smaller spatial scales. In general, it has been suggested that regression-based reconstructions result in reduced temperature change amplitudes due purely to methodology (Osborn and Briffa, 2004; von Storch et al., 2004; Esper et al., 2005). However, Rutherford et al. (2005) emphasized the primary importance of proxy quality for the skill of reconstruction. In their example, the hemispheric tree-ring network outperformed the multi-proxy dataset of similar geographical extent in warm-season verification tests (Rutherford et al., 2005). Compared to hemispheric reconstructions, the special feature of regional studies dealing with proxy-climate relationships is that they are carried out on restricted spatial scales and, therefore, the proxy record(s) can ideally be related directly to the climate series from nearby meteorological stations with no spatial complications. Therefore, the actual seasonal response of the proxy can be determined (here, the mean temperature of July) and the reconstruction can thus be derived using optimal meteorological variables. By contrast, those reconstructions that work on larger spatial scales and possibly make use of multiple proxies may actually be reflecting the underlying climate in a more complex manner due, for example, to arbitrarily selectable spatial domains of instrumental data and different seasonal sensitivity of different proxies to climate (Rutherford et al., 2005).
The present study exemplified the influence of different transfer function algorithms in reconstructing regional summer temperatures using a given set of forest-limit tree-rings. Conventional MLR was compared to ANN and LSC as well as to a hybrid MLR. Below, we discuss the current findings in the context of previous palaeoclimatic studies with special emphasis on dendroclimatology and climate history over the past millennium.
A prominent divergence between the climatic and proxy autocorrelations (Fig. 1) is one of the key issues in understanding the requirements of dendroclimatic transfer functions. This issue relates to the fact that tree-rings are biogenic palaeoclimate proxies. Inevitably, the climate of the growth year (t) has the greatest influence on tree-ring formation during the same year, but the climatic impact is carried over a number of growth years (Fritts, 1976). Statistically speaking, consecutive tree-rings do not occur independently but the tree-ring value of the successive years can be predicted using the growth values from previous years (t−n) (Cook, 1985; Guiot, 1986).
The ACF of instrumental climate showed considerably lower autocorrelations than that of the tree-ring chronology (Fig. 1). This indicated that the higher serial dependence in tree-rings originates largely from non-climatic factors, at least some of which are biogenic. In MLR and ANN, the proxy autocorrelation can be taken into account by including tree-ring values from lagged (t−n) and/or leading growth years (t+n) in the transfer function model. As a result, the ACFs of reconstructions by MLR and ANN mimicked the ACF of instrumental climate (Fig. 1). By contrast, LSC did not modify the proxy autocorrelation structure. As a consequence, the palaeoclimate reconstruction by LSC was not expected to follow the year-to-year variability in climate as accurately as reconstructions by MLR and ANN; rather, the LSC-based palaeotemperatures were degraded by biogenic variability. Accordingly, LSC provided clearly lower correlations over the calibration and verification periods compared to MLR and ANN. An additional pitfall of LSC as a palaeoclimate transfer function is the potentially inflated error variance that is eliminated in MLR (Esper et al., 2005) and ANN (this study). Simple scaling of differently autocorrelated series, i.e., those of tree-rings and actual temperature records, reinforced the failure of LSC in the statistical verification tests (Table 2) and caused an especially clear divergence between the low-frequency variability of observed and reconstructed climate in the case of LSC.
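The contrast between scaling and a lagged regression can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the synthetic "temperature" and "ring" series, the carry-over coefficient of 0.6, the noise level, the use of a single lagged predictor (t−1) instead of the several lags and leads used in the study, and all function names. Scaling is an affine transform and therefore leaves the proxy autocorrelation unchanged, whereas a regression on the concurrent and the previous ring can remove the carry-over.

```python
import random
from math import sqrt


def mean(x):
    return sum(x) / len(x)


def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    m = mean(x)
    num = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return num / sum((v - m) ** 2 for v in x)


def corr(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def linear_scale(proxy, target):
    """Rescale the proxy so that its mean and standard deviation match the target."""
    pm, tm = mean(proxy), mean(target)
    ps = sqrt(mean([(v - pm) ** 2 for v in proxy]))
    ts = sqrt(mean([(v - tm) ** 2 for v in target]))
    return [tm + (v - pm) * ts / ps for v in proxy]


def regress_two(y, x1, x2):
    """Least-squares fit of y = a + b1*x1 + b2*x2 via the 2x2 normal equations."""
    my, m1, m2 = mean(y), mean(x1), mean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Synthetic data (hypothetical): serially independent summer temperatures and a
# ring-width index that carries 60 percent of the previous year's value forward.
random.seed(1)
n = 400
temp = [13.0 + random.gauss(0.0, 1.0) for _ in range(n)]
ring, prev = [], 0.0
for t in range(n):
    prev = 0.6 * prev + (temp[t] - 13.0) + random.gauss(0.0, 0.5)
    ring.append(prev)

y, ring_now, ring_prev = temp[1:], ring[1:], ring[:-1]
a, b1, b2 = regress_two(y, ring_now, ring_prev)
rec_mlr = [a + b1 * p + b2 * q for p, q in zip(ring_now, ring_prev)]
rec_lsc = linear_scale(ring_now, y)

print("lag-1 ACF  proxy / LSC / MLR:", acf(ring_now, 1), acf(rec_lsc, 1), acf(rec_mlr, 1))
print("correlation with temperature  LSC / MLR:", corr(rec_lsc, y), corr(rec_mlr, y))
```

In this toy setting the scaled series inherits the lag-1 autocorrelation of the rings exactly, while the fitted coefficient on the previous ring is negative and pulls the autocorrelation of the regression-based series towards that of the temperature series.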
Notably, the neglect of non-climatically induced autocorrelation in tree-rings (or any other type of proxy series) may lead to misinterpretations about the quality of the proxy. For example, Klingbjer and Moberg (2003) performed a dendroclimatic reconstruction of summer temperatures for Tornedalen (northern Sweden) using Scots pine tree-ring widths (Grudd et al., 2002), making use of only the concurrent year (t) as a single predictor for temperature. As could be expected, they obtained only weak skill (R² = 0.18) for their reconstruction. McCarroll et al. (2003) performed a multiproxy comparison in northern Finland using the annual production history of Scots pine quantified as earlywood, latewood and annual ring widths, earlywood, latewood and maximum densities, stable carbon isotope ratios, height growth, needle production and finally pollen deposition. Among these data, first order serial correlations as high as 0.45-0.66 were found for some of the pine production histories; however, autocorrelation was not taken into account in evaluating the skills of different proxies to reconstruct temperatures (McCarroll et al., 2003). It could be hypothesized that the inclusion of the leading and/or lagging tree-ring values in the transfer functions could have resulted in palaeoclimatic models with markedly increased accuracy in both of these previous studies (Klingbjer and Moberg, 2003; McCarroll et al., 2003).

Calibration-verification process
Previous dendroclimatic studies have emphasized the particular danger of a greater degree of over-fit with ANN compared to linear models (Woodhouse, 1999; Zhang et al., 2000). As a consequence of over-fit, the model that performs well over the calibration period may not prove equally reliable over the verification interval. In this study, the reconstructed and observed climate showed slightly lower correlations over the verification than calibration period, but no clear difference in this regard could be observed for MLR vis-à-vis ANN (Table 2). Universally, correlations are expected to decrease over the data withheld from calibration. With specific regard to dendroclimatology, the autocorrelation structure of tree-rings is known to vary slightly from time to time (Lindholm, 1996; Helama et al., 2002; Berninger et al., 2004). This variation may result in a slight degradation of correlation between observed and reconstructed climate variability, particularly over the verification period.
Previously, Woodhouse (1999) compared linear- and ANN-based transfer functions in reconstructing spring precipitation for the Colorado Front Range for the past 300 years. She found that although the ANN models explained more climatic variance over the calibration period, they did not perform as well over the withheld period as the linear model. The results indicated a potential danger of over-fitting by ANN (Woodhouse, 1999). By contrast, D'Odorico et al. (2000) used the tree-ring dataset of Stahle et al. (1998), which originated along the Blackwater and Nottoway rivers in south-eastern Virginia, and obtained a considerably improved model with ANN compared to MLR for their reconstructions of the droughts. Ni et al. (2002) made use of MLR and ANN to reconstruct cool-season precipitation by tree-rings in Arizona and New Mexico over the past millennium. In their study, ANN models did not outperform MLR. However, ANN was found to better capture inter-annual and spatial variability and large-scale precipitation events at seasonal time-scales. In conclusion, two out of three previous studies found ANN notably advantageous over the linear alternative for reconstruction. Interestingly, the previous works have been based on moisture-sensitive tree-rings (Stahle et al., 1998; Woodhouse, 1999; D'Odorico et al., 2000; Ni et al., 2002) whereas our tree-rings were controlled by temperature variability.
The fact that the ANN models did not perform clearly better than MLR over the instrumental period (Table 2) could potentially be associated with the biogeographical origin of the proxy. Dendroclimatic material used here originated from high-latitude forest-limit areas with low growing season temperatures and harsh climate. Moreover, these are areas characterized by wide inter-tree spacing compared to forest interior or closed canopy environments (e.g., Veijola, 1998). Previous studies comparing pine growth at the polar forest limits and in more southern locations in Finland have shown that the dendroclimatic signal becomes noisier towards the southern areas where population dynamics are more likely to complicate the growth-climate relationships compared to northern forest-limit conditions (Lindholm et al., 2000; Helama et al., 2005a). The climate-proxy relationships may thus be simplified in the study region for biogeographical reasons.
The results of this study highlight the difficulty of choosing the transfer function model based simply on calibration-verification tests, as only small differences in performance occurred between most of the models (Tables 1 and 2). More precisely, the reconstructions by MLR and ANN clearly outperformed LSC, but models by MLR and ANN did not show clear differences over the instrumental climate period. Moreover, the algorithms providing the best fit between the observed and reconstructed climate did not in every case reproduce the climatic amplitudes as well as algorithms with poorer skill by R, RMSE, RE, CE and FDST statistics. In addition, there were several palaeoclimatic models correlating with climate with very high coefficients (Table 1). Parallel problems of choosing the best model were previously evidenced by Rutherford et al. (2005) for the hemispheric multi-proxy and tree-ring datasets.
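For readers less familiar with the verification statistics named above, the following sketch computes RMSE, the reduction of error (RE), the coefficient of efficiency (CE) and the first difference sign test from their commonly used textbook definitions. It is an illustration under stated assumptions, not the study's implementation: the function names and the toy numbers are hypothetical, and the handling of ties in the sign test (counted here as incorrect) is an assumption.

```python
from math import sqrt


def verification_stats(obs_cal, obs_ver, rec_ver):
    """RMSE, RE and CE of a reconstruction over the verification period.

    RE compares the squared error with that of a constant prediction equal to
    the calibration-period mean; CE uses the verification-period mean instead.
    """
    n = len(obs_ver)
    sse = sum((o - r) ** 2 for o, r in zip(obs_ver, rec_ver))
    cal_mean = sum(obs_cal) / len(obs_cal)
    ver_mean = sum(obs_ver) / n
    re = 1.0 - sse / sum((o - cal_mean) ** 2 for o in obs_ver)
    ce = 1.0 - sse / sum((o - ver_mean) ** 2 for o in obs_ver)
    return {"RMSE": sqrt(sse / n), "RE": re, "CE": ce}


def first_difference_sign_test(obs, rec):
    """Count year-to-year changes on which observed and reconstructed series agree in sign.

    Assumption: a zero change in either series is counted as incorrect.
    """
    correct = incorrect = 0
    for i in range(1, len(obs)):
        if (obs[i] - obs[i - 1]) * (rec[i] - rec[i - 1]) > 0:
            correct += 1
        else:
            incorrect += 1
    return correct, incorrect


# Toy numbers (hypothetical, not data from the study).
obs_cal = [12.0, 14.0, 13.0, 13.0]   # observed, calibration period
obs_ver = [12.0, 13.0, 15.0, 14.0]   # observed, verification period
rec_ver = [12.5, 13.5, 14.0, 13.5]   # reconstructed, verification period

stats = verification_stats(obs_cal, obs_ver, rec_ver)
signs = first_difference_sign_test(obs_ver, rec_ver)
print(stats, signs)
```

Because the verification-period mean minimizes the squared deviations of the verification observations, CE can never exceed RE, which makes CE the more demanding of the two. Neither statistic rewards a model for reproducing the correct amplitude, which is one reason why the best-fitting models need not be the ones with the most realistic low-frequency variance.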

Implications for palaeoclimatology
Artificial suppression of climatic amplitudes in the proxy-based reconstructions could lead to a serious underestimation of the natural variability of the climate; this in turn may cause false detections of climate change with erroneously low uncertainty estimates in future climate predictions (Collins et al., 2002). In general, greater palaeoclimatic variations predict greater amplitudes of upcoming climatic changes (Osborn and Briffa, 2004). Our reconstructions covered the past millennium throughout the classical climate intervals of the Medieval Warm Period (MWP) and the Little Ice Age (LIA). Several studies have discussed the difficulties and uncertainties in characterizing climate variability over these intervals simply due to differences in palaeoclimatic transfer function models (Osborn and Briffa, 2004; von Storch et al., 2004; Esper et al., 2005; Rutherford et al., 2005). In previous studies, the climatic history of the study region was shown to carry imprints of the MWP and LIA (Helama et al., 2007b, 2009). Here we used the same tree-ring chronology as the previous work, but the current comparison included competing transfer function models, based on MLR, ANN and LSC, as well as a hybrid MLR. The present study thus provided the first possibility for model comparisons over these intervals in the region with particular focus on retrievable low-frequency variations in the reconstructed temperatures (Table 2, Fig. 2).
The multi-century temperature decline from the MWP into the LIA was reconstructed as a long-term change approximating 0.5 °C by conventional and hybrid MLR as well as by some of the ANN models, on a bicenturial scale. We could thus state that MLR reproduced the multi-centurial temperature amplitudes with no clear pitfalls with respect to ANN. While it is not possible to estimate the reliability of the reconstructed multi-centurial temperature variations by instrumental comparison, the fact that all MLR and ANN models underestimated the observed temperature variations over the instrumental period indicates that the reconstructed change of half a degree Celsius from MWP to LIA is likely an underestimate of the actual temperature change.
Performance of the different reconstructions during the volcanic signature year AD 1601 (Briffa et al., 1998) revealed clear differences between the MLR and ANN models (Table 2). This negative tree-ring anomaly was consistent with the previous estimates from the region and adjacent areas (Lindholm and Eronen, 2000; Gervais and MacDonald, 2001; Helama et al., 2005). In general, the reconstruction by conventional MLR implied slightly more anomalous temperatures than the hybrid MLR or ANN models. As a matter of fact, the year AD 1601 was reconstructed as 1.3 °C cooler than the coolest year within the calibration period (the mean temperature of July in AD 1903 was reconstructed as being 10.1 °C by MLR). While the comparison between the alternative reconstructions demonstrates the difficulty in estimating past temperatures that likely occurred outside the range of calibration data, it also points to the robustness of conventional and hybrid MLR models against ANN during periods of stochastic extremes.
The competitiveness of the MLR over the calibration, verification and reconstruction periods indicates that previous interpretations of palaeotemperatures in the region based on that method (e.g., Helama et al., 2007b, 2009) were not skewed despite the linearity of the utilized transfer function. As hypothesized above, the nonlinear algorithms may improve the performance of palaeoclimatic reconstructions in the presence of more complex proxy-climate relationships. Such a potential could be exploited for example in more southern regions where forest-interior tree-rings can be utilized for precipitation reconstructions (Helama and Lindholm, 2003). Other regions and proxies that should be examined include bivalve and coral sclerochronologies and fish otoliths, which are known to provide similar year-to-year proxy records of past climates but which grow in different trophic conditions and completely different types of ecosystems (Strom et al., 2004; Black et al., 2005; Helama et al., 2006, 2007a).

Conclusions
Palaeotemperature reconstructions were derived from a dataset of subfossil and recent (living-tree) tree-ring widths from the beginning of the ninth century AD. The data originated from the forest-limit region of northernmost Finnish Lapland. Different approaches transforming the proxy variability into climatic signal were compared. The tree-ring chronology was identical in the case of each transfer function algorithm so that the differences between the reconstructions were attributable to the transformation technique alone. The transfer functions included multiple linear regression (conventional and hybrid MLRs), linear scaling (LSC) and artificial neural networks (ANN). The aim was to compare the linear (MLR and LSC) and nonlinear (ANN) methods for palaeoclimatic reconstructions.
Conventional MLR performed competitively over both the calibration and verification periods. Moreover, MLR produced centurial to sub-millennial temperature variability with comparable and even slightly larger reconstructed climate amplitudes than the more sophisticated and evolved proxy-climate solutions by ANN. The hybrid MLR reconstruction did not markedly improve the calibration-verification statistics. Similarly, this method did not provide larger temperature amplitudes over the instrumental or reconstruction periods.
The results emphasized the importance of using the lagging and leading proxy values in the transfer functions in order to adjust the autocorrelation structure of the original proxy record to better mimic the climatic autocorrelation structure. Without this adjustment, LSC failed over the calibration and especially during the verification period. Moreover, LSC provided an overestimate of the 19th century cooling. As a result, the verification statistics suggested rejecting this model. Any further interpretations based on this method could thus not be recommended in the case of the present sample.
Although the "real world" proxy-climate relationships could be expected to behave more or less nonlinearly, it was in particular the linear model (MLR) that provided the most robust reconstructions without significant loss of reconstruction skill. This may be due at least partly to the characteristics of the present tree-ring sample that originated from the harsh conditions of the high-latitude forest limits, which may have simplified the growth-climate relationships. Timberlines are the areas where the trees grow with relatively wide inter-tree space in comparison to the forest interior where supposedly more complex growth-climate associations could take place due to greater population dynamics. Consequently, the forest limits may be the regions over which the climate-proxy relationships may be simplified for natural reasons. Although the MLR proved adequate for climatic reconstruction in the present case, we hypothesize that the more sophisticated and adaptive techniques of ANN may well provide improved reconstruction skill in some other environment or proxy with more complex climate-proxy relationships. As such, our results emphasized the importance of examining the transfer function models separately for different regions, proxy types and time-scales.

Fig. 1. Autocorrelation functions from first to tenth order for tree-rings, observed temperatures (mean July temperatures), and reconstructed temperatures using multiple linear regression (MLR), artificial neural networks (ANN1 in Table 1) and linear scaling (LSC). Autocorrelations were computed using the common interval 1876-1998.

Fig. 2. Comparison of the observed (JulyT, light grey line) and reconstructed temperature variability in the study region by multiple linear regression (MLR, thicker black line), different algorithms of artificial neural networks (ANN, thinner black lines) and linear scaling (LSC, dark grey line) over the calibration and verification periods. Temperature variability was compared separately for total range of variability (a) and at decadal and multi-decadal timescales (b). The latter time series were smoothed using 15-year cubic splines (with 50 percent cut-off).

Fig. 3. Temperature reconstructions by multiple linear regression (MLR, thick grey line) and different algorithms of artificial neural networks (ANN, thin black lines) over the past millennium. Records spanned the long-term climatic reversals through the Medieval Warm Period (occurred between AD 930 and 1180) and the Little Ice Age (occurred between AD 1601 and 1850). Series were smoothed using 150-year cubic splines (with 50 percent cut-off) in order to portray the century-scale variability and the first and last 25 years were omitted to avoid spurious end-fits. The mean of interseries correlations for shown records was 0.95. Temperature reconstruction by scaling did not pass the verification tests and thus is not shown.

Table 3. Timing of the warmest and the coolest 25-year (a) and 250-year (b) periods as reconstructed by linear regression (MLR), different algorithms of artificial neural networks (ANN) and linear scaling (LSC).