Reconstruction of geomagnetic activity and near-Earth interplanetary conditions over the past 167 yr – Part 2: A new reconstruction of the interplanetary magnetic field

We present a new reconstruction of the interplanetary magnetic field (IMF, B) for 1846–2012 with a full analysis of errors, based on the homogeneously constructed IDV(1d) composite of geomagnetic activity presented in Part 1 (Lockwood et al., 2013a). An analysis of the dependence of the commonly used geomagnetic indices on solar wind parameters is presented, which helps explain why annual means of interdiurnal range data, such as the new composite, depend primarily on the IMF with only a very weak influence of the solar wind flow speed. The best results are obtained using a polynomial (rather than a linear) fit of the form $B = \chi \cdot (\mathrm{IDV(1d)} - \beta)^{\alpha}$ with best-fit coefficients χ = 3.469, β = 1.393 nT, and α = 0.420. The results are contrasted with the reconstruction of the IMF since 1835 by Svalgaard and Cliver (2010).


Introduction
This paper is the second of a series of three. As discussed in Part 1 (Lockwood et al., 2013a), for many years the only available long-term record of geomagnetic activity was the aa index, compiled for 1868-1968 by Mayaud (1971, 1972, 1980) and subsequently continued to the present day. It was also extended back to 1846 by Nevanlinna and Kataja (1993). The aa index shows long-term changes that cannot be attributed to site changes, nor to the intercalibration of stations, nor to changes in the sensitivity of the stations in both hemispheres (Cliver et al., 1998; Lockwood, 2003; Love, 2011). Stamper et al. (1999) analysed all the potential factors that could have induced the solar cycle variations and long-term drift in the aa index since the start of the space age and concluded that the only viable explanation was variation in near-Earth interplanetary space caused by changes in the solar corona.
Because the aa index derived during the space age correlates with both the southward component, $B_z$ (in a frame aligned to Earth's magnetic field), of the near-Earth interplanetary magnetic field (IMF), B, and the solar wind speed, V, observed by interplanetary spacecraft, Feynman and Crooker (1978) noted that the upward drift in the aa index over the first half of the 20th century reveals that averages of either $B_z$ or V, or both, had probably changed. The first paper to separate the influences of $B_z$ and V on the aa index was by Cliver et al. (1998), who used a combination of the aa index and sunspot number data on 11 yr timescales. Although it is the southward component of the IMF, $B_z$, that drives geomagnetic activity, on annual timescales the IMF orientation effect averages to an almost constant factor, such that the variations are controlled primarily by B. The uncertainty that this introduces into reconstructions of B will be discussed later in this paper. Separation of B and V on annual timescales using only the aa index was achieved by Lockwood et al. (1999), who employed the recurrence index of Sargent III (1986) (derived from aa) to remove the effect of solar wind speed. This can be done because recurrent fast streams in the ecliptic plane (emanating from either isolated low-latitude coronal holes or low-latitude extensions to polar coronal holes) elevate the annual mean V but also increase the 27 d recurrence in geomagnetic activity by generating co-rotating interaction regions (CIRs) on their leading edge. Lockwood et al. (1999) used the aa index to reconstruct the signed "open solar flux" (or "coronal source flux"), $F_S$, which is the total magnetic flux of one polarity leaving the top of the solar corona and entering the heliosphere.
Svalgaard and Cliver (2006) have argued that the recurrence index method ceased working after 2003, but this argument has been shown to be incorrect because the methods continue to accurately reproduce both the observed near-Earth IMF and open solar flux up to 2012 (Lockwood, 2013).
It is important to compare like-with-like when discussing centennial reconstructions of coronal and heliospheric magnetic fields and, in particular, to recognise that the open solar flux is not linearly related to the near-Earth IMF (Lockwood et al., 2009a). A linear fit of observed $F_S$ (derived either from the radial component of the IMF, $B_r$, measured by in situ magnetometers or from remote-sensing solar magnetograms) against B generates an intercept such that the open solar flux falls to zero at a non-zero value of B. This has, we believe, sometimes been confused with a "floor" value of the near-Earth IMF B (Svalgaard and Cliver, 2007, 2010) (hereafter SC07 and SC10, respectively). However, this is inherently illogical as no source for the near-Earth IMF other than the coronal source flux has ever been suggested. This being the case, when $F_S$ falls to zero B must also fall to zero and, given the values observed during the space age, we therefore know that B and $F_S$ cannot be linearly related. As pointed out by Lockwood et al. (2009a), the required non-linearity between B and $F_S$ is very well explained by a small rise in average solar wind speed V with increasing average B which, as predicted by the Parker spiral theory of the IMF (Parker, 1958, 1963), will cause the heliospheric field spiral to unwind (i.e. B to fall for a given $F_S$), thereby increasing the ratio $F_S/B$. Using combinations of geomagnetic indices that respond differently to B and V, Rouillard et al. (2007) showed that the rise in average B over the 20th century was indeed associated with a weak rise in average V. When the same parameters are compared there is considerable agreement between the various reconstructions of the IMF, the main differences being for early years when data sources are sparse and the sequences are more susceptible to errors in the data from one site and in how they have been processed.
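The unwinding argument can be illustrated with a minimal sketch of an ideal Parker spiral in the ecliptic at 1 AU. This is not part of the original analysis: the function name, the solar rotation rate and the use of a fixed radial field as a stand-in for fixed open flux (assuming $F_S$ is proportional to $|B_r|$ at 1 AU) are all assumptions made for illustration.

```python
import math

OMEGA_SUN = 2.865e-6  # rad/s, approximate sidereal equatorial rotation rate (assumed value)
AU = 1.496e11         # m, heliocentric distance of Earth

def spiral_field_magnitude(b_r_nt, v_kms, r_m=AU):
    """IMF magnitude (nT) for a given radial field under an ideal Parker spiral.

    Hypothetical helper: assumes a steady, radial solar wind of speed v_kms
    in the solar equatorial plane.
    """
    tan_psi = OMEGA_SUN * r_m / (v_kms * 1e3)  # tangent of the spiral (garden-hose) angle
    return abs(b_r_nt) * math.sqrt(1.0 + tan_psi ** 2)

# Fixed radial field (i.e. fixed open flux under the stated assumption):
# a faster wind unwinds the spiral, so the field magnitude B falls.
for v in (350.0, 450.0, 550.0):
    print(v, round(spiral_field_magnitude(3.0, v), 2))
```

Under these assumptions the ratio of radial field to B rises with V, which is the sense of the non-linearity described above.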
As discussed in Part 1, several new geomagnetic indices have been produced from hourly means of data or from hourly samples ("spot values") which were often recorded in observatory yearbooks. Three examples of this are the median index m, as implemented by Lockwood et al. (2006b) and used by Rouillard et al. (2007); the inter-hour variability (IHV) index designed by Svalgaard and Cliver (2007); and the interdiurnal variation (IDV) index introduced by Svalgaard and Cliver (2005) (hereafter SC05), and developed by Svalgaard and Cliver (2010) (hereafter SC10). These indices, and IDV in particular, have opened up the application of many historic data such that the aa index is no longer the only centennial record of geomagnetic activity. Comparisons of IDV from different stations show a considerable degree of agreement, increasingly so after about 1910 as instrumentation improved and clean and stable magnetic sites were found and protected. IDV was therefore a major and important new index which is usefully simple to construct. However, Part 1 highlights some limitations in the way IDV was constructed. Firstly, SC05 and SC10 moved from Bartels' original formulation, which was based on the difference between daily means of the horizontal field component H, to using only the near-midnight value. The advantage of this is that it can reduce the effects of variations in the regular diurnal variation of H. However, this advantage is outweighed by the loss of the suppression of instrumental and geophysical noise achieved by averaging 24 samples together, and also by the introduction of a strong influence of the auroral electrojet of the substorm current wedge (Lockwood, 2013). In addition, the dependence of IDV on the solar wind velocity varies with station location.
We refer to IDV as not being "homogeneously constructed" because it employs a mix of stations with different responses and that mix changes with time as the spatial distribution of stations available changes. This means the index response derived in the space age will be different to that in earlier years because it is not compiled using the same mix of stations. Similarly the m index was not homogeneously constructed. The derivations of m and IDV are discussed in Part 1.
The annual means of the version of IDV generated by SC05 for the space age correlate strongly with IMF B (correlation coefficient, r = 0.862, $r^2$ = 0.743, significance S = 99.37 %) and only very weakly with V (r = 0.094, $r^2$ = 0.009, S = 55 %). (Note that in this paper the significance, S, of each correlation is computed using the AR1 noise model, i.e. allowing for the autocorrelation functions of the time series at a lag of one time step, and also allowing for the degrees of freedom of the fit.) The version of IDV generated by SC10 uses more stations and raises this quoted correlation coefficient with B to r = 0.932 ($r^2$ = 0.869, S = 99.84 %). In Sect. 2 we show that the variations in IMF orientation in annual mean data limit the maximum possible correlation to 0.957, and hence the percentage of the variation of B that is not reproduced by the IDV of SC10 is $100(1 - 0.932^2) = 13.1$ % and of this $100(1 - 0.957^2) = 8.4$ % is due to the IMF orientation factor. In Sect. 2 below we confirm that IDV correlates with B only (i.e. with a negligible dependence on V) by correlating with $BV^n$ for a range of n and showing that the peak correlation is at an n value which is not significantly different from zero. This means that the separation of the effects of B and V required for range indices such as the aa index (and achieved by Lockwood et al., 1999 [...] a way of directly determining B which can be readily applied to a great deal of recorded historic data.
Fig. 1. Linear correlation coefficients of annual means of various geomagnetic indices with $BV^n$, as a function of n, the exponent of the solar wind speed, V (B is the IMF field strength). The primes denote the fact that data have been omitted in calculating either set of annual means if any of V, B or the geomagnetic index are missing because of data gaps exceeding 1 h duration. Values of $BV^n$ are computed hourly and then averaged. Table 2 gives the peak correlation $r_p$, the n value giving peak correlation $n_p$, and the significance of the peak correlation, $S_p$. Correlograms are shown for AL (red line), AU (green line), −Dst (blue solid line), the negative part of Dst, −Dst1 (where Dst1 is the same as Dst but intervals when Dst > 0 are treated as data gaps) (blue dashed line), aa (cyan), Ap (orange), IDV (black dashed), Am (mauve), m (yellow), IHV (red dashed), and the IDV(1d) index described in Part 1 (black solid line). For indices which are increasingly negative for increasing activity (Dst, Dst1 and AL) the index has been multiplied by −1.
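The variance bookkeeping quoted for the SC10 version of IDV can be checked with a few lines of arithmetic. The sketch below uses only the two correlation coefficients given in the text; the function name is introduced here for illustration.

```python
def unexplained_percent(r):
    """Percentage of the variance left unexplained by a correlation coefficient r."""
    return 100.0 * (1.0 - r ** 2)

r_idv = 0.932  # correlation of SC10's IDV with B (from the text)
r_max = 0.957  # ceiling set by the IMF orientation factor (from the text)

total = unexplained_percent(r_idv)        # variation of B not reproduced by IDV
orientation = unexplained_percent(r_max)  # part attributable to IMF orientation

print(round(total, 1), round(orientation, 1), round(total - orientation, 1))
# prints: 13.1 8.4 4.7
```

On these figures, a little under 5 % of the variation in B is left to be explained by anything other than the IMF orientation factor.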
Part 1 discusses how the quality of both IDV and m is naturally lower in early years when fewer stations contribute. This contrasts with the philosophy adopted by Mayaud (1971, 1972, 1980) in generating the aa index, which was to use a constant data type throughout (i.e. from one Northern and one Southern Hemisphere station at all times) and so generate a homogeneous sequence. In Part 1 a new homogeneous composite index of interdiurnal variability, termed IDV(1d), was generated that extends from 1845 to the present day. In Sect. 2 of the present paper we compare the response of various geomagnetic indices to interplanetary parameters and show that, like IDV, the new IDV(1d) composite depends primarily on the IMF B and can be used to reconstruct the IMF back to 1845. Section 3 presents the first full analysis of uncertainties in IMF reconstruction using a Monte Carlo technique. In Sect. 4, the results of this reconstruction are contrasted with that by SC10 using IDV, revealing that the results are very similar except for the early data. The conclusions and implications are discussed in Sect. 5.

The dependencies of different geomagnetic indices on solar wind parameters
Because the Biot-Savart law contains an inverse-square dependence on the distance between the moving charges and the point in question, the effects of closer currents dominate over more distant ones but all contribute. As a result, although the deflections seen by ground-based magnetometers usually reflect changes in the closer large-scale currents in the magnetosphere-ionosphere system (for example, the auroral electrojet for nightside stations in the auroral oval or the ring current for equatorial stations), there will also be some effects of other currents flowing elsewhere. Thus it is not, in general, accurate to ascribe geomagnetic activity seen at any one station, or combination of stations, to one current system alone. One well-recognised example of this is the Dst index, which shows large, long-lived negative deflections in storms caused by ring current enhancements but also shows positive deflections associated with magnetospheric compression by CME (coronal mass ejection) impacts, which reveal that it also responds to the Chapman-Ferraro currents flowing in the magnetopause.
The above discussion means that we should expect different geomagnetic indices to show different dependencies on solar wind parameters because they are influenced by different combinations of current systems and these current systems respond differently to changes in the solar wind forcing. Figure 1 studies the correlations between all the commonly used geomagnetic indices and $BV^n$, where B is the IMF, V is the solar wind speed and n is an exponent that is here varied between −2 and 4. The correlations are for annual means between 1966 and 2012, inclusive. The parameters are marked with a prime to denote the fact that in each case data have been omitted in computing both sets of annual means if any of the simultaneous (allowing for the predicted solar wind propagation lag) hourly means of B, V or the geomagnetic index are missing due to a data gap. In the case of the 3 h range indices such as aa, Am and Ap, the procedure adopted by Finch and  is followed to ensure only simultaneous geomagnetic and IMF data are included in the annual means. In the case of IDV(1d), each daily value contains information on H from two whole days: in order to be included in the annual means B′ and IDV(1d)′, we here require that there be 75 % coverage of the IMF observations over those two days. The value of 75 % is chosen as a compromise between not eliminating too much of the data and removing data for which the IMF means could be misleading because the data coverage is low. The effects of not carrying out this piecewise removal of data from both sets during data gaps were studied by Finch and : effectively, one is assuming that annual means are representative, even when large fractions of the data are missing (as they are in some years for the IMF data). We here only employ annual means that have data availability exceeding 50 %. In the study presented in Fig. 1, all the correlations are somewhat improved by taking these steps and, importantly, the n of peak correlation, $n_p$, is sometimes also affected. Table 1 gives $n_p$, the peak correlation $r_p$, the fraction of the variation explained by the peak correlation $r_p^2$, and the significance of the peak correlation $S_p$, for each case. Note that we only have annual mean data for IDV, and the way m is constructed only yields annual values, so no allowance for gaps in the interplanetary data can be made in these cases (hence there is no prime symbol attached to IDV or m in Fig. 1). The coupling functions $BV^n$ have been calculated in hourly data and then averaged, i.e. $\langle BV^n \rangle_{1\,\mathrm{yr}}$ is used rather than $\langle B \rangle_{1\,\mathrm{yr}} (\langle V \rangle_{1\,\mathrm{yr}})^n$.
The AL auroral electrojet index (red line) shows peak correlation at $n = n_p = 2$, i.e. it has a $BV^2$ dependence. The AU index (green line) gives a peak at $n_p$ = 1.1 (i.e. it has close to a BV dependence and hence varies with the interplanetary electric field). The Dst index shows a peak at $n_p$ = 0.4 (blue line) but some of this dependence on V arises from the compression of the equatorial field by solar wind dynamic pressure: if we use Dst1 (which is the same as Dst but treats all intervals where Dst > 0 as data gaps and so only contains intervals when Dst is dominated by ring current effects), we get the dashed blue line with a higher correlation coefficient peak at $n_p$ = 0.1. This peak is flat and hence the peak n is not significantly different from zero (i.e. a dependence on B alone). The cyan line is for the aa index and peaks at $n_p$ = 1.9 (very close to the $BV^2$ dependence of AL) and the orange dashed line is for Ap and peaks at $n_p$ = 1.6. The black line is for the IDV(1d) index, which is described in detail in Part 1 and which peaks at $n_p$ = −0.1. Hence IDV(1d), like Dst1, is not significantly different from having a dependence on B alone. This agrees with the assertion by SC10 that the negative part of Dst (i.e. ring current enhancement) is closest to explaining the behaviour of the interdiurnal variability indices on these annual timescales. The range indices aa, Am and Ap respond in a manner similar to the high-latitude ionospheric currents controlling AL and AU, as does the IHV index.
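The correlogram procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the code used for Fig. 1: the sample values, the 10 % gap rate and all function names are hypothetical. The synthetic index is built with a $BV^2$ response, so the scan should recover a peak near n = 2. The coupling function is formed sample by sample and then averaged, and any sample with a missing value is dropped from both averages.

```python
import math
import random

def pearson(x, y):
    """Linear correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def annual_means(samples, n):
    """Means of B*V**n and of the index, using only simultaneous samples."""
    good = [(b, v, g) for b, v, g in samples if None not in (b, v, g)]
    coupling = sum(b * v ** n for b, v, _ in good) / len(good)
    index = sum(g for _, _, g in good) / len(good)
    return coupling, index

def peak_exponent(years, exponents):
    """Exponent n giving the highest correlation, and that correlation."""
    best_n, best_r = None, -2.0
    for n in exponents:
        pairs = [annual_means(samples, n) for samples in years]
        r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if r > best_r:
            best_n, best_r = n, r
    return best_n, best_r

# Synthetic "years" of samples (hypothetical values, not observations).
rng = random.Random(1)
years = []
for _ in range(30):
    b0, v0 = rng.uniform(4.0, 9.0), rng.uniform(350.0, 550.0)  # nT, km/s
    samples = []
    for _ in range(200):
        b = b0 * rng.uniform(0.5, 1.5)
        v = v0 * rng.uniform(0.8, 1.2)
        g = b * v ** 2                  # index with a built-in BV^2 response
        if rng.random() < 0.1:
            b = None                    # simulated IMF data gap
        samples.append((b, v, g))
    years.append(samples)

exponents = [i / 10.0 - 2.0 for i in range(61)]  # n from -2 to 4 in steps of 0.1
n_p, r_p = peak_exponent(years, exponents)
print(n_p, round(r_p, 4))
```

Averaging the product rather than multiplying the averages matters because B and V covary within a year; the two quantities generally differ for n other than 0.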
An important insight into these differences in the peak correlation "coupling functions" comes from the work of Finch et al. (2008), who showed that a $V^2$ dependence is introduced into the variability of hourly mean magnetometer data only within the auroral oval on the nightside, i.e. it is associated specifically with the auroral electrojet and the substorm current wedge. Consideration of the effect of solar wind dynamic pressure on the geomagnetic tail shows that this is to be expected because the current in the near-Earth cross-tail current sheet (that is deflected into the auroral electrojet during substorm expansion phases in the substorm current wedge) depends on $V^2$ (see review by Lockwood, 2013). Lastly, we note that Rouillard et al. (2007) and Lockwood et al. (2009a) find the m index correlates with $BV^{0.3}$, despite it being a variant of an interdiurnal variation index. This appears to be because it uses a larger fraction of high-latitude stations which, as shown by Finch et al. (2008), introduces some dependence on V.
The larger differences in the responses of the various geomagnetic indices to mean solar wind speed shown in Fig. 1 have been shown to be statistically significant by Lockwood et al. (2009a) and Lockwood (2013). They are extremely useful as they mean that both B and V can be derived using different combinations of the indices (Svalgaard and Cliver, 2007; Rouillard et al., 2007; Lockwood et al., 2009). However, it should be noted that the correlograms in Fig. 1, particularly those with larger $n_p$, are somewhat flat-topped and so some differences in $n_p$ are not statistically significant. For example, the difference in the peaks for AL and AU does not have great statistical significance and so cannot be used to make robust reconstructions. The difference that has been exploited most is between interdiurnal variation indices (with $n_p$ ≈ 0) and range indices influenced by the substorm current wedge (with $n_p$ ≈ 2). Lockwood et al. (2009a) have demonstrated that the chance that this distinction is not real is low, but even in this case we can only reject the null hypothesis that the two actually share the same response to V at the 87 % confidence level.

Analysis of uncertainties and regression fits
All the geomagnetic indices shown in Fig. 1 result from solar wind-magnetosphere coupling which is dominated by the process of magnetic reconnection. As a result, all show a strong dependence on the southward IMF in the geocentric solar magnetospheric (GSM) reference frame. Many coupling functions have been used to quantify the solar wind energy, mass or momentum transfer into the magnetosphere and several use a term $B\sin^4(\theta/2)$, where B is the IMF field magnitude and θ is the IMF clock angle in the GSM frame (see review by Finch and ). Figure 2a shows a scatter plot of $B\sin^4(\theta/2)$ as a function of B for all hourly means of IMF data for 1966-2012 (inclusive), using the Omni-2 data set obtained from NASA's Omniweb service (http://omniweb.gsfc.nasa.gov/). There is a great deal of scatter because of the IMF orientation factor, $\sin^4(\theta/2)$, which varies between 0 for northward IMF in the GSM frame (giving the points on the x axis) and 1 for a purely southward field in GSM (giving the points on the line of slope 1). Taking annual means of these data we get Fig. 2b. Here the IMF orientation factor has averaged out to an almost constant factor and the annual means of $B\sin^4(\theta/2)$ are approximately proportional to B. The need to average out the $\sin^4(\theta/2)$ factor is why reconstructions of interplanetary conditions from geomagnetic activity must be restricted to annual or longer timescales. We note a small discontinuity in $\langle B\sin^4(\theta/2) \rangle_{1\,\mathrm{yr}}$ at around 1.75 nT ($\langle B \rangle_{1\,\mathrm{yr}}$ ≈ 5 nT) in Fig. 2. There is no known reason why this might occur and the number of annual mean samples in Fig. 2b is quite low. Using Student's t test we find no evidence that it is statistically significant and so we use a single linear regression on the whole data set.
If the discontinuity were to persist in future data, it would become statistically significant and a polynomial fit, or separate linear regressions for above and below $\langle B \rangle_{1\,\mathrm{yr}}$ = 5 nT, may become more appropriate than the single linear fit used to date. Figure 2b shows that the orientation factor is not quite constant, even on annual averaging timescales, and this gives some scatter which is the principal uncertainty in any reconstruction of B from any geomagnetic data. The linear correlation between $\langle B\sin^4(\theta/2) \rangle_{1\,\mathrm{yr}}$ and $\langle B \rangle_{1\,\mathrm{yr}}$ is 0.957 and this sets an upper limit to the correlation that can be obtained between annual means of the IMF and any geomagnetic index. If a geomagnetic index responds to $B\sin^4(\theta/2)$, then in order to derive the IMF B we are effectively multiplying the index by the ratio $f = B/[B\sin^4(\theta/2)]$ (either explicitly or implicitly). The distribution of this ratio for annual mean data is shown by the histogram in Fig. 3. Here, as elsewhere in this paper, we have required that data coverage exceed 50 % before an annual mean is considered valid. The mean value of the factor f is $\langle f \rangle$ = 3.251 and its standard deviation is $\sigma_f$ = 0.369. Hence multiplying by f introduces an uncertainty $\sigma_f/\langle f \rangle$ (i.e. just over 11 % at the 1σ level, even in annual means). The dashed line shows the best-fit Gaussian distribution of the same mean $\langle f \rangle$ and standard deviation $\sigma_f$ as the observed distribution of f. In addition, there will be some (small) measurement uncertainty in B. Analysis by King and Papitashvili (2011) used comparisons between data taken by different craft to place a maximum systematic error of 0.2 nT on the measurements of B; we here take ±0.2 nT to be the 2σ points of that error distribution.
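For concreteness, the orientation factor and the fractional uncertainty it introduces can be sketched as below. The clock-angle convention used (θ measured from GSM north in the Y-Z plane, so θ = atan2(By, Bz)) is the usual one but is stated here as an assumption, and the function name is introduced for illustration.

```python
import math

def orientation_factor(by_gsm, bz_gsm):
    """sin^4(theta/2), with theta the IMF clock angle in the GSM Y-Z plane."""
    theta = math.atan2(by_gsm, bz_gsm)  # 0 for due northward, pi for due southward
    return math.sin(theta / 2.0) ** 4

print(orientation_factor(0.0, 5.0))   # purely northward IMF: no coupling
print(orientation_factor(0.0, -5.0))  # purely southward IMF: maximum coupling

# Fractional 1-sigma uncertainty introduced by the factor f, using the
# mean and standard deviation quoted in the text.
f_mean, f_sigma = 3.251, 0.369
fractional_uncertainty = f_sigma / f_mean
print(round(100.0 * fractional_uncertainty, 2), "%")
```

The last line reproduces the "just over 11 %" figure quoted for annual means.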
We also need to consider the uncertainties in the IDV(1d) index, which at this time comes from the Eskdalemuir station (see Part 1). Figure 4 shows the distribution of the fit residuals for 1966-2012 derived from fitting SC10's IDV index to IDV(1d) (see Figs. 12 and 13 of Part 1). Because IDV is generated from a large number of stations at this time (> 50), we here assume that all of the fit residuals arise from errors in the IDV(1d) data and so we take the distribution in Fig. 4 to give an estimate of the uncertainty in the IDV(1d) values. Figure 5 shows a scatter plot of annual means of the IDV(1d)′ values against annual means of the IMF B′. The primes denote the fact that, as in Fig. 1, allowance has been made for data gaps in the IMF data sequence by removing both the IMF data and simultaneous geomagnetic data when the other is missing. The linear correlation of the annual means over the interval 1966-2012 between B′ and IDV(1d)′ is r = 0.933 (explaining $r^2$ = 0.871 of the variation and with significance S = 99.98 %) whereas between B and IDV(1d) (without allowance for data gaps) it is 0.914 (explaining $r^2$ = 0.835 of the variation, and with S = 99.91 %). Hence the piecewise removal of data corresponding to gaps in the interplanetary data series has made a (small) improvement in this case, raising $r^2$ and explaining an additional 3.6 % of the variation in B. However, note that only data from 1966 onwards were used because before then the interplanetary data [...]
Fig. 5. The effect of the drift in the magnetic latitudes due to the secular change in the geomagnetic field has also been accounted for. The primes denote that, before taking annual means, both IDV(1d) and B data have been removed if there is less than 75 % coverage of IMF monitoring during the relevant 2 d IDV(1d) interval, and annual means are only considered if there is at least 50 % data coverage in a year. The red line is the best linear fit and the grey area defines the 2σ uncertainty band on this fit, derived using a Monte Carlo technique, allowing for the uncertainties introduced by the IMF orientation factor $\sin^4(\theta/2)$ and the experimental uncertainties in both the IMF and the IDV(1d) index. The linear correlation coefficient of fitted and observed parameters is r = 0.933 and 10 of the 32 values (31 %) lie outside the predicted uncertainty limits.
[...] regression fit is very small and the residual and Q-Q plots show the residuals to be homoskedastic, drift-free and normally distributed (see Part 1 and Lockwood et al., 2006a). Passing these tests is necessary to determine if a regression fit is valid. However, we note a slight non-linearity in the data, particularly in the lowest values, which were observed during the last solar minimum, the lowest and longest seen during the space age (Lockwood, 2010). It is instructive to consider the implications of a linear fit given that it predicts that geomagnetic activity, as quantified by IDV(1d), will fall to zero if the IMF B falls through a threshold (in annual means) of about 3 nT. We know of no theoretical reason why this should be the case, in that the magnetic reconnection between the geomagnetic field and the IMF (which drives geomagnetic activity and is ultimately responsible for the observed correlation between B and IDV(1d) on annual timescales) is not predicted to cease below any such threshold value. However, we would expect that IDV(1d) might asymptotically approach a small base-level value (set by phenomena such as solar wind buffeting or Kelvin-Helmholtz waves) if the IMF fell towards zero (when the magnetic reconnection that gives solar wind energy and momentum access to the magnetosphere would cease). We here fit a polynomial, constrained to pass through a base-level IDV(1d) activity level β when B = 0. This therefore has the form
$$B = \chi \cdot (\mathrm{IDV(1d)} - \beta)^{\alpha} \qquad (1)$$
We then used the Nelder-Mead search method to find the best-fit coefficients χ, β, and α. Using this polynomial form and with piecewise removal of data during data gaps (i.e. using IDV(1d)′), the correlation between observed and best-fit B values predicted from IDV(1d) is raised to $r_p$ = 0.947 ($r^2$ = 0.896, S = 96.88 %), meaning that 90 % of the variation in annual means of IMF B is predicted by the simultaneous IDV(1d) (and of the remaining 10 %, 8.4 % is attributable to the IMF orientation factor f). The best fit to the data has coefficients χ = 3.469, β = 1.393 nT, and α = 0.420. Note that the improved fit is achieved by using more fit parameters which increases the degrees of freedom by three and so the significance S is decreased (such that whereas there is only a 0.1 % chance that the linear fit is a chance occurrence, this is raised to a 3 % chance for the polynomial fit). Comparison of Figs. 5 and 6 shows that over the range of B and IDV(1d) seen during the space age, the difference between the linear and polynomial fits is very small. In fact, the recent low minimum between cycles 23 and 24 means that the range of the composite IDV(1d) since 1845 is now only marginally greater than that seen during the space age and hence the difference between the polynomial and linear fits is not significant. However, there is no published physical quantification that shows this range cannot be exceeded at either end and hence we here adopt the polynomial fit as it gives the higher correlation and would not generate a discontinuity if annual mean B fell below 3 nT.
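Equation (1) with the quoted best-fit coefficients can be evaluated directly. The sketch below is illustrative only: the function names are introduced here, and the choice to raise an error for IDV(1d) below the base level β (where the power law is undefined for real B) is an assumption rather than anything specified in the text.

```python
CHI, BETA, ALPHA = 3.469, 1.393, 0.420  # best-fit coefficients quoted in the text

def imf_from_idv(idv_nt, chi=CHI, beta=BETA, alpha=ALPHA):
    """Annual-mean IMF B (nT) predicted from annual-mean IDV(1d) (nT) via Eq. (1)."""
    if idv_nt < beta:
        # Assumed handling: Eq. (1) has no real solution below the base level.
        raise ValueError("IDV(1d) is below the base level beta")
    return chi * (idv_nt - beta) ** alpha

def idv_from_imf(b_nt, chi=CHI, beta=BETA, alpha=ALPHA):
    """Inverse of Eq. (1): the IDV(1d) value that corresponds to a given B."""
    return beta + (b_nt / chi) ** (1.0 / alpha)

for idv in (BETA, 5.0, 10.0, 15.0):
    print(idv, round(imf_from_idv(idv), 2))
```

By construction B falls smoothly to zero as IDV(1d) approaches β, so this form has no cut-off of the kind implied by the linear fit near 3 nT.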
The grey bands in Figs. 5 and 6 show the uncertainty ranges of these fits at the 2σ level derived by a standard Monte Carlo technique. The fits are carried out a very large number of times, N (in fact N = 100 000 was used), and each time each of the IDV(1d)′ values is perturbed by an individual error which is randomly selected from the normal distribution shown in Fig. 4. The B′ values are similarly perturbed by a factor to allow for the variability in the factor $f = B/[B\sin^4(\theta/2)]$; these factors are selected by a random number generator to obey the distribution shown in Fig. 3. The B′ values are also perturbed by a small second error randomly selected from a normal distribution of standard deviation σ = 0.1 nT to account for observational error in B. For each IDV(1d) value the distribution of the N fitted B values is investigated and the 5 and 95 percentiles (the 2σ points) of that distribution are used to define the uncertainty band in fitted B for a given IDV(1d). These uncertainty bands are shaded grey. The red and blue lines in Figs. 5 and 6, respectively, are the medians of all N fits made with random errors introduced. Note in Fig. 6 that this yields a slightly lower base-level IDV(1d) of about β = 0.7 nT in the case of the polynomial fit. The best-fit base-level value is caused by what has been termed in the past "viscous-like" solar wind-magnetosphere interactions.
Fig. 6. The blue line is the best polynomial fit and the grey area defines the 2σ uncertainty band on this fit, derived using a Monte Carlo technique, allowing for the uncertainties introduced by the IMF orientation factor $\sin^4(\theta/2)$ and the experimental uncertainties in both the IMF and the IDV(1d) index. The linear correlation coefficient of fitted and observed parameters is r = 0.947. The error bars are based on the data coverage in the year and are taken from a Monte Carlo study in which the effect of data gaps was quantified by inserting them at random into the continuous IMF data available after 1996.
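The Monte Carlo procedure can be sketched as below. This is a simplified illustration and not the authors' code: it uses a linear least-squares fit for brevity rather than the Nelder-Mead polynomial fit, draws f from a Gaussian with the quoted mean and standard deviation, uses a hypothetical value for the IDV(1d) error spread (the Fig. 4 distribution is not tabulated in the text), and runs on made-up data points.

```python
import random
import statistics

F_MEAN, F_SIGMA = 3.251, 0.369  # orientation-factor statistics quoted in the text
SIGMA_B = 0.1                   # nT, 1-sigma observational error in B (from the text)
SIGMA_IDV = 0.5                 # nT, HYPOTHETICAL stand-in for the Fig. 4 distribution

def linear_fit(x, y):
    """Ordinary least-squares intercept and slope of y against x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def monte_carlo_band(idv, b, grid, n_trials=2000, seed=0):
    """(5th percentile, median, 95th percentile) of fitted B at each grid IDV."""
    rng = random.Random(seed)
    fitted = [[] for _ in grid]
    for _ in range(n_trials):
        # Perturb each IDV value by an individual random error.
        idv_p = [x + rng.gauss(0.0, SIGMA_IDV) for x in idv]
        # Perturb each B value by the orientation-factor scatter and by
        # a small additive observational error.
        b_p = [y * rng.gauss(F_MEAN, F_SIGMA) / F_MEAN + rng.gauss(0.0, SIGMA_B)
               for y in b]
        c0, c1 = linear_fit(idv_p, b_p)
        for k, g in enumerate(grid):
            fitted[k].append(c0 + c1 * g)
    band = []
    for values in fitted:
        q = statistics.quantiles(values, n=20)  # 19 cut points at 5 %, 10 %, ..., 95 %
        band.append((q[0], q[9], q[18]))
    return band

# Hypothetical annual means (nT), for illustration only.
idv_obs = [6.0, 7.5, 8.0, 9.0, 10.5, 11.0, 12.5, 14.0]
b_obs = [4.1, 5.0, 5.4, 5.9, 6.6, 6.9, 7.6, 8.3]
grid = [6.0, 10.0, 14.0]
band = monte_carlo_band(idv_obs, b_obs, grid)
for g, (lo, med, hi) in zip(grid, band):
    print(g, round(lo, 2), round(med, 2), round(hi, 2))
```

The same loop structure applies to the polynomial form; only the fitting step inside each trial changes.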
The uncertainty band grows very large at the lowest IDV(1d), especially for the non-linear fit. This is because the N fits diverge considerably around this value given it is not well constrained by the data. However, this is not a great concern here as 2009, within the fit period, is the lowest IDV(1d) value in the composite (since 1845). However, it does mean that even if good proxy data for IDV(1d) became available for the Dalton or Maunder minima, then the associated uncertainty in any reconstructed B would be large. Similarly, the spread of uncertainties grows somewhat larger at IDV(1d) values larger than seen during the space age. This is relevant because such larger IDV(1d) values were observed during cycle 19, before the start of the space age. Uncertainties in the reconstructed IMF at the peak of cycle 19 are larger as a result.
In theory, if we have accounted for all the uncertainties correctly, then only 10 % (5 % in each tail of the distribution) of all the data points in Figs. 5 and 6 should lie outside of the 2σ values delineated by the grey areas. This is not the case. For the linear fit, 10 out of the 32 valid annual means (that meet the 50 % data coverage criterion) lie outside, which is 32 %. For the non-linear fit, this falls to 7 of the 32 (22 %). This tells us that there is at least one factor influencing the fit other than those we have accounted for (which are the IMF, its measurement error, the IMF orientation factor and the experimental uncertainty in the IDV(1d) index). There are a number of possibilities: large differences between the IMF impacting the spacecraft and the Earth, influence of other solar wind parameters, solar EUV and/or thermospheric winds and densities modulating ionospheric conductivities and currents, measurement errors in IDV (remember the errors in IDV(1d) were computed assuming IDV is error-free) and data gaps in the interplanetary data. Although the effect of the last of these has been minimised by piecewise removal of geomagnetic data during such gaps, thereby ensuring only simultaneous data are used in annual means, there is still a subtle problem because the effect of a data gap depends on the time of year (and, in the case of short gaps, the time of day) at which it occurs. There are (semi)annual and UT variations in the geomagnetic activity response to a given set of interplanetary conditions due to the effects of Earth's dipole tilt. If we have full data coverage, these variations are not a factor as they are averaged out in annual means. However, data gaps mean they will have an effect, depending on the UT and time of year at which those data gaps occur. To simulate this, the ratio of annual means of B and Bp (the predicted value from IDV(1d) using the best-fit polynomial regression given by Eq. 1) was evaluated for the continuous interplanetary data after 1995 but with data gaps synthetically introduced at random into both data sets in such a way as to reproduce the observed distribution of gap durations in the OMNI2 data set. Repeating this many (100 000) times over allows statistical evaluation of the uncertainty in Bp caused by gaps in the IMF data, as a function of the total data coverage. Using the observed coverage, uncertainties due to this effect can be assigned to annual means and these are shown by the error bars in Fig. 6. Allowing for these error bars, 27 of the 30 (90 %) are consistent with the grey band and this meets the 2σ design criterion.
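The gap-resampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for this paper: the function name `simulate_gap_ratio`, the input arrays `b_obs` and `b_pred`, and the list `gap_lengths` (standing in for the observed OMNI2 gap-duration distribution) are all hypothetical, and for brevity the ratio is formed over the whole input series rather than year by year.

```python
import numpy as np

def simulate_gap_ratio(b_obs, b_pred, gap_lengths, coverage, n_trials=1000, seed=0):
    """Monte Carlo spread of mean(b_obs)/mean(b_pred) when random gaps are imposed.

    b_obs, b_pred : equal-length arrays of continuous (gap-free) samples
    gap_lengths   : gap durations (in samples, each >= 1) to draw from
    coverage      : target fraction of samples retained, 0 < coverage <= 1
    """
    rng = np.random.default_rng(seed)
    b_obs = np.asarray(b_obs, dtype=float)
    b_pred = np.asarray(b_pred, dtype=float)
    n = b_obs.size
    n_remove = int(round((1.0 - coverage) * n))
    ratios = np.empty(n_trials)
    for k in range(n_trials):
        keep = np.ones(n, dtype=bool)
        # Add gaps at random start times until the target coverage is reached.
        while n - keep.sum() < n_remove:
            length = int(rng.choice(gap_lengths))
            start = int(rng.integers(0, n))
            keep[start:start + length] = False
        # One mask is applied to both series, so only simultaneous data are averaged.
        ratios[k] = b_obs[keep].mean() / b_pred[keep].mean()
    return ratios
```

The standard deviation of the returned ratios, evaluated over a grid of `coverage` values, would give an uncertainty estimate as a function of data coverage in the spirit of the error bars described above.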
Lastly, we note that we did not retain any data to act as an independent test of our regression. This was because we did not want to exclude recent data in deriving the fit since it includes the recent low minimum which extends the range of the correlated data considerably, nor did we want to use the early interplanetary data as a test set because it has greater uncertainties and more data gaps. Tests of the regression will be available in a few years' time as new data accrue. We note that there are reasons to expect the current decline in solar activity to continue (Barnard et al., 2011), such that the next solar minimum may provide a test of data that is outside the range covered by the present data.

Reconstruction of the IMF since 1845
These fits can then be used with the IDV(1d) composite to reconstruct the IMF B and compute the uncertainty in that reconstruction. The fact that IDV(1d) is a homogeneously constructed index (unlike IDV or m) is very important because the regression uncertainty analysis presented above can reasonably be applied to the past data as well as the space-age data. This is not true if errors have changed because the type and quality of the geomagnetic data used have changed. Because it yields a higher correlation, we concentrate on the polynomial fit, but we also briefly show that results are very similar for the linear fit. The best-fit reconstruction is obtained by applying the best-fit polynomial line shown in Fig. 6 with the best composite in annual means of IDV(1d) shown in black in Fig. 11 of Part 1 (which allows for the drift in geomagnetic latitude of the stations). Minima/maxima are obtained by applying the lower/upper limit of the uncertainty band of the fit shown in Fig. 6 to, respectively, the upper/lower composite limits shown in Figs. 13 and 14 of Part 1. This means that the uncertainty band for the reconstructed B contains both the uncertainty arising from the fit of IDV(1d) with B and the uncertainty associated with joining the data from the three magnetometer stations into a single composite (as carried out in Part 1). Note these uncertainties do not include those associated with any erroneous drifts in the IDV(1d) values from the stations used in the geomagnetic activity composite. However, Part 1 shows that comparison with other stations (STP, NER and BAR) indicates very low errors in the earliest data (1850-1862) and the very high agreement with the many stations contributing to IDV after 1880 shows even lower errors. Hence the most uncertain part of the reconstruction is 1863-1879 when corroborating evidence is very sparse and usually of lower quality than the Helsinki magnetometer data used in the composite.
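As an illustration of the best-fit step, the polynomial relation quoted in the abstract, B = χ · (IDV(1d) − β)^α with χ = 3.469, β = 1.393 nT and α = 0.420, can be evaluated as in the sketch below. The function name `reconstruct_b` and the choice to return NaN for IDV(1d) below β are assumptions made for this example; the uncertainty band is not reproduced here, since it requires the limiting fits and composite limits described above.

```python
import numpy as np

# Best-fit coefficients quoted in the abstract for B = chi * (IDV(1d) - beta)**alpha
CHI = 3.469
BETA = 1.393   # nT
ALPHA = 0.420

def reconstruct_b(idv1d, chi=CHI, beta=BETA, alpha=ALPHA):
    """Return IMF B (nT) from annual-mean IDV(1d) (nT).

    Values of IDV(1d) below beta lie outside the domain of the fit and
    are returned as NaN (an assumption of this sketch).
    """
    idv = np.asarray(idv1d, dtype=float)
    excess = np.where(idv >= beta, idv - beta, np.nan)
    return chi * excess ** alpha
```

For example, `reconstruct_b([3.0, 10.0])` returns the reconstructed B for two hypothetical annual means of IDV(1d).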
The best-fit reconstruction of B using the polynomial fit is the black line shown in Fig. 7. The grey area surrounding this black line is the 2σ uncertainty band associated with this, as discussed above. Table 2 gives the annual mean values of the composite IDV(1d) and reconstructed B and their uncertainties. The red line in Fig. 7 shows the result of using the linear fit, and as expected, it is very similar to the results of the polynomial fit for the observed range of IDV(1d) over the interval of the reconstruction. The green line shows the SC10 reconstruction. The blue dots show the annual means of the IMF observations. Note that these lie outside of the uncertainty band of the fitted IDV(1d) more often than in Fig. 6: this is because all annual means are shown in Fig. 7, irrespective of whether they meet the 50 % data coverage requirement or not. The peak correlation for the polynomial fit is r = 0.947 (r² = 0.896), meaning that just 10 % of the variation in annual means of IMF B is not reproduced by the simultaneous IDV(1d) value (and 8.1 % of this is the unavoidable uncertainty introduced by the IMF orientation factor f).

Discussion and conclusions
We have used the new composite of geomagnetic activity described in Part 1 to reconstruct the interplanetary magnetic field variation between 1845 and 2012. By piecewise removal of data from one data series corresponding to gaps in the other, by allowing for the secular change in the geomagnetic latitude of the stations used, and by using a polynomial fit, we can reproduce 90 % of the variation in annual means of IMF B using the simultaneous IDV(1d) data; of the unexplained part, 8.4 % is due to the unavoidable error associated with the IMF orientation factor. Thus the fit is very close to being as good as it can theoretically be. The above compares to the 71 % obtained by SC05 for a linear fit to their version of IDV and the 84 % obtained by SC10 for their version of IDV that employs more stations. Thus using the IDV(1d) composite we can reconstruct the IMF variation back to 1846 with some considerable confidence. Because the composite was constructed as homogeneously as possible, we can carry out a full uncertainty analysis that applies to historic data as well as modern data. Uncertainties (allowing for both the fit uncertainties and the effect of joining data series to generate the composite) have been estimated using a Monte Carlo technique. Uncertainties given are at the 2σ level.
The reconstructed IMF is in excellent agreement with that derived from IDV by SC10 after 1880. This is not surprising as Part 1 shows that IDV(1d) and IDV are very similar indeed over this interval. However before 1880 there are considerable differences and we here find smaller amplitude cycles and lower mean values than SC10. As discussed in Part 1, the Russian stations give strong support to the IDV(1d) composite, and hence the B reconstruction presented here, for   In reviewing this paper, L. Svalgaard (referee communication, 2013) has argued that the Helsinki H values, and hence IDV(1d), are too low by about 30 % between 1866 and 1873 (covering the peak of solar cycle 11) and argues that this should be corrected. This raises a very important discussion point. The argument for a correction is based on comparing the range of the mean diurnal variations with the group sunspot number. We strongly disagree, as a matter of principle, that group sunspot number should be used to correct the geomagnetic data in this way, for it will cause the geomagnetic data to lose independence from the sunspot data and the former will inevitably follow the latter. Indeed we are extremely concerned that this sort of correction may have been applied in the past to other geomagnetic data because, although the correlation between average diurnal range and group sunspot number is high (typically 0.9 for stations at the latitudes used to generate the IDV(1d) index), we note that agreement can be persistently poor (differences exceeding 50 %) in some solar cycles. In terms of this specific example, one has to ask what could have changed in 1866 and then (exactly) reversed in 1873, such that the error was only present between these two dates? The Helsinki data were compiled using the same instrumentation and procedures throughout. The same error would have reduced the range Ak(H ) variation derived by Nevanlinna (2004) and Fig. 
1 of that paper shows that although it is a few percent lower than the linearly regressed aa index in these years, the difference is roughly the same in magnitude as in other years. As shown in Part 1, IDV(1d) is considerably smaller than the Bartels u index in this interval, but Bartels himself considered u to be unsatisfactory before 1872. Physically, Svalgaard argues that such a correction is valid because the average diurnal variation range is set by thermospheric winds driven by EUV heating and that EUV mainly emanates from active regions and hence the range of the diurnal variation depends only on the group sunspot number. However, thermospheric winds, globally, are also modulated by auroral energy deposition (which has a UT variation, for example caused by dipole tilt effects, introducing diurnal variation at a given location) and for the higher latitude stations used in IDV(1d) (chosen to keep n in BV^n near zero) there is an influence of auroral currents which will also have a diurnal variation. This suggests the sunspot number or sunspot group number is very unlikely to be an adequate metric on which to base calibration. To test this, we have derived the range of the average diurnal variation of H values from Nurmijärvi (close to the Helsinki site) and compared them to sunspot number and group sunspot number derived from the Greenwich/USAF data. The correlations are very good (near 0.9) but there are considerable deviations. Specifically, for solar cycles 19 and 23, agreement is almost perfect, for 21 and 22 it is very good, but for cycle 20 it is very poor (the regressed sunspot number is over 50 % higher than the observed average diurnal range). The same test applied to Eskdalemuir generated almost identical results. Using the international sunspot number instead of the group number, the difference for cycle 20 is not as great but deviations persisting over a sunspot maximum of up to 25 % are found. 
We conclude that sunspot numbers should not be used to correct or evaluate the geomagnetic data (because corrections can be in error and because the data sets would no longer be independent). We therefore do not apply any correction to the Helsinki data, either here or in Part 1. The only valid tests of IDV(1d) in solar cycle 11 (and hence of the IMF constructed from it) would come from IDV(1d) values from stations nearby to Helsinki (such that they have the same response to solar wind and IMF parameters). Tests using proxy geomagnetic data (such as diurnal range) should be avoided because, as for the group sunspot data, there are considerable uncertainties.
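The per-cycle comparison between regressed sunspot number and average diurnal range described above can be sketched as follows. This is a hedged illustration only: the function name `per_cycle_deviation`, the use of a single ordinary least-squares line fitted over all cycles, and the input arrays are assumptions of this example and are not taken from the analysis reported here.

```python
import numpy as np

def per_cycle_deviation(diurnal_range, sunspot, cycle):
    """Percent by which the sunspot-regressed range exceeds the observed range,
    averaged over each solar cycle.

    diurnal_range : annual means of the diurnal range of H
    sunspot       : annual sunspot (or group sunspot) numbers
    cycle         : solar cycle number assigned to each annual value
    """
    y = np.asarray(diurnal_range, dtype=float)
    x = np.asarray(sunspot, dtype=float)
    cycle = np.asarray(cycle)
    # Least-squares line y ~ slope * x + intercept fitted to all years together.
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    deviation = {}
    for c in np.unique(cycle):
        sel = cycle == c
        deviation[int(c)] = 100.0 * (predicted[sel].mean() - y[sel].mean()) / y[sel].mean()
    return deviation
```

A high overall correlation does not prevent one cycle from showing a large, persistent deviation in the returned dictionary, which is the behaviour at issue in the argument above.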
The range of annual means of the best-fit reconstructed B during the full interval (1845-2012) is 4.21-9.34 nT (allowing for the estimated uncertainties this could be 3.34-10.72 nT) whereas the range observed from in situ data during the space age (here meaning 1966-2012) is 3.90-9.20 nT. The higher maximum in the reconstructed B is because the space age data do not cover the most active sunspot cycle known to us, which is cycle 19. The minimum is set for both intervals by the year 2009 during the "exceptional" low minimum between cycles 23 and 24 (Lockwood, 2010). Figure 7 shows that the difference between the linear and polynomial fits is not great over the interval of the reconstruction and the linear fit always remains within the uncertainty range of the polynomial fit. An important point about these ranges of variation is that the reconstructed IMF does not extend back into the Dalton minimum in sunspot activity (ca. 1790-1820), let alone the deeper Maunder minimum (ca. 1650-1700). The relationship between IMF and sunspot number is a complex one: the peak correlation for space age data is for R^0.1 and this gives a peak correlation of r = 0.823 (r² = 0.677), so sunspot numbers can, at best, only explain 68 % of the variation in the IMF. The extreme limitations in trying to use sunspot number to directly predict IMF are exposed by the fact that linear regression with the optimum R^0.1 variation yields essentially zero IMF in even the Dalton minimum.
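The search for the power of the sunspot number R that gives the peak correlation with annual-mean B can be illustrated with the sketch below. The function name `best_exponent` and the exponent grid are hypothetical choices for this example, which makes no claim to reproduce the procedure actually used.

```python
import numpy as np

def best_exponent(r_sunspot, b_imf, exponents=np.arange(0.05, 1.55, 0.05)):
    """Return (n, r): the exponent n on the grid that maximises the Pearson
    correlation r between b_imf and r_sunspot**n."""
    r_sunspot = np.asarray(r_sunspot, dtype=float)
    b_imf = np.asarray(b_imf, dtype=float)
    corr = np.array([np.corrcoef(r_sunspot ** n, b_imf)[0, 1] for n in exponents])
    i = int(np.argmax(corr))
    return float(exponents[i]), float(corr[i])
```

The fraction of variance explained is the square of the returned correlation; for the value quoted above, 0.823² ≈ 0.677.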
Models of the open solar flux have used non-linear functions of sunspot number to quantify the emergence rate (Solanki et al., 2000, 2002; Vieira and Solanki, 2010), but the open flux loss rate has been related to the current sheet tilt by Owens et al. (2011), which means it evolves over the solar cycle. Furthermore, the effect of solar wind speed on the Parker spiral means that the near-Earth IMF B is not linearly related to the open solar flux (Lockwood et al., 2009a, b). Therefore the simple linear relationship between near-Earth IMF B and sunspot number proposed by SC10 must be simplistic. Nevertheless, the models do predict a monotonic relationship, such that the range of B predicted over a solar cycle falls when the average sunspot number falls. Therefore it seems inconceivable to us that the lower average sunspot numbers seen during the Dalton minimum, and the even lower numbers seen during the Maunder minimum, did not produce lower annual mean values of B than were measured in 2009. Thus we do not agree that the fact that 2009 gave the lowest annual mean B since 1846 means that this is any form of meaningful minimum or "floor" value, as has been proposed by SC10. This is not to say that there is not a minimum B below which the IMF would never, in practice, fall (unless the solar dynamo somehow switches off completely). SC07's initial floor estimate of a minimum near-Earth IMF of 4.6 nT in annual means was actually 0.6 nT larger than was subsequently observed just two years later. SC10's subsequently revised estimate of 4.0 nT means that the IMF would never go lower than was seen during the recent solar minimum, even though the surrounding solar cycles are more active than a grand solar minimum (such as the Maunder minimum) or even a less-deep minimum (such as the Dalton minimum). Cliver and Ling (2011) have produced an estimate of about 2.8 nT in annual means, but because this is obtained by extrapolation it has considerable uncertainties. 
This estimate is still somewhat larger than 1.80 ± 0.59 nT in 22 yr means derived for the end of the Maunder minimum from cosmogenic isotopes by Steinhilber et al. (2010). The numerical modelling by  predicts that the signed open solar flux in the Maunder minimum fell to an average of about F_S = 0.5 × 10^14 Wb which, using the non-linear variation of Lockwood et al. (2009a) and Lockwood and Owens (2011), corresponds to a B of about 2.2 nT. Hence there is some convergence in these estimates of the likely minimum annual mean B, but note that annual values from the predictions of  consistently fall to 0.5 nT during the Maunder minimum. Furthermore, the modelling of  gives some real insight as to what sets the minimum value. These authors postulated that a base-level CME rate persisted throughout the Maunder minimum at the rate seen during the recent low solar cycle minimum, giving a base-level emergence rate of open flux that persists even when sunspot numbers fall to zero over a prolonged period. Using this postulate, they were able to explain not only why cosmogenic isotopes continue to cycle during the Maunder minimum, but also why the phase of these oscillations is shifted by 180°. Hence we do not consider it at all unlikely that there is a value of the annual mean IMF which is very unlikely ever to be undercut; however, we do think that no minimum seen during the interval 1845-2012 sets a meaningful minimum estimate for times of lower solar activity.
Lockwood et al. (1999) noted that means of the open solar flux over the solar cycle doubled over the 20th century before a decline that began in about 1956. Because this caused some debate (e.g. Svalgaard and Cliver, 2005; Lockwood et al., 2006a, b), it is worth investigating how the present reconstruction compares with that finding. The open solar flux, F_S, derived using kinematic correction of Lockwood et al. 
(2009b)  (2) Annual means of F_S cross correlate with the IDV(1d) composite with correlation coefficient r of 0.816 (r² = 0.667, S = 95.82 %). Applying the same fit procedure as used previously for B allows us to reconstruct F_S, but with considerably larger uncertainties due to the lower correlation: the maximum value of λ derived is 1.25 and the minimum value is 0.44 (at the 2σ level), with a most probable value of 0.8. The uncertainty is so large because of the great sensitivity of λ to the value of [<F_S>_11yr]_1903 and the relatively poor correlation between IDV(1d) and F_S. This would suggest that the doubling (λ = 1) may be a slight overestimate but is well within the uncertainty band. However, we note that Parker's spiral theory predicts this simple correlation will give an underestimate because the rise in solar wind speed deduced between 1903 and 1956 will have caused the average spiral configuration of the field to unwind, i.e. the value of B for a given F_S will have fallen. Allowing for this effect we find the IDV(1d) values give λ ≈ 1.3, meaning the doubling was an underestimate, but still within the uncertainty band.
It should be noted that all geomagnetic reconstructions only give us information on the near-Earth heliosphere, i.e. in the ecliptic plane. The Ulysses result tells us that the radial component of the IMF applies to all heliographic latitudes and we can use this to compute open solar flux. However, the different solar wind speeds out of the ecliptic and the latitudinal effect in Parker's spiral theory mean that the field magnitudes at other latitudes will not be the same as derived within the ecliptic plane. Because cosmic ray fluxes at Earth can be modulated by the heliospheric field at all latitudes, this point must be remembered when comparing geomagnetic reconstructions with the measured abundances of cosmogenic isotopes; it means that although general agreement is expected, it will not always be perfect.
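The dependence of field magnitude on solar wind speed and heliographic latitude for a fixed open solar flux can be illustrated with an ideal Parker spiral, as in the sketch below. This is an assumption-laden simplification and not the method of this paper: it takes |B_r| = F_S / (2πr²) at all latitudes, uses an assumed solar rotation rate of 2.7 × 10⁻⁶ rad s⁻¹, and omits the kinematic correction of Lockwood et al. (2009b), so it is not expected to reproduce the B values quoted here. The function name `parker_b` is hypothetical.

```python
import numpy as np

AU = 1.496e11        # m, heliocentric distance of Earth
OMEGA_SUN = 2.7e-6   # rad/s, assumed approximate solar rotation rate

def parker_b(f_s, v_sw, lat_deg=0.0, r=AU, omega=OMEGA_SUN):
    """Field magnitude (T) for signed open flux f_s (Wb) and solar wind speed
    v_sw (m/s) at heliographic latitude lat_deg, for an ideal Parker spiral.

    Assumes the radial field |B_r| = f_s / (2*pi*r**2) is the same at all latitudes.
    """
    b_r = f_s / (2.0 * np.pi * r ** 2)
    # Tangent of the spiral (garden hose) angle: azimuthal over radial field.
    tan_psi = omega * r * np.cos(np.radians(lat_deg)) / v_sw
    return b_r * np.sqrt(1.0 + tan_psi ** 2)
```

In this idealisation a faster solar wind unwinds the spiral and lowers B for a given F_S, and B falls towards |B_r| at high latitudes, which is the qualitative behaviour invoked in the text.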
Part 3 (Lockwood et al., 2013b) of this series will use a new range index composite to also reconstruct B from 1845 to present, but using a variant of the method of  which uses the 27 d recurrence to eliminate the effect of solar wind speed. , have adapted the method of Lockwood et al., 1999, to predict IMF B rather than open solar flux.) This reconstruction will also allow the solar wind speed to be reconstructed over the same interval, given that the range indices have a dependence on BV^n, with n near 2 (as shown in Fig. 1).