Automatic identification of magnetic clouds and cloud-like regions at 1 AU: occurrence rate and other properties

Abstract. A scheme is presented whose purpose is twofold: (1) to enable the automatic identification of an interplanetary magnetic cloud (MC) passing Earth from real-time measurements of solar wind magnetic field and plasma quantities ("prediction" mode), and (2) to identify MCs on the ground from data after collection ("detection" mode). In the prediction mode the scheme should be applicable to data from a spacecraft upstream of Earth, such as ACE, or to data from any near-real-time field and plasma monitoring platform in the solar wind at or near 1 AU. The initial identification of a candidate MC-complex is carried out by examining proton plasma beta, the degree of small-scale smoothness of the magnetic field's directional change, the duration of a candidate structure, thermal speed, and field strength. In a final stage, there is a test for large-scale B-field smoothness within the candidate regions identified in the first stage. The scheme was applied to WIND data over the period 1995 through mid-August 2003 (i.e. over 8.6 years) in order to determine its effectiveness in identifying MC passages of any type (i.e. N⇒S, S⇒N, all S, all N, etc.). (N⇒S refers to the $B_Z$ component of the magnetic field going from north (+) to south (−) in GSE coordinates.) The distribution of these MC types for WIND is provided. The results of the scheme are compared to WIND MCs previously identified by visual inspection (called MFI MCs), with relatively good agreement in the sense of capturing a large percentage of MFI MCs, but at the expense of finding a large percentage of "false positives". The scheme is shown to be able to find some previously ignored MCs among the false positives. It should be effective in helping to identify in real time most N⇒S MCs for magnetic storm forecasting. The N⇒S type of MC is expected to be most prevalent in solar cycle 24, which should start around 2007. The scheme is likely to be applicable to solar wind measurements taken from well within 1 AU to well beyond it.

Keywords. Interplanetary physics (Interplanetary magnetic fields; Solar wind plasma) – Magnetospheric physics (Solar wind-magnetosphere interactions)


Introduction
The importance of interplanetary magnetic clouds (MCs) to the study of geomagnetic activity has been known for many years (e.g. see Burlaga, 1995), and because of the usual characteristics of these large structures, e.g. relatively strong magnetic field intensity, such activity is often major. Also, because of their specific properties, especially their size, axial inclination, and field handedness (Rust, 1999; Zhao and Hoeksema, 1998), MCs can often be related to distinct solar events (e.g. Gopalswamy et al., 1998; Webb et al., 2000). In particular, MCs are well associated with Coronal Mass Ejections (CMEs) in coronal images and with disappearing filaments (e.g. Rust, 1994; Bothmer and Schwenn, 1994; Gosling, 1997; Berdichevsky et al., 2002). With few exceptions, and depending on the chosen level of qualification, interplanetary MCs are observed to be large magnetic flux ropes in the solar wind (Marubashi, 1986; Lepping et al., 1990), but of a special kind; for a discussion of various kinds of magnetic flux ropes and their models see Priest (1990). Strictly speaking, a MC was originally defined empirically in terms of in-situ spacecraft measurements of magnetic fields and thermal plasma in the interplanetary medium. That is, it is a region in the solar wind having: (1) enhanced magnetic field strength, (2) a smooth change in field direction as observed by a spacecraft passing through the MC, and (3) low proton temperature (and low proton plasma beta) compared to the ambient proton temperature (Burlaga et al., 1981; Burlaga, 1988, 1995). MCs are also known to evolve (e.g. Osherovich et al., 1993; Bothmer and Schwenn, 1998; Berdichevsky et al., 2003), and they are tacitly understood to be large structures, so that their durations are long, usually between about 10 and 48 h at 1 AU, averaging about 21 h, although some durations have been as short as 5 h. This feature of relatively long duration is to be part of our explicit definition of a MC. See Lepping et al. (2003) for an average MC profile at 1 AU in terms of basic scalar quantities, such as field magnitude, density, proton thermal speed and proton plasma beta, based on actual WIND observations over several early years of the mission. Also see Bothmer and Schwenn (1998), Mulligan et al. (1998), and Lepping and Berdichevsky (2000) for other properties of MCs, including some of their quantitative variations from the active to the quiet part of the solar cycle, based on observations from many spacecraft.
We are concerned here with developing an automatic and objective scheme for identifying MCs. As pointed out by Shinde and Russell (2003), when various independent research groups attempt to identify interplanetary coronal mass ejections (ICMEs, including MCs) in any given set of solar wind data over, say, a several-year period, there is often disagreement on even the total number of events, much less agreement on the exact start/end times for each event; see, e.g. Gosling (1990, 1997) on the defining properties of CMEs and/or ICMEs and Gopalswamy et al. (1998) for some ideas on the relationship between CMEs and MCs. There are probably many reasons for disagreements among independent lists of MCs (and probably similarly for ICMEs). Some examples are the following: (1) the willingness of some, but not others, to allow unusually short-duration structures in their definition, (2) some fraction of events with distant spacecraft encounters (i.e. distant from the MC's axis), making identification difficult, (3) disagreement on what minimum limit to place on the average field intensity, and (4) even psychological factors, such as the identifier becoming fond of MCs and therefore identifying more and more of them as time progresses, relative to others, or the opposite tendency of developing "higher standards" as time goes on. In the last case the identifier progressively learns the "true" character of a MC and becomes stricter in identification as time progresses, and hence finds fewer and fewer MCs relative to other identifiers. This is to say that the necessary objectivity in MC (or ICME) identification has often been lacking. We therefore try to rectify this by developing an objective scheme to identify MCs, or at least magnetic cloud-like regions, based on our experience with past MCs found from "visual inspection" and model testing. A similar scheme could probably be used for identifying ICMEs also, provided other physical quantities are examined (e.g. solar wind composition, Forbush-like decreases (Forbush, 1938), etc.), as well as those quantities considered here.
The original purpose for the development of this MC automatic identification scheme was to assist in geomagnetic storm forecasting under special conditions. Specifically, we were concerned with predicting in real time the latter part of a MC (say the latter ≈1/3 of it) from the early part for cases with a North-to-South (N⇒S) structure, i.e. where the $B_Z$ component of the interplanetary magnetic field (IMF) within the MC goes from positive to negative (e.g. in a GSE coordinate system; see Mish et al., 1995). This would be a predictive mode of the scheme. The N⇒S type of MC is expected to be most prevalent in solar cycle 24, which should start around 2007, extrapolating from the predictions of Bothmer and Rust (1997). However, as we will see, as early as 2000 there appears to have been a slight increase in the frequency of N⇒S types. This appears to be consistent with a complex field polarity reversal of the Sun over the years 2000-2002, as the profile of the sunspot number around this time, showing a double peak (see Fig. 1), seems to indicate. However, such double peaks or broad peaks in the sunspot number are not uncommon (and not easy to interpret). For example, such an apparent double peak also occurred around 1990.
A MC prediction scheme requires an understanding of typical MC characteristics and, at a minimum, the availability of an identification scheme such as the one developed here (or a similar one) for use in real time. However, in the present work we are mainly concerned with being able to identify a MC objectively and automatically (by computer) when field and plasma data are available before, during, and after the MC from post data-collection. We refer to this as the detection mode of the identification scheme. For the real-time predictive mode of MC identification, we are faced with the more difficult task of identification with only part of the MC available, say approximately the first 2/3 of it. An earlier successful attempt at developing a means of interplanetary flux rope detection was by Shimazu and Marubashi (2000). Since these authors were strictly looking for flux ropes, they examined various aspects of only the magnetic field. For example, they did not quantitatively examine the proton temperature or proton plasma beta associated with the candidate structure, as we must for MCs. They also considered only a small number of parameters in their identifications. Another difference is that we are aiming to develop a MC identification program for use in an eventual prediction scheme, as pointed out, but we stress that the scheme must also provide consistency of identification in a detection mode. Chen et al. (1996, 1997) have also been concerned with forecasting (generally strong) geomagnetic storms in near real time, based on a probabilistic feature-classification technique as applied to the solar wind upstream of Earth. WIND data were used in some of the testing of the technique, showing relatively good results. This technique is applicable to a large variety of solar wind structures without regard to the specific nature of the structure, such as a MC, much less the type of MC. However, the authors do demonstrate the importance of MC applications. Only magnetic field data were used as solar wind input in describing the technique, although the authors state that it can be extended to include plasma quantities, such as density and speed.
We start by discussing some properties of MCs based on our previous best attempts to find MCs by visual inspection in the WIND magnetic field and plasma data (see a description of the WIND/MFI and SWE investigations by Lepping et al., 1995 and Ogilvie et al., 1995, respectively), with emphasis on consistency of their properties, starting with MC type (along with average vector profiles) and duration. We then briefly discuss the MC parameter fitting model of Lepping et al. (1990). All of these will play a role in the automatic identification of a MC, although the fitting model's role is indirect. Then we define a multi-stage automatic identification scheme and discuss its testing. By type here we simply mean the obvious vector field profile of the MC resulting from various MC axial inclinations during passage, such as with respect to the ecliptic plane and to the $X_{GSE}$ axis. Examples are a $B_Z$-profile of the magnetic field that is North to South (denoted N⇒S here), as mentioned above, or S⇒N, or S⇒N (but mostly S), etc. Sometimes the $\theta_B$-profile is examined instead of the $B_Z$ component, where $\theta_B = \sin^{-1}(B_Z/|B|)$ and $|B|$ is the magnitude of the magnetic field. The GSE coordinate system is generally employed, only because the MC structure should be considered in a fixed coordinate system (i.e. approximately fixed on the time-scale of 1 or 2 days). The GSM system (e.g. Mish et al., 1995), or a similar system, should be used in any detailed comparison of parts of a MC to geomagnetic effects, a further stage of the process in the prediction mode.
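As a minimal sketch of the latitude definition above, the following Python function converts GSE field components into magnitude and direction angles. The function name, the longitude convention (measured from $X_{GSE}$ toward $Y_{GSE}$, folded into 0-360°), and the clamping of the arcsine argument are illustrative assumptions, not part of the scheme as described.

```python
import math

def field_angles(bx, by, bz):
    """Return (|B|, theta_B, phi_B) from GSE components (nT); angles in degrees.

    theta_B = asin(B_Z / |B|) follows the definition in the text.
    phi_B (assumed convention): angle from X_GSE toward Y_GSE, in [0, 360).
    """
    b = math.sqrt(bx * bx + by * by + bz * bz)
    if b == 0.0:
        raise ValueError("zero field magnitude: direction is undefined")
    # Clamp to guard against round-off pushing the ratio just outside [-1, 1].
    ratio = max(-1.0, min(1.0, bz / b))
    theta = math.degrees(math.asin(ratio))
    phi = math.degrees(math.atan2(by, bx)) % 360.0
    return b, theta, phi
```

A purely southward field gives a latitude of −90°, and a field with equal positive $B_X$ and negative $B_Z$ gives −45°.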

Types of magnetic cloud profiles
Although there are intrinsic reasons to discriminate among MC types by examining their $B_Z$-profiles for N⇒S, or S⇒N, etc., such discrimination is also important because of our eventual interest in solar wind-magnetosphere coupling (e.g. Zhang and Burlaga, 1988). Specifically, a MC participating in magnetic reconnection with the front magnetosphere is of interest. In particular, knowing whether a southern field region will exist in a MC, and in what portion of the MC this region will occur, is of concern. Accordingly, in Table 1 we define 10 different MC categories, to be chosen qualitatively. (We stress, however, that the automatic identification scheme, described below, does not depend directly on such subjective analysis.) For strong, long-duration MCs, categories 1, 5, 11, 12, 13, and 14 are, to various degrees, those expected to be most geoeffective with regard to electromagnetic coupling of the MC with the magnetosphere. Figure 2 shows the distribution of the 82 WIND MCs in terms of the categories of Table 1.
Clearly category 11 (S⇒N) has been most prevalent during the first 8.6 years of the mission; it occurred 35% of the time. This is followed by categories 4 (all N, 15%) and 1 (N⇒S, 15%), the latter, along with category 5, being of most concern in the prediction mode.
The type and polarity of an erupted solar flux rope at the Sun, and of the associated MC, are expected to be directly related to the polarity of the Sun's overall magnetic field, as suggested by Bothmer and Rust (1997) (also see Mulligan et al. (1998)), and to vary according to the solar cycle number (i.e. even or odd) and phase. In particular, the S⇒N type should be most prevalent in solar cycle 23 (from 1996 to about 2007), and the N⇒S type of MC is expected to be most prevalent in solar cycle 24. In light of this, we examine MC "type" by year for the first 8.6 years of the WIND mission, covering most of solar cycle 23. Table 2 shows that, indeed, the S⇒N type was most prevalent through most of solar cycle 23, but the number of N⇒S MCs (shown in the first column, not counting the year column) clearly increases in frequency of occurrence over the last 3.6 years compared to the first 5 years of the mission. From the table it appears that, as N⇒S types increase, S⇒N types remain somewhat steady in frequency (i.e. for years 2000-2003). For completeness we show columns 3 and 4, which give all-N types and all-S types of MCs, respectively. We point out that the specific data character of the MC type for columns 1 and 2 will be determined mainly by the azimuthal field of the MC, but that of the MCs in columns 3 and 4 is determined mainly by the MC's axial field, under normal circumstances, simply due to the nature of the MC's geometry. The last column, referring to the total number of MCs, shows a distinct early growth in the number of all types and approximate stabilization later, except for the contrary dip in year 1999.

Durations of magnetic clouds at 1 AU
Important both to the identification of a MC (and strictly part of its definition) and to $B_Z$ predictions within the structure is consideration of a typical MC duration at 1 AU. See Fig. 3 for distributions of WIND MC durations for 82 cases representing the first 8.6 years of the mission. We split up the cases according to their quality ($Q_0$), taking into consideration how well a flux rope model satisfies the observed MC's field (see Sect. 2.3). See Appendix A for a definition of $Q_0$ and its three levels (the third level is $Q_0=3$ for poor quality). For the good/fair quality combination ($Q_0=1,2$) the distribution is approximately normal, with an average of 22 h and a most probable value of 19 h. Hence, a typical MC center-time should be about 10 h after the estimated start-time. This fact should be helpful in attempting to identify actual MCs from candidate MC regions that otherwise may be too long or too short to be believable. Also, if we are to examine ≈2/3 of a typical MC in the prediction mode, we are aiming at 2/3×(20 h) ≈13 h of data. For the MC identification scheme we chose a minimum allowed duration of 8 h based on the center distribution ($Q_0=1,2$) shown in Fig. 3; this figure also shows that a disproportionate number of the $Q_0=3$ (poor) cases occur at quite short durations. Note that only 6 of the 82 WIND MCs had durations shorter than 8 h, and one of these was apparently due to a distant passage of the spacecraft. Hence, only ≈6% (the 5 legitimate cases, three of which have $Q_0=3$) fall into this very short-duration category. Notice, however, that all three distributions in Fig. 3 are relatively broad (with the $Q_0=3$ set being very skewed).

Magnetic cloud model fitting procedure: some background
The MC fitting model of Lepping et al. (1990) is used in part of the study for getting background information, especially where the fitting of the "average" MC is concerned; this model is based on ideas expressed earlier by Burlaga (1988). (Other MC fitting techniques have been used over recent years with varying degrees of success; see e.g. Riley et al. (2004).) This is a cylindrically symmetric local model, which uses Bessel functions for fitting the axial ($J_0(\alpha r)$) and azimuthal ($J_1(\alpha r)$) components of the field of the MC's assumed flux rope, where the radial component (i.e. perpendicular to the MC's axis) is zero everywhere; the scaling factor $\alpha$ is constant in our model. See Goldstein (1983) and Lundquist (1950) for background information on this form of the flux rope solution, and Marubashi (1986, 1997) for aspects of the geometry and origin of these structures. This model gives net helical fields on cylindrical shells of different pitches according to distance from the axis. We assume that most MCs at 1 AU have such helical fields, even if only to some rough approximation (see Lepping and Berdichevsky (2000)). This is to say that we have greater confidence in the classification of a structure as a "magnetic cloud" when we can perform a reasonably successful flux rope fitting to a field structure in the solar wind. But strictly speaking a MC does not have to possess a flux rope structure according to the original definition (Burlaga, 1988, 1995).

Fig. 3. [...] and 3=poor). The center set is for the combination of good and fair sets, the set on the right is for poor quality cases, and the set on the left is for the total number of cases.

Use of the fitting model has been important, occasionally, in confirming suspected MCs after their candidate cases were found through visual inspection by members of the WIND/MFI team. The resulting successful set of MCs is then called the "MFI set," composed of $N_{MFI}=76$ cases, for our purposes, as discussed in Sect. 2.2. (A summary of the results of the analysis of the first 8.6 years of WIND MCs, mostly identified by visual inspection, is given by Lepping et al., 2005.) This is relevant because the results of the automatic identification scheme (described below) will be compared to the MFI set. First, in order to measure the quality of the MC fit, a "reduced" chi-squared measure of the fit is calculated (i.e. $\chi^2/(3N-n)$, where $N$ is the number of field averages (usually 15 or 30 min long) used, and $n=5$ is the number of parameters in this part of the fit), along with other parameters that consider symmetry and reasonableness. The chi-squared parameter is dimensionless, since the magnetic field was unit normalized up to this point; strictly speaking, $|\chi_R| \equiv (\chi^2/(3N-n))^{1/2}$ is displayed. The full set of 7 fitted parameters is:
- $B_0$, the MC's axial field intensity;
- $H$, the handedness of the field twist within the MC;
- $R_0$, the radius of the MC;
- $\phi_A$, $\theta_A$, the longitude and latitude of the MC's axis (GSE coordinates), respectively;
- $t_0$, the MC's center time; and
- $Y_0$, the closest approach (CA) distance, which is usually given in terms of $Y_0/R_0$ (often called the impact parameter), sometimes expressed as a percentage.
- f flag, the convergence flag: the fitting process either did converge (=OK) or did not (=NOT).
The last 5 parameters, excluding the flag (i.e. $R_0$, $\phi_A$, $\theta_A$, $t_0$, and $Y_0$), are the $n=5$ considered in the reduced chi-squared fit process. Note that we choose the boundaries of the cloud such that the magnetic field becomes purely azimuthal there, i.e. where $\alpha r=2.4$ (then $r=R_0$) in the Bessel functions. "Quality" of the fit depends on the ten quantities described in Appendix A. We stress that this model is not used directly in the automatic MC identification scheme. But without use of the model (or some model) we would not be able to develop such a quantitative means of judging quality for MCs. The model is also useful in helping to find unifying background information on MCs, i.e. any unique MC properties. We looked for such unique features by creating average magnetic field profiles from carefully selected superimposed MCs of good quality ($Q_0=1$ or 2), delineated according to type (N⇒S, S⇒N, etc.) and handedness ($H$=R or L). These average profiles were then fitted by the Lepping et al. (1990) model and are shown in Fig. 4. Each of the fittings was based on 25 points (averages), i.e. the averaging interval was 1/25th of the full duration in each case. The directions of the fields in all four combinations were fitted very well, but the magnitudes were not as well modeled, which is typical of this model.
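The field form and the fit-quality measure described above can be illustrated with a short Python sketch. It evaluates the axial and azimuthal components $B_0 J_0(\alpha r)$ and $H B_0 J_1(\alpha r)$ with $\alpha = 2.4/R_0$, and the displayed quantity $(\chi^2/(3N-n))^{1/2}$. The function names, the power-series evaluation of the Bessel functions, and the encoding of handedness as ±1 are assumptions made for illustration; this is not the fitting procedure of Lepping et al. (1990) itself, only the functional form it fits.

```python
import math

def bessel_j(order, x, terms=30):
    """Power-series value of J_order(x) for integer order >= 0.

    Adequate for the small arguments used here (0 <= x <= 2.4).
    """
    half = x / 2.0
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k) * half ** (2 * k + order) / (
            math.factorial(k) * math.factorial(k + order)
        )
    return total

def flux_rope_field(r, b0, r0, handedness=1, alpha_r0=2.4):
    """Return (axial, azimuthal, radial) field at distance r from the axis.

    handedness: +1 or -1 (assumed encoding of right/left-handed twist).
    alpha_r0:   value of alpha*r at the boundary r = R_0 (2.4 in the text).
    """
    x = (alpha_r0 / r0) * r
    b_axial = b0 * bessel_j(0, x)
    b_azimuthal = handedness * b0 * bessel_j(1, x)
    return b_axial, b_azimuthal, 0.0  # radial component is zero everywhere

def reduced_chi(chi2, n_points, n_params=5):
    """Square root of the reduced chi-squared, chi2 / (3N - n)."""
    dof = 3 * n_points - n_params
    if dof <= 0:
        raise ValueError("need 3N > n for a reduced chi-squared")
    return math.sqrt(chi2 / dof)
```

On the axis the field is purely axial with magnitude $B_0$; at $r=R_0$ the axial component is close to zero, so the field is almost purely azimuthal, consistent with the boundary choice described in the text.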
As an example of the application of the Lepping et al. (1990) fitting technique see Table 3, which provides MC parameters for the event of 3-4 April 1995; the quantities "Check(%)" and "ASF(%)" (asymmetry factor) are defined in Appendix A. Figure 5 shows the results of the model fitting vs. actual field observations for this case. This event is emphasized because it was discovered by the identification program developed here, not because it is exceptional in any way, except for having a low speed of 301 km/s. It has a quality assessment of $Q_0=2$. As usual, the $\theta_B$ and $\phi_B$ profiles are reasonably well fit by this model, but $|B|$ is less well fit. Note that the peak in the model's $|B|$ is well centered. The observed B-magnitude profile was typical in the sense that it had a high intensity in the early part (i.e. approximately the first 1/2) and a low intensity in the latter part, compared to the model (see Lepping and Berdichevsky, 2000). Also, the observed $|B|$ was somewhat low, apparently due to the relatively large CA ($=Y_0/R_0$) of 0.71. There was no upstream shock, probably because of the MC's slow speed.

Fig. 4. Shown are magnetic field profiles of superimposed MCs (solid curves) of good quality (where there are no $Q_0=3$ cases), vs. percent-duration, separated according to type (N⇒S, S⇒N, etc.) and handedness ($H$=R for right-handed or L for left-handed). Data are rendered in GSE coordinates, in terms of (in order from top to bottom in each frame): magnetic field longitude ($\phi$) and latitude ($\theta$), field components (Z, Y, X) and field magnitude ($|B|$). The top set are the N⇒S cases, of special concern to us here, and the bottom set are the S⇒N cases. The dotted curves are model-fitted results from the Lepping et al. (1990) MC model. The values of N at the top left of each of the $\phi$-frames show the number of MCs that went into each set's average profile.

The identification scheme as part of the prediction scheme
Since the N⇒S MC part of the identification scheme is planned to be part of a prediction process for magnetic storms, i.e. for storm intensity (measured by $D_{st}$) and timing, it is important that the identification program used in that way be placed in context. We briefly explain that context here. The N⇒S prediction program will consist of five stages: (1) identifying the proximity of a cloud-complex, i.e. the early part of a MC and the immediate upstream region, and determining the MC's type (N⇒S or S⇒N, etc.), as described in Sect. 4.0, below. Then for the N⇒S type of MC (see Fig. 6 for an example of a N⇒S MC occurring on 4-5 March 1995, which produced a magnetic storm): (2) finding, relatively accurately, only the front boundary of the MC, using finer scale data than those used in identifying the cloud-complex, (3) estimating the MC's "center time," (4) predicting $VB_Z$ at minimum $B_Z$ and its occurrence time within the MC (based on these earlier findings), and finally, (5) estimating the associated $(D_{st})_{Min}$, based on reliable $(B_Z)_{Min}$ (or $(VB_Z)_{Min}$) vs. $D_{st}$ relations (e.g. Burton et al., 1975; Tsurutani and Gonzalez, 1997; Wu and Lepping, 2002, 2005), as well as its occurrence time. By contrast, we attempt to find accurate rear boundary times as well when in the detection mode only, since such estimates are possible in this mode, but they are expected to be more difficult to estimate than the front boundary-times, as experience has shown.

Fig. 6. Profiles of magnetic field and plasma parameters for the N⇒S MC of 4-5 March 1995, in terms of (from top to bottom): $\chi^2$ of a quadratic fit to latitude of the field ($\theta_B$), running average of proton plasma beta ($\beta$) and dotted curve representing its running average, $D_{st}$, magnetic field in terms of magnitude, latitude ($\theta_B$) and longitude ($\phi_B$) in GSE coords., induced electric field ($VB_S$), $B_z$ of the field in GSE, $\varepsilon$ (see Akasofu, 1981), proton plasma thermal speed ($V_{Th}$), bulk speed ($V$), and number density ($N_P$). The formula for $D_{st}$ MIN in the $D_{st}$ panel (Wu and Lepping, 2005) is used to estimate the min value of −84 nT, which is in good agreement with the observed $D_{st}$ at min. The gray horizontal bar in the top panel represents the scheme's identification of the extent of this MC candidate.
Again, in the case of the 4-5 March 1995 MC the thermal speed ($V_{Th}$, panel 10 in Fig. 6) was low on average, and very low in the central region. This is not uncommon for interplanetary MCs at 1 AU and will be utilized in helping us to automatically identify MCs, as we will see in Sect. 4.1. Referring to Fig. 6, $\chi^2$, based on $\theta_B$ variation (to be described in Sect. 4.1), is an indicator of the relative smoothness of the field's latitude change; low $\chi^2$ means a smooth change, as expected for a MC. The slope of $\theta_B$ in the early part of the MC (i.e. over the first 6 h of the MC) is negative, indicating a N⇒S type of event. The average $|B|$ is 11.2 nT and the average speed is 447 km/s, as shown in the panel. Density ($N$) across the MC had a typical average for MCs at 1 AU (≈11/cc), but it also had a (typically) irregular profile, which is not very helpful in MC identification (e.g. Lepping et al., 2003), and therefore density has not been used for MC identification. As expected, proton plasma beta ($\beta$) was very low throughout the MC. The bulk speed ($V$), which is usually very regular and often uniformly decreasing, indicating MC expansion, was not so here, and is therefore not of general use for an automatic identification scheme. Figure 6 shows that during the MC $-B_z$, $VB_s$, and $\varepsilon$ (see Akasofu, 1981) are relatively large, indicating that significant but moderate geomagnetic activity is expected around that time, consistent with the observed min-$D_{st}$ of −90 nT, which we modeled to be −84 nT (Wu and Lepping, 2002, 2005). Some of these MC properties will help guide us in developing an automatic scheme to identify such structures in the solar wind.
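The slope test mentioned above (a negative $\theta_B$ slope over the first 6 h indicating a N⇒S event) can be sketched as a least-squares line fit. The function names and the two-way labeling are illustrative assumptions; in particular, treating every positive slope as S⇒N is a simplification, since the categories of Table 1 also include all-N and all-S profiles that a slope alone cannot distinguish.

```python
def early_slope(times_h, theta_deg, window_h=6.0):
    """Least-squares slope (deg/h) of theta_B over the first `window_h` hours.

    times_h:   sample times in hours, increasing, starting at the MC front.
    theta_deg: field latitude theta_B in degrees at those times.
    """
    t_start = times_h[0]
    pts = [(t, th) for t, th in zip(times_h, theta_deg) if t - t_start <= window_h]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two samples in the early window")
    mean_t = sum(t for t, _ in pts) / n
    mean_th = sum(th for _, th in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0.0:
        raise ValueError("all samples share the same time")
    sxy = sum((t - mean_t) * (th - mean_th) for t, th in pts)
    return sxy / sxx

def label_from_slope(slope):
    """Assumed simplification: sign of the early slope gives the B_Z sense."""
    if slope < 0:
        return "N=>S"
    if slope > 0:
        return "S=>N"
    return "undetermined"
```

For a latitude that falls steadily from northward to southward values, the slope is negative and the label is N⇒S.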
Finally, we do not suggest that only MCs cause magnetic storms, but MC-caused storms can more easily accommodate IMF prediction schemes, because of the relatively smoothly changing fields within a MC, strong $|B|$, and the MC's usually large size. MCs are also often associated with the most intense storms (e.g. Tsurutani et al., 1999), making studies of this kind of causal process more compelling.

The automatic MC identification scheme
We split the analysis into two main phases: (1) to find good MC candidates based on basic, but short time-scale, MC characteristics and (2) later to test these candidates for long time-scale field variations, i.e. on the scale of a typical MC duration.In this manner, since the latter is more computationally intensive, we are applying it to a much reduced portion of field data, i.e. those regions assessed to be good candidates found from (1).And we also use longer-averaged data in the long time-scale test, thus also saving computer time.

First phase of the scheme: finding good MC candidates
The scheme to find a candidate MC region is carried out in six steps. As pointed out, it is important to aim for consistency in identifying MCs, but more important is arriving at a sensible identification scheme, one faithful to observations. We attempt this by using our past experience in identifying these structures in the solar wind (Lepping et al., 1990 and Lepping and Berdichevsky, 2000) and by adhering to the original definition by Burlaga et al. (1981) (also see Burlaga (1995)). In applying the scheme we pass through a relevant physical data set (quantities defined below) one analysis interval of fixed length $\Delta t$ at a time, moving by a small step-size ($\delta t$) at each step. The MC identification is based on the following requirements, which apply within the MC's extent: the proton plasma beta must be low, the average magnetic field strength ($|B|$) must be relatively high, the field directional changes must be smooth (based on consideration of the latitude, $\theta_B$, of the field), the region of interest must have some minimum duration ($\Delta T$), the average proton thermal velocity ($\langle V_{Th}\rangle$) must be low, and the maximum field directional change across the region must be greater than some lower limit-value ($\Delta\theta_{B,L}$). The last criterion is needed so that model-fitting could be possible (or at least conceivable). In all cases where relative measures are made, they are made with respect to typical solar wind values. The identification scheme is given in quantitative terms below, with first-trial values given for each of the relevant free parameters, all of which are adjustable. These first-trial values should not be considered optimal or final.
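The windowed pass described above, an analysis interval of fixed length advanced by a small step, can be sketched as a running mean. This is a minimal illustration that assumes evenly sampled data with no gaps; the function name and the default window and step (in samples) are hypothetical choices.

```python
def running_mean(values, window=30, step=1):
    """Mean of each `window`-sample interval, advancing `step` samples at a time.

    Returns one mean per complete interval; incomplete trailing intervals
    are dropped. Assumes evenly spaced samples without data gaps.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    means = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        means.append(sum(chunk) / window)
    return means
```

With 1 min samples, `window=30` and `step=1` correspond to 30 min intervals advanced by 1 min; real data would also need a policy for gaps and bad points, which is not addressed here.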
Steps in the prediction criteria: 1.The running averages of step size 1 min of proton plasma beta (<β P >), based on analysis intervals ( t) of 30 min each, must be small, i. e., <β P >≤0.3 (≡ <β P > L ).The step size of 1 min was convenient, because the data set from which these were taken were based on a 1 min average rate, but testing lead to a preference for this rate also.For example, 5 min steps were also tried with less satisfying results.
2. The direction of the magnetic field must change slowly.Specifically, χ 2 's for quadratic fits of θ B (latitude) of the field, based on 1 min averages over 30 min running intervals ( t), are examined.Only low values (i.e.χ 2 ≤ 450 (≡χ 2 L )) are accepted.(It was shown that the χ 2 for neither the field's longitude, A , nor its cone angle β CA is a good discriminator of the cloud region, where β CA is the angle between the cloud's axis and X GSE , i.e. cos β CA =cos φ A cos θ A ). (χ 2 should not be confused with χ 2 R in MC parameter-fitting described in Sect.2.3.) 3. The duration of the candidate MC must be at least 8 h long, so T ≥8 h.Regions satisfying 1, 2, and 3 are designated the "black" region, which is examined further in terms of the following absolutes.
4. The average of the magnetic field magnitude (<|B|>) across the black region must be ≥8.0 nT.Below we will refer to this particular choice of values (or limits) for the parameters t, <β P > L , χ 2 L , T, <|B|> MIN , <V T h > MIN , and θ B,L , as the "Strict" set, and we call the parameter variables themselves "identification test-parameters."Those black regions satisfying (4), ( 5), and (6) are designated "gray"; see Figs. 6 and 7 for examples where the regions of the candidate MCs are denoted by the gray horizontal bar regions where the Strict criteria were used.Concerning Fig. 7, some features of interest within the MC are: a linearly decreasing V (except near the end), a not uncommon asymmetric magnetic field magnitude, of moderate strength, a low proton plasma β, which at all times is well below 1.0 and increases toward the end where χ 2 also increases.Most MCs show linearly decreasing V indicating expansion (Lepping and Berdichevsky, 2000).Notice that the increases of χ 2 and β near the beginning and end of the candidate MC are just slightly outside of the MFI MC interval.This is not uncommon and gives an indication of the limits of agreement between these two methods of MC identification.

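5. The average proton thermal velocity (<V_Th>) over the full duration of the black region must be ≤ 30 km/s. (The central 1/3 of the black region was chosen for a later trial, because investigation of the MFI set of MCs showed that V_Th is occasionally elevated near the MC's boundaries, making the full-duration average an unreliable indicator of the presence of a MC. The middle 1/3 was more consistent with low values of V_Th. See panel 10 of Fig. 6 (V_Th) for a qualitative indication of this point.)
6. The latitudinal difference angle of the magnetic field, Δθ_B (≡ (θ_B)max − (θ_B)min), must be ≥ 45° (≡ Δθ_B,L), where (θ_B)max and (θ_B)min refer to the maximum and minimum values of the latitude of the field anywhere within the black region.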
Another set of criteria was also tried and applied to the full 8.6 years of WIND data. This set is referred to as the "Loose" set; see Table 4 for the identification test-parameters for both the Strict and Loose sets. This nomenclature is a reminder that the Strict set (of tighter requirements) is expected to result in a smaller number of MC candidates than the Loose set, which, in fact, was the case. This first test may yield any type of MC (N⇒S, S⇒N, etc.), provided the candidate event is a MC. Notice that these candidates still lack one last test: we must check for smoothness of the field's directional variation on a scale consistent with that of a reasonable candidate flux rope, i.e. at a lower frequency than was considered so far, which was on the basis of 25 to 30 min intervals. We take on this challenge in Sect. 4.3 below, but we first wish to test how close we are, at this stage, to identifying with this scheme the MFI MCs found through visual inspection and confirmed with the help of the MC parameter fitting analysis described in Sect. 2.3.
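The first-stage tests (criteria 1 to 6 with the Strict limits) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' program: the function and array names are hypothetical, the inputs are assumed to be gap-free 1 min averages, χ² is assumed to be the sum of squared residuals (in deg²) of the quadratic fit, and a passing 30 min window is assumed to flag every minute it covers.

```python
import numpy as np

# Strict-set limits quoted in the text; how chi^2 is formed is an assumption.
WINDOW_MIN = 30            # analysis interval (Delta t), in 1-min samples
BETA_LIMIT = 0.3           # <beta_P>_L
CHI2_LIMIT = 450.0         # chi^2_L
MIN_DURATION_MIN = 8 * 60  # T >= 8 h, in minutes


def quadratic_chi2(theta_window):
    """Sum of squared residuals of a quadratic fit (assumed chi^2 definition)."""
    t = np.arange(theta_window.size, dtype=float)
    coeffs = np.polyfit(t, theta_window, 2)
    return float(np.sum((theta_window - np.polyval(coeffs, t)) ** 2))


def black_regions(beta, theta_b, window=WINDOW_MIN):
    """Return (start, end) sample indices (end exclusive) of 'black' regions."""
    ok = np.zeros(beta.size, dtype=bool)
    for i in range(beta.size - window + 1):
        sl = slice(i, i + window)
        low_beta = beta[sl].mean() <= BETA_LIMIT
        smooth = quadratic_chi2(theta_b[sl]) <= CHI2_LIMIT
        if low_beta and smooth:
            ok[sl] = True  # assumption: a passing window flags all its minutes
    regions, start = [], None
    for i, flag in enumerate(np.append(ok, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= MIN_DURATION_MIN:  # criterion 3: duration >= 8 h
                regions.append((start, i))
            start = None
    return regions


def is_gray(b_mag, v_th, theta_b):
    """Criteria 4 to 6 applied to the samples inside one black region."""
    strong_field = b_mag.mean() >= 8.0                     # nT
    cool_protons = v_th.mean() <= 30.0                     # km/s
    large_rotation = theta_b.max() - theta_b.min() >= 45.0  # degrees
    return bool(strong_field and cool_protons and large_rotation)
```

Replacing the module-level limits with the Loose values of Table 4 would give the second parameter set; those values are not restated here.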

Results of the first phase for the detection mode
Of the 82 MFI MCs obtained by visual inspection, not all could be compared against the criteria used here; e.g., candidates must be at least 8 h in duration, and a few MFI cases violated this. Also, the interplanetary manifestations of the Bastille Day events (i.e. those occurring on 14 through 16 July 2000; e.g. see Lepping et al., 2001) were not on-line and not easily used in this statistical study, and were therefore excluded from consideration. Hence, only 76 MCs (of the MFI set) were used in the %-comparisons. The start/end times for the full 82 MCs are provided on the WIND/MFI Website at the URL http://lepmfi.gsfc.nasa.gov/mfi/mag_cloud_pub1.html. The listing also provides the estimated quality (Q_0) for each MC according to the Lepping et al. (1990) model and Appendix A.
Figure 8 (top) presents the results of applying the scheme to all types of MCs, in a "pie chart" representation, in terms of the degree of agreement with the earlier visually identified MCs. That is, the scheme's results (prior to any test of long time-scale reasonableness) are compared with the 76 MFI MCs; Fig. 8a gives results for the Strict identification test-parameters and Fig. 8b gives results for the Loose set. All cases (including Fig. 8c) are expressed in terms of an "agreement" (with MFI), a failure, or a false positive. (A false positive refers to a program-identified candidate "MC" that was not part of the MFI set; this does not necessarily mean that it was not a correct identification of a MC/solar ejectum. Also, for an agreement it is sufficient that the front boundaries estimated by the two methods agree within several hours; Fig. 7 gives an example of such a small displacement in estimated start-times (and end-times in this case) between the two methods.) From Fig. 8 (top) we see, as expected, that the Strict set (A) gives many false positives (orange region, 59%), and it also does not recover a very large percentage of MFI MCs (red, 59%). That is, it does not find a very satisfying number of agreements and has a large number of false positives. The criteria were too strict to obtain many good agreements. At the opposite extreme, using the Loose set (B), we see many more MC candidates: many agreements (left-side red) along with still many false positives. That is, in going from Fig. 8a to Fig. 8b we obtain a distinctly larger, 88% rate of agreement while having to accept only a small increase in false positives, from 59% to 68% of all cases found (right-side orange). Also, in going from Fig. 8a to Fig. 8b the total number of MC candidates found by the program went from N_P = 111 to N_P = 211, i.e. from 41% to 32% agreements with MFI cases (right-side red). We take the increased number of false positives from Strict to Loose as an acceptable addition, since it is expected that some false positives will be dismissed later when further editing is done (Sect. 4.3). So the Loose set of criteria is judged to be the better set to use for automatic detection of MCs, based mainly on the importance of getting high agreement with the MFI set, as shown on the left side of Fig. 8b. We do not claim that the Loose set is the ultimate or optimum set. Since MCs have many unique characteristics, it will not be easy to find an optimum set of automatic selection criteria for MCs generally.

Fig. 7. Magnetic field and plasma data in the same format as that of Fig. 6, showing an example of a S⇒N MC profile, the 10-11 October 1997 case. This was a good quality, Q_0 = 1, case. The program provided the first candidate black bar (first panel; see text) and the second-level candidate gray bar (at the top of the first panel). The vertical solid and dotted lines indicate the estimated start and end times of the gray-bar region, respectively, which are in reasonably good agreement with the times estimated earlier for this MC by visual inspection (MFI MC set); the second panel shows the MFI-estimated MC interval.
The bottom part of Fig. 8 (panel c) gives results for only the N⇒S types of MCs, using only the Loose criteria since, as we saw in Fig. 8b, the Loose criteria provided a larger percentage of agreements with (or recovery of) the MFI cases. On the left of Fig. 8c we see that there were 83% agreements, comparable with the 88% for all types. Likewise, there were 61% false positives and 39% agreements among all candidate MCs found by the scheme, also comparable with, but slightly better than, the results for the full set of MCs (Fig. 8b, right side).
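The percentages quoted for the Loose set follow directly from the event counts given in the summary (67 of 76 MFI MCs recovered among 211 candidates; 15 of 18 N⇒S MCs recovered among 38 candidates). A minimal sketch of that arithmetic, with hypothetical function names:

```python
def percent(part, whole):
    """Percentage rounded to the nearest integer, as quoted in the text."""
    return round(100.0 * part / whole)


def agreement_summary(n_agree, n_mfi, n_found):
    """Summarize one pie-chart pair: recovery of MFI MCs and candidate makeup."""
    return {
        "recovered_pct": percent(n_agree, n_mfi),          # left side of Fig. 8
        "agreement_pct": percent(n_agree, n_found),        # right side, red
        "false_positive_pct": percent(n_found - n_agree, n_found),  # right side, orange
    }


loose_all = agreement_summary(n_agree=67, n_mfi=76, n_found=211)
loose_ns = agreement_summary(n_agree=15, n_mfi=18, n_found=38)
```

With these counts, `loose_all` reproduces the 88%, 32%, and 68% values, and `loose_ns` the 83%, 39%, and 61% values.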
Finally, we should point out that occasionally the MC identification scheme can find an actual MC that was overlooked by visual inspection. As mentioned, a good example of this is the N⇒S MC of 3-4 April 1995 (see Fig. 5 and Table 3). So some "false positives" found by the identification scheme are not necessarily failures of the scheme to find legitimate MCs; they may indicate failures of the earlier visual inspection method. Also, some of the false positives, which may not be bona fide MCs, may still be the remnants of solar transient events worthy of further consideration.

Editing of MC candidates based on large-scale magnetic field variation
As we have seen, it was necessary to "open up" the candidate MC criteria to the Loose set in order to approximate the number of MCs found by visual inspection (MFI MCs). This provided a larger than expected number of "false positive candidates," but some of these are expected to fail when the magnetic field within the candidate MC is tested for smoothness on a large scale, i.e. a scale consistent with a typical MC's duration. So we require a compromise in our choice of criteria: (1) Loose criteria were required to capture enough MC candidates to approximate the number of MFI MCs, at the cost of the greater number of false positives (of the two sets of criteria considered), but (2) relatively strict criteria are required in our choice of parameters for editing out candidate MCs (from the false positives) on the basis of large-scale B-variation. Our choice for this editing is to fit, across the entire candidate MC, the three field components (in GSE coordinates) separately to a simple polynomial. As testing shows, a more complicated function seems to be unnecessary, since we are not accounting for any of the specific parameters defined in Sect. 2.3, such as the estimated closest approach distance of the spacecraft, Y_0/R_0, the axial direction (φ_A, θ_A), R_0, etc. It turns out after many trials that a quadratic form is sufficient for this fitting. Then the separate chi-squared values (χ²_x, χ²_y, χ²_z) of the quadratic fits to the field components are combined to form a Pythagorean mean, χ²_M, of the χ²_j for j = x, y, z. Note that if any one of the component χ²_j values is large, χ²_M will be large, and for χ²_M to be small, all three terms must be small; these are desired features. We further normalize χ_M (= √χ²_M) by the average field magnitude across the MC (<B>) to obtain χ_M/<B>. That is, we are concerned with examining relative fluctuation levels.
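A rough sketch of this large-scale smoothness measure is given below. The exact form of the Pythagorean mean is not reproduced here, so the sketch assumes that each χ²_j is the mean squared residual of the quadratic fit and that χ²_M is the arithmetic mean of the three component values; both are assumptions, as are the function and argument names.

```python
import numpy as np


def component_chi2(t, b):
    """Mean squared residual of a quadratic fit to one field component
    (assumed definition of chi^2_j)."""
    coeffs = np.polyfit(t, b, 2)
    return float(np.mean((b - np.polyval(coeffs, t)) ** 2))


def chi_m_over_b(t, bx, by, bz):
    """Relative fluctuation level chi_M / <B> across one candidate MC.

    t is time (any consistent unit); bx, by, bz are the averaged GSE
    field components in nT over the candidate interval.
    """
    # Assumption: chi^2_M is the arithmetic mean of the three chi^2_j.
    chi2_m = np.mean([component_chi2(t, comp) for comp in (bx, by, bz)])
    b_avg = np.mean(np.sqrt(bx ** 2 + by ** 2 + bz ** 2))  # <B>
    return float(np.sqrt(chi2_m) / b_avg)
```

Under these assumptions a single poorly fitted component raises χ²_M, and all three components must be well fitted for the ratio to be small, which matches the behavior described in the text.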
We then use the computed value of χ_M/<B> to separate good (low ratios) from suspicious MC candidates. The idea here is that, since only the quadratic and lower frequency terms are used, consistent with a typical MC profile, poor fits represent deviations from smoothness in the field's directional variation, now considered on the larger scale of the cloud itself. Since only large-scale considerations are being tested here, we use 15-min averages to compute the χ²_j values. Because this somewhat time-consuming process would be prohibitive if applied to the full 8.6 years of data, we apply it only to the gray bar regions found in Sect. 4.1. Before doing so, a separator value (or lower limit for bad values) for χ_M/<B> must be obtained. The average and standard deviation (σ) of χ_M/<B> for the 76 MFI cases were 0.25 and 0.087, respectively. We try a limit for χ_M/<B> consisting of its average + 2σ, which gives a limit-value of 0.42. We notice that all values of χ_M/<B> for this set are lower than, or equal to, 0.42, except for three, and these three were of low quality, Q_0 = 3. We also examine a year's worth (1997) of ordinary interplanetary field data, as a control set, to ascertain each day's value of χ_M/<B>, and find that very few days (that were free of obvious solar ejecta) had values of χ_M/<B> lower than 0.42. For the control set the average of χ_M/<B> was 0.67 with a σ of 0.17. Hence, we choose (χ_M/<B>)_L = 0.42 as the separator value between good and bad cases of MC candidates (i.e. gray bar regions from Sect. 4.1) with respect to this large-scale smoothness criterion. Figure 9 shows three histograms of χ_M/<B>: one for the MCs as ascertained from the MFI set (solid lines), one for the set found by our automatic identification scheme (dashed lines, discussed below), and finally one for the 1997 control set (dotted lines), for comparison. It is clear that there is very little overlap of χ_M/<B> between the control set and the other two sets, and that, therefore, (χ_M/<B>)_L = 0.42 is a good separator, where, strictly speaking, cases with (χ_M/<B>) ≤ 0.420 are the good ones. (Notice that this choice of separator attempts to keep a very large percentage of MFI MCs, at the possible sacrifice of dropping fewer false positives than we could have with a smaller (χ_M/<B>)_L.) We now apply this requirement to the magnetic field in all of the gray bar regions of WIND data.
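The choice of separator is a short piece of arithmetic on the statistics quoted above, sketched here with hypothetical names. The last quantity, the distance of the separator from the control-set mean in units of the control σ, is not stated in the text and is computed only as an illustration.

```python
def separator(mean, sigma, n_sigma=2.0):
    """Upper limit for acceptable chi_M/<B>: average plus n_sigma deviations."""
    return mean + n_sigma * sigma


# MFI set statistics quoted in the text: average 0.25, sigma 0.087.
limit = separator(0.25, 0.087)     # 0.424, quoted as 0.42
limit_quoted = round(limit, 2)

# Control set (1997) statistics quoted in the text: average 0.67, sigma 0.17.
control_offset_sigma = (0.67 - limit_quoted) / 0.17


def passes_large_scale_test(ratio, limit=0.42):
    """True for candidates kept as MC-like (ratio at or below the separator)."""
    return ratio <= limit
```

The separator sits roughly 1.5 control-set standard deviations below the control mean, which is consistent with the small overlap between the histograms described for Fig. 9.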
Recall that after application of the criteria of Sect. 4.1 to the full 8.6 years of WIND data we obtained 211 gray bar (serious candidate) regions. For these regions we find that χ_M/<B> was on average 0.30 (compared to 0.25 for the MFI set, above) and its σ was 0.09 (compared to 0.087). So statistically this set's average ratio was similar to, but slightly higher than, that of the MFI set. More importantly, of the 211 candidate MCs, 183 (≡ N_AUTO) were acceptable and 28 were "unacceptable" (with (χ_M/<B>) > 0.420). That is, 13% of the candidates were unacceptable, and therefore they will not be retained as MCs. This is considered the final step of the discrimination process between MCs, or more strictly MC-like regions, and any other kind of solar wind data. It is interesting that only 13% are lost by using this last criterion. This seems to imply that implementing the criteria of Sect. 4.1 alone is almost sufficient to pin down solar wind structures that are magnetic cloud-like. That is, when the magnetic field is smoothly changing in direction over intervals of only 25 min each, within an event of 8 h or more in duration in the solar wind at 1 AU, and when these regions satisfy all of the other criteria of Sect. 4.1, the field is also likely to be smoothly changing on the longer scale of 20 h or so, at least to the level of variation of a quadratic fit, and therefore it appears "cloud-like." Figure 10 shows the final results of the analysis of 8.6 years of WIND data in terms of a time-dependent distribution of occurrence of MCs resulting from both this scheme (white bars of 1/4 year each) and from the earlier set based on visual inspection (dark gray bars, the MFI set); when they are equal they are shown as light gray bars. The white bars are almost always larger than the dark gray ones, with few exceptions. It is evident that the automatically identified set is much larger than the MFI set, i.e. N_AUTO/N_MFI = 183/76 = 2.4.

Fig. 9. Three histograms of χ_M/<B>: (solid) for the MCs as ascertained from the MFI set, (dashed) for the set found by our automatic identification scheme, and (dotted) for the 1997 control set, for comparison. Notice that there is little overlap of χ_M/<B> between the control set and the other two sets.

Fig. 10. An occurrence distribution of WIND MCs from visual inspection (shown by dark gray bars, the MFI MCs) with total N_MFI = 76 events, and an overlaid occurrence distribution of MCs from the automatic identification scheme (white bars), where total N_AUTO = 183 and where both sets were based on 8.6 years of WIND data. Each bar is a quarter of a year wide. The five light gray bars represent quarters when both means of choosing MCs gave an equal number. Notice that the year designations are centered at the start of each year.
As we use it, the term cloud-like is a broad one, meaning that the solar wind structure being considered appears to be a MC according to all of the tests previously used to make that determination. These tests comprise those in Sect. 4.1 and in this section, in aggregate, addressing what have been considered all of the reasonable elements of a MC's definition. Among the full set of cloud-like cases of N_AUTO = 183, preliminary analysis (using the fitting procedure of Lepping et al. (1990)) indicates that only a small subset appear to be bona fide MCs. Perhaps a more advanced MC parameter-fitting procedure is required for properly examining these cases. In any case, other users of the identification scheme may want to employ different identification test-parameters from those defined in Sect. 4.1 or in this section (e.g. running average step size, Δt, Δθ_B,L, <β_P>_L, T, ..., (χ_M/<B>)_L), or different quality criteria than those in Appendix A.
The occurrence-numbers in Fig. 10 are everywhere between 0 and 13 per quarter-year for the automatically chosen set. For the MFI set the occurrences range from 0 to 6 and correspond poorly with those of the automatically chosen set. The linear correlation coefficient between the two sets over the 35 quarter-year buckets is only 0.58. Also, as Fig. 10 shows, the identification scheme found a far greater number of MCs (or cloud-like events) in 1999 in WIND data than were in the MFI set. (However, notice that even the automatically chosen ones for the first 3/4 of 1999 occur at a much slower rate than for the two previous years.) This is the puzzling year that appeared to have a severe paucity of MCs by visual inspection. In fact, even though there are some similarities in the trends in the two histograms in Fig. 10, it is obvious that the region from just before the start of year 2000 to late-year 2001 shows a significant disagreement between the two sets and apparently plays an important role in driving down the correlation coefficient. The agreement is better from 1995 to about mid-year 1999, on average.
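The comparison of the two occurrence distributions can be sketched as a quarter-year binning followed by a linear (Pearson) correlation. This is a minimal illustration with hypothetical function names; it assumes event start times expressed as decimal years and bins beginning at 1995.0, and it does not include the actual event lists.

```python
import numpy as np


def quarterly_counts(start_times, first_year=1995.0, n_quarters=35):
    """Count events per quarter-year bucket from decimal-year start times."""
    edges = first_year + 0.25 * np.arange(n_quarters + 1)
    counts, _ = np.histogram(start_times, bins=edges)
    return counts


def linear_correlation(counts_a, counts_b):
    """Pearson correlation coefficient between two occurrence distributions."""
    return float(np.corrcoef(counts_a, counts_b)[0, 1])
```

Given the two lists of start times, `linear_correlation(quarterly_counts(auto_times), quarterly_counts(mfi_times))` would yield the kind of coefficient quoted above; `auto_times` and `mfi_times` are placeholders for the real lists.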

Summary and discussion
Our automatic MC identification scheme can be used to objectively identify MCs in a predictive (real time) mode. It can also be used to help identify MC candidate events after data collection (detection mode). But this study also addresses a few other questions about MCs that needed to be answered to help develop the prediction scheme, such as the distributions of MC "types" (N⇒S, S⇒N, etc.), average MC profiles, and aspects of MC durations. It also briefly reviewed a MC fitting model (Lepping et al., 1990) that we have been using. We concentrated on aspects of the detection mode of the scheme. (The full prediction mode for geomagnetic storms is not yet complete.) Our major findings based on WIND data for the first 8.6 years of the mission (i.e. from early 1995 to August 2003) are:
1. The percent distribution of MC types chosen by visual inspection of data (the "MFI" set) has been determined in terms of 10 possible categories (see Fig. 2 and Ta ... Bothmer and Rust, 1997; Bothmer and Schwenn, 1998; Mulligan et al., 1998). Consistent with these predictions, S⇒N types have been most prevalent since we began observing MCs near the beginning of the WIND mission. However, N⇒S types appear to be starting to occur over the 3.6 years beginning in 2000 (i.e. 12 of 18 cases, or 67% of this type; see Table 2).
3. The average profiles of many WIND MC events, for strictly N⇒S or S⇒N types of MCs separately, and further separated according to handedness (so four independent sets result), can be model-fitted with relatively good success with respect to field direction.This gives some hope of automatically predicting the latter part of a MC from the earlier part; see Fig. 4. The averaging process required putting all cases on the same time-scale, i.e. specifically on a percent-duration scale.
4. However, obtaining accurate estimates of (B Z ) Min within any specific N⇒S MC is very difficult, because MCs tend to be unique in structure.This is probably due to both the unique birth conditions at the Sun plus any particular interactions the MC has during the 1 AU passage.For example, there are, roughly speaking, three types of "N⇒S" MCs (types 1, 2, and 5 (Tables 1 and  2)), not to mention the broad spectrum of MC sizes encountered.Often neither of these facts can be accurately ascertained before the MC's end-time is observed.
5. For the good/fair quality combination (Q_0 = 1, 2; see Appendix A), the durations of WIND MCs are approximately normally distributed, with an average of 22 h and a most probable value of 19 h; see Fig. 3. Therefore, a typical MC center-time should be about 10 h after the estimated start-time. This fact should be helpful in distinguishing actual MCs from quasi-MC regions that otherwise may be too long or too short to be believable. Also, if we are to examine ≈2/3 of a typical MC in the prediction mode, we are aiming at approximately 13 h of data.
6. Automatically identifying a MC passage and its type (via a computer scheme) is possible, with some agreement with earlier (visually determined) cases. To find worthy MC candidates, the scheme uses information on: steadiness of field directional change, proton plasma β_P, V_Th, duration, and average field strength, along with the results of editing for "large-scale" smoothness in field directional change. We tested two sets of parameter-values for these physical requirements, Strict and Loose, and found that it was necessary to use the Loose set in order to optimize agreement with those MCs originally found by visual inspection (MFI set).
The results from this stage of interrogation give relatively good MC candidates.For the full set of data 88% (67 out of 76) of the MFI set were in agreement.And for the N⇒S set, 15 cases out of 18 were recovered (83%).For this high agreement rate a relatively high number of false positives resulted, being 68% and 61%, respectively, of the total number of events found by the program (N=211 and 38, respectively); see Fig. 8.
7. We stress that we found very few (only 13%) MC candidate (gray bar) regions that did not satisfy the large-scale smoothness requirement (where (χ_M/<B>) > 0.42 was considered unacceptable), given that they had already satisfied the shorter-scale smoothness criterion and the general MC criteria of Sect. 4.1. That is, it appears that if a candidate already satisfies the shorter-scale smoothness criterion and all the other required criteria of our scheme, it is very likely to be a MC or magnetic cloud-like structure, even without any testing for smooth field change over the full extent of the MC of 8 h or more. The implication of this is not clear.
8. If an apparent false positive MC (see Fig. 8) passes the additional test of (χ_M/<B>) < 0.42, we seriously consider that it may not have been a "false" candidate after all, but that it may have been an actual MC that was simply missed during the visual inspection stage for MCs.
In fact, some such cases were found in the WIND data, the 3 April 1995 MC being one example (see Table 3 and Fig. 5). However, only a few such cases are expected to be of high quality (i.e. with Q_0 = 1 or 2), an assertion that must be tested after separate MC parameter fittings. If they are not bona fide MCs, they are what we refer to as cloud-like regions (see Sect. 4.3 for a definition of cloud-like events).
9. By use of the Loose set of criteria the automatic identification scheme found a significantly greater number of MCs, or cloud-like events, in 1999 than were previously identified by visual inspection (only 4 cases) (see Fig. 10), but many of these events are significantly shorter in duration, and less impressive in other respects as well, than those of the MFI set. However, because of the way they were chosen, we expect them generally to be associated with solar transient events, even if they are not bona fide MCs (we leave that question open). And these new false positive events may be important in explaining the low number of 1999 MCs. Perhaps there was a genuine decrease of bona fide MCs in 1999, but there may also have been a change in the character of the events, making them more difficult to identify visually. As Fig. 10 also shows, the region from just before the start of year 2000 to late-year 2001 indicates an even larger disagreement between the two sets. The linear correlation coefficient for the two sets for the full period of 8.6 years was only 0.58.
10. The "false positives" found by the identification scheme may present us with some new kind of MCs, or at least cloud-like regions, that are interesting and require in-depth examination to understand their nature and how they may be related to solar events, for example. For this reason we have developed a Webpage, as part of the WIND/MFI Website, that lists the start/end times for all of the N_AUTO = 183 regions found according to the Loose criteria, i.e. for the cases shown as white or light gray bars in Fig. 10. The URL for the Webpage is: http://lepmfi.gsfc.nasa.gov/mfi/MCL1.html
11. For MCs of the "S⇒N" categories (i.e. 11, 12, 15; see Table 1), identifying passage is possible with at least near-simultaneous ground notification of that fact, along with providing (but not predicting) the value of minimum VB_Z at or near the front of the MC. For these types, however, a prediction can be made of when the interplanetary field component B_Z will reach a maximum.
The specific elements of the MC identification scheme chosen were the result of our desire to be faithful to the original MC definition, our experience with analyzing many MCs from many different spacecraft in different epochs, and much trial-and-error to obtain near optimum identification test-parameters for the criteria of Sects.4.1 and 4.3 (Table 4 in particular).We do not, however, claim that we have found the optimum set of identification test parameters.

Conclusions
We have developed a scheme for automatically and objectively identifying interplanetary MCs, or at least cloud-like regions, at 1 AU and applied it to WIND data. It is likely that the scheme is applicable to MCs over a broad range of distances from the Sun (e.g. see Bothmer and Schwenn, 1992, who examined MCs (also of low plasma beta) in the inner heliosphere using Helios data; and Mulligan et al., 1998, using PVO data), although different selection parameter-values are probably needed in the scheme for regions other than 1 AU, and its general applicability has not yet been proven. The scheme utilizes field and plasma criteria based on the original Burlaga (1988, 1995) definition of MCs and on many years of experience in studying their properties, from data taken at various parts of the solar cycle and at several distances from the Sun, but especially at 1 AU. Some of these properties were examined in Sects. 2.1 (B_Z distribution in MCs) and 2.2 (distribution of durations). This automatic identification scheme is applicable in either a prediction mode or a detection mode. Most of this study concentrated on the detection mode, but it laid the foundation for the scheme's use in a real time (prediction) mode for possible geomagnetic storm forecasting for MCs having a significant and negative B_Z late in the MC. It is partly in this connection that MC durations were examined, because any prediction/forecasting scheme will depend on confidence in our knowledge of the temporal aspects of MCs, duration in particular. Up until August 2003 only a small percentage of MCs were relevant to the kind of storm prediction described here (N⇒S types).
With the automatic identification scheme in the detection mode we were successful in capturing about 90% of 76 WIND MCs previously identified by our WIND/MFI team, but we also obtained a large set of magnetic cloud-like structures, which we refer to as "false positives."As Fig. 10 shows there are more than twice as many events found by the automatic scheme as there were in the MFI set, over 8.6 years of data, and the difference in the sets is most prominent during the period 1999 to about early 2003.With the belief that the false positives found by our automatic scheme may yet be examples of MCs that were overlooked in the visual inspection process or are some other, possibly new, interplanetary form of solar ejecta, our next step is to examine them in at least two respects: (1) in terms of their ability to satisfy a reasonable MC model and (2) for time-delay consistency with specific solar ejecta and/or indications of CME occurrences, in the manner of Berdichevsky et al. (2002).For the first study we will start by analyzing these regions using the Lepping et al. (1990) MC fit-parameter model in its basic form, so that Appendix A is applicable for judging their quality and for further comparison with the present MFI set of MCs.We also plan to apply a modified version of the model to the cloudlike structures.For example, ambient plasma-MC interaction is often responsible for significant field-compression (and increased |B|) in the early part of an actual MC which is not accounted for in the present model.The time-delay test (second test above) will be important whether the structure is a bona fide MC or not, in that it could testify to the solar origin of the event and possibly even help, along with study no. 1 above, to separate a MC from a non-cloud ICME.
Because of the importance of negative IMF B_Z to storm forecasting (e.g. Burton et al., 1975; Detman and Vassiliadis, 1997), we examined the distribution of MC types (S⇒N, N⇒S, all S, etc.) generally seen in the WIND MC set, but with a focus toward developing the specific prediction scheme described here (i.e. for the N⇒S type). Around the year 2007 this type of MC should be relevant for the kind of storm forecasting described here, i.e. for predicting the −|B_Z| (= B_s) that occurs late in a MC from the early part of the MC's profile, in order to forecast D_st (e.g. Wu and Lepping, 2002, 2005; Wu et al., 2003).
Finally, Cane and Richardson (2003) identify 214 ICMEs from the WIND-ACE period, which is of the same order as our 183 total "edited" cases of MC-like regions, but they consider a somewhat shorter period (1996-2002) and are not using the same criteria for identification, because they are attempting to identify ICMEs, not strictly MCs, which the authors make clear. (Perhaps their 214 events should be compared to our ≈146, where we prorate 183 to 146 because of the period of 7 years considered by Cane and Richardson compared to our 8.6 years.) Even though there is some relationship (but with debatable details) between MCs and ICMEs at a given location, trying to make any unambiguous connection between the Cane and Richardson findings and ours would be difficult, especially where details in start/end times are concerned. In fact, they state the belief that MCs are a subset of ICMEs and that their relationship changes with the solar cycle.

Appendix A A scheme for quality estimation
For measurement of quality (Q_0) of the MC fitting (Lepping et al., 1990) we define some useful quantities (see the model fit-parameters in Sect. 2.3 of the text):
With these definitions we can reasonably designate "quality" in terms of Q_0: 1 for excellent/good, 2 for fair, and 3 for poor. The values used for the discriminating features among Q_0 = 1, 2, and 3 were mainly developed from experience in applying the MC model of Lepping et al. (1990). Notice that no thermodynamic properties, such as plasma beta, density, or bulk speed, are used.

Fig. 1. Sunspot number vs. time, in monthly average form. The recent peak in sunspot number is very broad, covering approximately the years 2000-2002, and could be, in fact, a double peak.

Fig. 3. Distributions of MC durations for 82 WIND MCs according to their quality (Q_0); see Appendix A for the definition of Q_0 (1 = good, 2 = fair, and 3 = poor). The center set is for the combination of good and fair sets, the set on the right is for poor quality cases, and the set on the left is for the total number of cases.

Fig. 5. Observations of the program-identified MC of 3-4 April 1995 in terms of 30 min averages of the magnetic field (dots), in Cartesian coordinates for the top three panels, and in field magnitude (|B|), latitude (θ_B), and longitude (φ_B) for the bottom three panels, all in GSE coordinates. The solid black curve is the Lepping et al. (1990) MC model-fit to this event, which applies only within the dotted vertical lines. The model-associated fit parameters are listed in Table 3.
T h >) over the full duration of the black region must be ≤30 km/s.(The central 1/3 of the black region was chosen for a later trial, because investigation of the MFI set of MCs showed that V T h has an occasional elevation near the MC's boundaries causing the full-duration average to be an unreliable indicator of the presence of a MC.The middle 1/3 was more consistent with low values of V T h .See panel 10 of Fig.6 (V T h) for a qualitative indication of this point) 6.The latitudinal difference angle of the magnetic field, θ B (≡ (θ B )max−(θ B )min), must be ≥45 • (≡ θ B,L ), where (θ B )max and (θ B )min refer to maximum and minimum values of the latitude of the field anywhere within the black region.

Fig. 8. (Top) Summary of percent of all MC candidate identifications for the two sets of program input parameters, i.e. those related to the Strict (A) and Loose (B) criteria, in terms of agreements with visually determined MCs from WIND data (MFI set), failures, and false positives. The Strict criteria are expected to result in a smaller number of events than the Loose criteria, which is the case. (C) Summary of percent of N⇒S candidate identifications for only the Loose set of program input parameters.

"
sin β_CA V_C T/2)²), and where R is the MC's radius, T is the duration of MC-passage, V_C is the center speed of the MC (being close to the average speed across the cloud), β_CA is the angle between the MC's axis and the Sun-Earth line (where cos β_CA = cos φ_A cos θ_A), and Y_0 is the closest approach distance. That is, the value of the quantity "check" tests for consistency between two different means of obtaining estimates of the MC's radius, one directly from the fitting technique (R_0), where T was not needed, and the other (R_T) requiring duration. Other useful quantities are: ASF = |1 − 2t_0/Duration| × 100% (called the asymmetry factor, where 0% is excellent), and the average field components (taken across the MC) in Cloud coordinates, <B_X>_Cl, <B_Y>_Cl, <B_Z>_Cl. Ideally <B_X>_Cl should always be positive and <B_Y>_Cl should be zero, because of the definition of the MC coordinate system and the fundamental field structure of the force-free structure. Other factors are given below.

Q_0 = 3 category
We determine those MCs that fall into the Q_0 = 3 category first. This category arises from satisfying any one of the following: |check| ≥ 55%, |CA| ≥ 97%, <B_X>_Cl ≤ −1.5 nT, f flag = NOT OK, Diameter ≥ 0.45 AU, ASF ≥ 40%, cone angle β_CA ≤ 25° or β_CA ≥ 155°, and χ_R ≥ 0.215. The remaining cases, designated "set 1,2," are examined next, in order to differentiate the best cases (Q_0 = 1) from the intermediate (Q_0 = 2) ones.

Q_0 = 1 category
The Q_0 = 1 cases must satisfy all of the following criteria: |check| ≤ 20%, |<B_Y>_Cl| ≤ 3.0 nT, ASF ≤ 30%, 45° ≤ β_CA ≤ 135°, and χ_R ≤ 0.165. These are the "Q_0 = 1 set."

Q_0 = 2 category
The remaining cases within set 1,2, i.e. those not satisfying the Q_0 = 1 criteria, are put into category Q_0 = 2.
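The quality assignment above is a simple decision cascade, sketched below. The thresholds are those listed in this appendix; the `FitSummary` container and its field names are hypothetical, and the inputs (check, CA, field averages, and so on) are assumed to have been computed beforehand by the fitting program.

```python
from dataclasses import dataclass


@dataclass
class FitSummary:
    """Hypothetical container for the quantities used to judge one MC fit."""
    check_pct: float       # "check", in percent
    ca_pct: float          # closest approach, CA, in percent
    bx_cl: float           # <B_X>_Cl, nT
    by_cl: float           # <B_Y>_Cl, nT
    f_flag_ok: bool        # True when the f flag is OK
    diameter_au: float     # estimated diameter, AU
    asf_pct: float         # asymmetry factor, percent
    cone_angle_deg: float  # beta_CA, degrees
    chi_r: float           # chi_R of the model fit


def asymmetry_factor(t0, duration):
    """ASF = |1 - 2 t0 / Duration| x 100%."""
    return abs(1.0 - 2.0 * t0 / duration) * 100.0


def quality(fit):
    """Return Q_0 (1 = good, 2 = fair, 3 = poor) for one fitted MC."""
    poor = (
        abs(fit.check_pct) >= 55.0
        or abs(fit.ca_pct) >= 97.0
        or fit.bx_cl <= -1.5
        or not fit.f_flag_ok
        or fit.diameter_au >= 0.45
        or fit.asf_pct >= 40.0
        or fit.cone_angle_deg <= 25.0
        or fit.cone_angle_deg >= 155.0
        or fit.chi_r >= 0.215
    )
    if poor:  # any single failure puts the case in Q_0 = 3
        return 3
    good = (
        abs(fit.check_pct) <= 20.0
        and abs(fit.by_cl) <= 3.0
        and fit.asf_pct <= 30.0
        and 45.0 <= fit.cone_angle_deg <= 135.0
        and fit.chi_r <= 0.165
    )
    return 1 if good else 2  # all remaining cases are Q_0 = 2
```

Testing for Q_0 = 3 first mirrors the order described in the text, so the Q_0 = 1 test only ever sees cases from "set 1,2."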

Table 2. Distribution of magnetic cloud type by year: WIND mission

Table 3. The model-associated* fit parameters for the 3-4 April 1995 Magnetic Cloud

Table 4. Identification test parameter values used