Correlation studies for B-spline modeled F2 Chapman parameters obtained from FORMOSAT-3/COSMIC data

The determination of ionospheric key quantities such as the maximum electron density of the F2 layer NmF2, the corresponding F2 peak height hmF2 and the F2 scale height HF2 are of high relevance in 4-D ionosphere modeling to provide information on the vertical structure of the electron density (Ne). The Ne distribution with respect to height can, for instance, be modeled by the commonly accepted F2 Chapman layer. An adequate and observation driven description of the vertical Ne variation can be obtained from electron density profiles (EDPs) derived by ionospheric radio occultation measurements between GPS and low Earth orbiter (LEO) satellites. For these purposes, the six FORMOSAT-3/COSMIC (F3/C) satellites provide an excellent opportunity to collect EDPs that cover most of the ionospheric region, in particular the F2 layer. For the contents of this paper, F3/C EDPs have been exploited to determine NmF2, hmF2 and HF2 within a regional modeling approach. As mathematical base functions, endpointinterpolating polynomial B-splines are considered to model the key parameters with respect to longitude, latitude and time. The description of deterministic processes and the verification of this modeling approach have been published previously in Limberger et al. (2013), whereas this paper should be considered as an extension dealing with related correlation studies, a topic to which less attention has been paid in the literature. Relations between the B-spline series coefficients regarding specific key parameters as well as dependencies between the three F2 Chapman key parameters are in the main focus. Dependencies are interpreted from the postderived correlation matrices as a result of (1) a simulated scenario without data gaps by taking dense, homogenously distributed profiles into account and (2) two real data scenarios on 1 July 2008 and 1 July 2012 including sparsely, inhomogeneously distributed F3/C EDPs. Moderate correlations between hmF2 andHF2 as well as inverse correlations between NmF2 and HF2 are reflected from the simulation. By means of the real data studies, it becomes obvious that the sparse measurement distribution leads to an increased weighting of the prior information and suppresses the parameter correlations which play an important role regarding the parameter estimability. The currently implemented stochastic model is in need of improvement and does not consider stochastic correlations which consequently cannot occur.

Abstract.The determination of ionospheric key quantites such as the maximum electron density of the F2 layer N mF2, the corresponding F2 peak height hmF2 and the F2 scale height HF2 are of high relevance in 4-D ionosphere modeling to provide information on the vertical structure of the electron density (N e ).The N e distribution with respect to height can for instance be modeled by the commonly accepted F2 Chapman layer.An adequate and observation driven description of the vertical N e variation can for instance be obtained from electron density profiles (EDPs) derived by ionspheric radio occultation measurements between GPS and Low Earth Orbiter (LEO) satellites.For these purposes, the six FORMOSAT-3/COSMIC (F3/C) satellites provide an excellent opportunity to collect EDPs that cover most of the ionospheric region, in particular the F2 layer.For the contents of this paper, F3/C EDPs have been exploited to determine N mF2, hmF2 and HF2 within a regional modeling approach.As mathematical base functions, endpoint-interpolationg polynomial B-splines are considered to model the key parameters with respect to longitude, latitude and time.While the description of deterministic processes and the verification of this modeling approach have been published previously in Limberger et al. (2013), this paper should be considered as an extension dealing with related correlation studies.Relations between the B-spline series coefficients regarding specific key parameters as well as dependencies between the three F2 Chapman key parameters are in the main focus.Dependencies are interpreted from the postderived correlation matrices as a result of 1) a simulated scenario without data gaps by taking dense, homogenously distributed profiles into account and 2) two real data scenarios at 1 July 2008 and 1 July 2012 including sparsely, inhomogenously distributed F3/C EDPs.Moderate, observation-based correlations between hmF2 and HF2 as well as an inverse correlation between N mF2 and HF2 are reflected by the observation information from the simulation.By means of the real data studies, it becomes obvious that the sparse measurement distribution leads to an increased weighting of the prior information and suppresses the observation-based correlations which plays an important role regarding the parameter estimability.The currently implemented stochastic model is in need of improvement and does not consider stochastic correlations which consequently cannot occur.

Introduction
One of the major tasks in ionospheric research activities in geodesy concerns the determination of physically relevant parameters.Gaining knowledge about the physical processes leads to the opportunity to describe the ionospheric behaviour in time and space, to monitor ionospheric processes and to perform space weather studies.One of the most important items in this context is the electron density (N e ) distribution which forms the focus in several ionospheric modeling approaches these days and enables the study of ionospheric storms, ion compositions and effects on radio communications between satellites and ground receivers.The ability to describe N e variations in space and time is an important step along the way of understanding still unsolved ionospheric phenomena affected by the system of Sun-Earth interactions and to improve existing ionosphere models.Various global (e.g., Hernández-Pajares et al. (1999); Azpilicueta et al. (2006); Todorova et al. (2007)) and regional ionosphere models (e.g., Dettmering et al. (2011)) describe the spatio-temporal distribution of the vertical total electron content (VTEC).Beside novelties in the application of different mathematical and physical modeling approaches, several modeling strategies nowadays are based on the combination of different measurement techniques to compensate for the diversity in the data resolution and to achieve a best possible measurement distribution.Terrestrial permanent networks, e.g., IGS (Hernández-Pajares, 2009;Hernández-Pajares et al., 2009) or SIRGAS (Sánchez et al., 2013), have been deployed for the tracking of Global Navigation Satellite Systems (GNSS) and provide slant total electron content (STEC) data with a high spatial and temporal resolution over the continents.Complementary, VTEC over the oceans can be derived from radar altimetry (RA) along the satellite orbits.On most of the current RA missions such as Jason-2, SARAL, Cyosat and HY-2A, a DORIS (doppler orbitography and radiopositioning integrated by satellite) receiver is active which is used for orbit determination but can also be considered to derive STEC from the signals transmitted by DORIS ground beacons (Dettmering et al., 2014).The total electron content (TEC) is commonly considered as the backbone of ionosphere models, but its integral characteristic (TEC is defined as the integral of the N e along a specific signal path) makes the TEC observable insensitive for the description of the N e distribution, in particular with respect to height.On the contrary, empirical models such as NeQuick (Nava et al., 2008) or the International Reference ionosphere (Bilitza et al., 2011) provide global 4-D N e descriptions, mainly driven by ionosonde parameters such as the layer dependent critical frequencies f oE (E-layer), f oF 1 (F1-layer), f oF 2 (F2-layer) or the maximum usable frequency factor M(3000)F2 (=ratio of maximum usable frequency at a distance of 3000 km and f oF 2).Besides, several assimilation schemes of different complexity and relying on different kinds of data have been developed.For instance GAIM (Schunk et al., 2001) and EDAM (Angling, 2008).In order to obtain information of the N e distribution by satellite measurements, ionospheric GPS radio occultations (IRO), see e.g.Kirchengast et al. (2004) or Jakowski et al. (2004), can be considered as an adequate observation with height sensitivity.Relevant information of the N e can be derived e.g. from GPS signal delay changes during an occultation event caused by the atmospheric influences.IRO can be used to derive electron density profiles (EDPs) and therfore allow to model the N e distribution by physically relevant key parameters with height relation.For instance the F2 peak height hmF2 and F2 scale height HF2 play an essential role in this context and contribute to the definition of the vertical N e profile shape.Various model approaches have already been proposed and just recently published papers shall be mentioned: Altadill et al. (2012) modeled hmF2 from globally distributed ionosonde measurements by means of a spherical harmonic analytical model for quiet solar conditions, Hoque and Jakowski (2012) pro-posed the operational Neustrelitz peak height model based on 13 coefficients with only few empirically fixed parameters and driven by ionosonde data plus IRO measurements and Brunini et al. (2013) studied and compared different techniques to estimate hmF2.Moreover, Alizadeh (2013) used spherical harmonic expansions to model the N e distribution globally from the combination of different satellite based observation techniques and Limberger et al. (2013) dealt with the estimation of the three F2 Chapman parameters N mF2, hmF2 and HF2 in terms of a polynomial B-spline representation from IRO-derived electron density profiles (EDPs) which has afterwards been extended by Liang et al. (2014) for the usage of EDPs together with ground-based GPS data.It stands to reason that, depending on the choice of key parameters, the question of parameter interdepencies arises soon and shall form the emphasis of the investigations in this paper.Several studies, dealing with the issue of correlations between ionospheric parameters, are purely based on ionosonde measurements.Just to mention a few: Zhang et al. (2006) and also Liu et al. (2006) studied correlations between the ionogram derived Chapman scale height H T (or Hm, respectively) with the F2 peak parameters f 0F 2 and hmF2, the bottomside F2 layer thickness parameter B0 and the slab thickness τ ; Lee and Reinisch (2007) investigated scale height variations from Jicamarca ionograms during quiet solar conditions and analyzed thereby appearing correlations with f 0F 2, hmF2 and B0.Instead of using ionosonde measurements, a data basis consisting of EDPs observed by the low Earth orbit (LEO) mission Formosa Satellite 3 and Constellation Observing System for Meteorology, Ionosphere, and Climate (FORMOSAT-3/COSMIC, henceforth F3/C) (Rocken et al., 2000) has been established here.Investigations of parameter interdependencies on the regional scale are carried out for three F2 Chapman key parameters which are of fundamental importance in ionospheric research: The maximum electron density N mF2 of the F2 layer, the F2 peak height hmF2 and the F2 scale height HF2.These F2 Chapman parameters are in particular required for the description of the F2 layer which is at least in large part contained in the F3/C profiles -the six F3/C satellites fly in orbit heights of approximately 800 km.The modeling fundamentals leading to the determination of these key parameters have already been described in Limberger et al. (2013) and the intention of the subsequent investigations are supplementary with emphasis on correlation studies.The basic principles of the modeling concept and parameter determination will therefore be repeated in the initial Section 2. Starting with a preliminary adjustment model (Sect. 3) that is applied to a simulation scenario (Sect.4), first studies based on a dense, homogenous measurement distribution without data gaps are performed.According to the subsequent real data scenario (Sect.5), the adjustment system is extended (Sect.6) and will be applied for studies during quiet conditions at 1 July 2008 (Sect.7) and increased solar activ-ity at 1 July 2012 (Sect.8).The paper will finally end up with some final conclusions in Sect.9.

Modeling the vertical N e distribution
The description of the vertical N e distribution follows from an adapted α-Chapman function proposed by Jakowski (2005) and considers a F2 Chapman layer with an additional plasmasphere expansion given by (1) Model parameters are the maximum electron density N mF2 of the ionospheric F2 layer, the corresponding F2 peak height hmF2 and a constant F2 topside scale height HF2 which constitute the key quantities for this study.The simplifying assumption in considering a constant scale height has been proposed by Reinisch at al. (2004) and for instance considered in Stankov and Jakowski (2005) as plasma scale height in the topside ionosphere.The description of the plasmasphere follows from the basis density N0P and plasmasphere scale height HP where proportionality between NmF2 and N0P has been taken into account and HP is fixed at HP = 10 4 km (h ≥ hmF2) and HP = 10 km (h < hmF2), respectively, to allow for a smooth transition between F2-and plasmasphere layer.The model approach presented here is in particular suited for regional applications since localizing endpoint-interpolating polynomial B-splines are exploitet to model the target quantities (Schmidt, 2007).NmF2, hmF2 and HF2 are expressed as tensor products of three 1-D polynomial B-spline functions related to longitude λ, latitude ϕ and time t with unknown series coefficients.The formulation in terms of B-splines yields for hmF2 and can be accordingly obtained for NmF2 and HF2.The validity of B-splines for ionospheric VTEC modeling, in particular when dealing with inhomogenous distributed observations, has been successfully demonstrated in Schmidt et al. (2011) through comparison with spherical harmonics series expansions.Moreover, a modelling approach for regional VTEC modeling has been published by Dettmering et al. (2011) and later been enhanced by Limberger et al. (2013) and Liang et al. (2014) to determine F2 Chapman key parameters, based on Eq. (1), aiming a regional 4-D N e representation.Thus, the target quantities to be determined are the B-spline series coefficients d J λ ,Jϕ,Jt k1,k2,k3 depending on the B-spline levels J λ , J ϕ and J t which are typically adapted to the average measurement density and finally define the degree of smoothing and model resolution.The number K(J) of coefficients related to λ, ϕ and t can be calculated from K = 2 J + 2 and the total number of unknowns (for a specific key parameter) accordingly results from N u = 2 J λ + 2 Jϕ + 2 Jt + 6.In general, all series coefficients are collected in the vec- T that is defined in a Gauss-Markov model where it is important to notice that, due to the non-linearity of Eq. ( 1), a linearization is required and no absolute coefficients but corrections ∆ d for initial series coefficients d 0 have to be estimated.

Preliminary adjustment system
Taking the linearized model into account, the estimated set of coefficients results from an iterative estimation procedure via with N = A T P A and includes the design matrix A, positive definite observation weight matrix P and reduced observation vector L containing the differences between observed N e,F 3/C and initial electron densities N e (d 0 ).d 0 contains initial values for the series coefficients and is adapted after each iteration step it by the outcome of the previous iteration it − 1.Here, the initial coefficients d 0 are derived in a pre-processing step from the International Reference Ionosphere 2007 (IRI07) (Bilitza, 2000;Bilitza et al., 2011).Values for N mF2 as well as hmF2 are directly accessible and, after Wright (1960) and Davies (1990), enable the derivation of HF2 = τ /4.13 by taking the slab thickness τ = V T EC/N mF2 into account (Jayachandran et al., 2004).Based on the initial IRI07 dataset, a homogenous data grid can be established allowing for the determination of initial coefficients d 0 .
Concerning the parameter estimability, Limberger et al. (2013) showed that the presented modeling approach allows for a simultaneous determination of B-spline series coefficients regarding the three related F2 Chapman key parameters N mF2, hmF2 and HF2 by taking advantage from the sensitivity of EDPs.
In this paper, correlations between the B-spline series coefficients and also F2 Chapman key parameters are studied in detail.A reliable separablity of the key parrameters is certainly required for obtaining realistic values and asure a safe convergency of the procedure.To discuss this question, the correlations between key parameters shall be analysed on basis of the correlation matrices K xx and K tt for coefficients and target key parameters, respectively, where the latter one results from the application of the error propagation law (EPL).
The subscripts xx and tt will henceforth been used to identify the association of the matrices to the original unknowns, B-spline series coefficients or target key parameters, respectively.
The covariance matrix of the unknown series coefficients Σ xx = N −1 is the inverse of the normal equation matrix.
The diagonal elements of Σ xx are the variances σ 2 dj for j ∈ {1, . . ., N u } of the series coefficients d j .For the vectorvalued function f = F d, which is representing a set of key parameters as linear functions of the estimated B-spline series coefficients, the EPL can be applied to derive the corresponding covariance matrix Σ tt of the key parameters as Σ tt = F Σ xx F T .F is a transformation matrix and consists of partial derivatives of the key parameters with respect to the series coefficients resulting from Eq. ( 2).This means particularly that F contains the spline tensor products -for instance ∂hmF2/∂d The correlation matrices K xx and K tt with K xxr,c = Σ xxr,c /σ r σ c and K ttr,c = Σ ttr,c /σ r σ c are defined as normalized covariance matrices where r and c indicate the row and column indices here.

Simulation
The appliance of the adjustment model presented in Eq.
(3) to determine the B-spline coefficients implies the requirement of a dense measurement distribution for a scenario without data gaps.This will not be the case for a real distribution of N e profiles in the foreseeable future, but can be simulated to perform relevant studies of observationbased parameter correlations.Therefore, a 24 hour timespan at 1 July 2008 and a region of [250  ] in latitude (South America) has been selected.Homogeneously distributed EDPs with a resolution of 10 • (λ) × 10 • (ϕ) × 2 h are simulated, which is dense enough for the selected B-spline level of J λ = 2, J ϕ = 2 and J t = 3 to allow the estimation of all coefficients without any prior information.According to these levels, the total number of unknown coefficients amounts here 360 per key parameter and 1080 in total.Hence, the dimension of K xx is defined by 1080 × 1080.For the subsequent investigations, F has been computed for fixed epochs and a geographical resolution of 5 elements for a specific epoch including all three key parameters.Each simulated profile contains simulated N e observations between 100 km and 800 km height with an altitude sampling interval of 10 km.Each simulated N e observation is computed from Eq. ( 1) where the N mF2, hmF2 and also HF2 parameter values are computed from the initial coefficients d 0,N mF2 , d 0,hmF2 and d 0,HF2 .For each of these key parameter vectors, an artificial systematic bias has been introduced that is contained in the simulated observations.Since IRI07 has been used to derive d 0 , these offsets can be interpreted as corrections with respect to IRI07 that are reflected in the observations.The systematic biases then are retrieved within a closed loop procedure.2% of the mean N e per profile are considered as the standard deviation to simulate a measurement error with zero mean.The correlation matrices of the simulated scenario with respect to the coefficients and key parameters are depicted in Fig. 1 and Fig. 2. K xx contains correlations between all coefficients (i.e.  for all key parameters, geographical positions and epochs) where block submatrices are confined by black boxes and labeled with the affiliation whether referring to N mF2, hmF2, HF2 or to an off-diagonal submatrix with interparameter-coefficient correlations.All coefficients are ordered by longitude (in the innermost loop) followed by latitude (in the second loop) and time (in the outer loop).The sorting scheme will be explained afterwards in the matrix composition passage of Sect.7 and shall be by-passed at this point.Correlations are weakly visible in Fig. 1 along the diagonal, most probably related to B-spline dependencies, and in particlar between the N mF2 and HF2 coefficients.Although visible in both plots, the occurence of the correlation structure becomes clearly enhanced for K tt in Fig. 2. The small blue off-diagonal patches within each key parameter block indicate a negative correlation of the scaling coefficients for neighbouring B-splines, i.e. that the increase of a series coefficient value for a specific spline leads to the decrease of another correlated coefficient.This can be anticipated since one key parameter is always defined by three overlapping splines.Further, the colors indicate that in particular negative correlations appear between N mF2 and HF2 and, although slightly weaker, between N mF2 and hmF2.Moderate positive correlations become visible between hmF2 and HF2.Nevertheless, the correlations of nearby coefficients and parameters are the most prominent ones.The corresponding numerical values are given by Tab. 1 for the key parameters whereas, since the coefficient correlations are considerably weaker, a corresponding table for K xx is not shown here.
In the first row, Tab. 1 provides the total minimum and maximum value of K tt , afterwards detail information is listed.The grey colored lines show minimum and maximum values according to the specific submatrices with key parameter relation.Further, the correlations within the specific submatrices are assigned to intervals of [-1,0.7],]-0.7,-0.3],]-0.3,0 [, [0,0.3[, [0.3,0.7[ and [0.7,1] and are given as percentages of the number of elements within a specific submatrix.The diagonal ones are neglected from these values.All percentage values are rounded to the first digit after the decimal dot.Strong percentages of negative correlations are accentuated by blue background, percentages corresponding to positive correlations by red background.Some cells in Tab. 1 have been colored in green to highlight increased correlations beyond the ] − 0.3, 0.3[ interval.As already claimed, those correlations originating from Bsplines (appearing close to the diagonal of the K xx and K tt matrices) as a mixture of characteristic positive and negative dependencies due to the B-spline nature are very conspicuous.Further, the height dependent parameters hmF2 and HF2 show moderate positive correlations with a maximum of 0.63 are detected.4.6% of the correlations between hmF2 and HF2 are located in the [0.3,0.7[interval.This output is based on a simulated scenario and purely shows observation-based correlations between model parameters that are indirectly obtained from IRI07 without input of F3/C measurements.Since the IRI data is mainly driven by ionosondes, the results are in agreement with other correlation studies that apply ionosonde measurements.For instance with Liu et al. (2006) and Zhang et al. (2006) who stated,  close to the dip equator and also located inside our study area during quiet solar conditions and found high correlations between hmF2 and HF2.It has to be kept in mind that these correlations are latitude dependent and after Lee and Reinisch (2007), they are more obvious close to the dip equator than in low-latitudes.However, correlations between hmF2 and HF2 generally indicate that the same physical processes are influencing the hmF2 and HF2 variations, (Liu et al., 2006).
Negative correlations between N mF2 and HF2 appear in Fig. 2 where the minimum correlation yields -0.62 and 5.2% of the corresponding correlations can be found in the ]-0.7,-0.3]interval.Zhang et al. (2006) and Lee and Reinisch (2007) reported the existence of only marginal correlations between the critical frequency f oF2 and HF2 in low-latitudes.Based on the relationship N mF2 the maximum electron density is linearly dependent of f oF2 and the poor correlations between them should consequently also hold for N mF2 and HF2.
In our model, approximate values for HF2 are straightforward retrieved from τ .Correlations between the plasmaor topside scale height and τ have been found in Stankov and Jakowski (2005) by analysing CHAMP ionospheric radio occultation data and also studies of Zhang et al. (2006) with ionosonde data indeed showed very high correlations between τ and HF2 supporting the validity of the retrieval method.The correlation between VTEC and N mF2 (both quantities are used to compute τ ) is also certainly very high, after Kouris et al. ( 2008) more than 0.9, but not one.This eventually causes the remaining negative correlations between N mF2 and HF2.Liu et al. (2006) actually found weak negative correlations between f oF2 and HF2 with local time dependency from studies with ionosonde data of the Wuhan station located in low latitude at λ = 114.4• , ϕ = 30.6• .In this simulation, the preliminary model of Eq. ( 3) with the ability to handle scenarios without gaps has been considered but reality looks different and commonly the distribution of measurements is sparse and inhomogenous.Therefore, the investigation are continued now by applying F3/C EDPs.

Real data investigation outline
According to the simulated scenario described in Sect.4, the South American region covering a study area between [250  (Tsai et al., 2001;Tsai and Tsai, 2004;Tsai et al., 2009) are considered as input data.The removal of outliers and corrupted profiles by a rather conservative data screening has been accomplished within an additional pre-pocessing step.

It mainly comprises
• the detection of larger jumps, here > 50 km, in the altitudes between consecutive N e measurements of a specific EDP, • the verification of data availability around the peak region within ± 50 km to guarantee a proper description of the profile shape in the peak region, • the screening of hmF2 to be located within the physically reasonable altitude interval of [150 km, 500 km].
Further, only profiles that reach a minimum altitude below 250 km and maximum altitude above 500 km are taken into account to allow for a rather complete description of the vertical N e distribution.One drawback of the simple mathematical model given by Eq. ( 1) is the neglection of an E-layer consideration which led for some cases to mismodelling effects, see (Limberger et al., 2013).Therefore, only measurements above 150 km are considered here.Based on the retrieval theory to derive EDPs from ionospheric radio occulation measurements, the position of each N e value coincides with the closest point to Earth located on the line-of-sight between GPS and COSMIC at the corresponding observation epoch.Therefore, the time and position of the measurements can vary during an occultation even by around several minutes and up to around 10 deg in some cases.The derived profile decribes the N e distribution along a slant direction.Since this model aims the determination of F2 related parameters and benefits primarily from observations in the peak region, we assume the verticality of the profiles by considering the latitude, longitude and time of the peak measurement (=observed N mF2).Another natural effect resulting from the retrieval along the impact parameter is the availablility of much denser observations in higher altitudes.These observations, however, contain only weak information about the F2 peak quantities.In order to reduce their influence, measurements above 650 km are neglected here.Finally, a total number of 123 F3/C profiles (52% rejected) for the 1 July 2008 and 96 F3/C profiles (38% rejected) for 1 July 2012 remain after the data screening.Most of the rejected profiles have been removed because of the high measurement noise, due to incompleteness or because they are affected by a dominant E-layer influence.Especially by introducing an E-layer model, it can be expected that the number of rejections will decrease.
The distribution of all profiles for the 1 July 2008, here independent of the corresponding measurement epoch, is exemplarily depicted in Fig. 3 and already gives a first impression of the sparsity and inhomogeneity of the data.The distribution of profiles for 2012 is comparable and not shown here.It should be noted, that two precise orbit determination (POD) as well as two occultation (OCC) antennas belong to the payload of each F3/C satellite where all of them can be used for the retrieval of EDPs from radio occultation.Therefore profiles may belong to different antennas but are located approximately at the same location and time.Hence, counting the orange squares in Fig. 3 does not result in the total of 123 EDPs.And although the number of input profiles in 2008 is slightly higher than 2012, the sparcity of the distribution does not become significantly improved.
To derive a combined graphical relation between the localization of the EDPs with their corresponding measurement epoch, Fig. 3 has firstly been subdivided into 18 • × 18 • wide cells.Afterwards, the 24 h investigation timeframe has been divided in 2 h subframes and all profiles belonging to a specific spatio-temporal cell were counted.This situation is depicted in Fig. 4 as a combination of a graphical and numerical expression and gives a good impression of the data distribution.Again, as also depicted in Fig. 3  The total number of profiles measured within each cell reaches from 1 to 13 and shows that the B-spline levels with respect to longitude and latitude have to be chosen rather low to bridge those regions without data.This situation is also visible from Fig. 3 where some cells are nearly empty or at least the profiles are located close to the cell border.Therefore we decided to adapt an average spatial distance between profiles of around 18 • in latitude and longitude according to the cell width.Depending on this assessment, the relation with the sampling ∆si and boundaries si min , si max has been aplied to estimate a suitable B-spline level.Eq. ( 4) has been proposed by Schmidt et al. (2011) and leads to Hence, the B-spline level has been adapted to the situation by selecting J λ = J ϕ = 2, meaning that the number of B-splines yields K(J ϕ ) = K(J λ ) = 6 in latitude and longitude direction.
To find an adequate level for the time, all profiles belonging to a specific time segment in Fig. 4, independent of the position, have been counted and plottet in Fig. 5.It demonstrates the sparcity of the data distribution in time; for instance between [16:00, 18:00] UT in 2008 there are only three profiles available and between [02:00, 04:00] in 2012 even only one.
Based on this distribution, an approximate time sampling of 2.5 h has been chosen and, depending on Eq. ( 4), leads to a Bspline level of J t ≤ 3.1 −→ J t = 3 resulting in K(J t ) = 10 splines with respect to time.The B-spline levels found here has also been considered for the studies of 2012.

Extended adjustment system
In order to process sparse and inhomogenous data the adjustment system, preliminary introduced by Eq. ( 3), has to be extended to allow for the bridging of data gaps in a reliable way by prior information.The extended normal equations yield where M it = µ − d it−1 is computed from the difference between the prior information vector µ and the coefficient solution of the previous iteration it − 1 in order to constrain the solution in regions without observations to the prior information.Here, the prior information vector µ (just like the initial coefficients d 0 ) is derived in a pre-processing step from IRI07 which is considered as the background model but it shall be kept in mind, that the data sources for initial and prior information must not necessarily be the same.W is an extended weight matrix of the prior information defined by with block diagonal structure involving the individual weight matrices P N mF2 , P hmF2 and P HF2 of the key parameter related coefficients.Advance information on dependencies between observations or key parameters as well as knowledge allowing for a individual weighting were not available and therefore identity matrices have been considered to apply equal weights.Eq. ( 5) considers variance factors for F3/C (σ 2 F 3/C ) and the prior information groups (σ 2 N mF2 , σ 2 hmF2 , σ 2 HF2 ) by means of W .These variance components serve as relative weighting factors and are determined within a variance component estimation (VCE) (Koch, 1999;Koch and Kusche, 2002).It follows on the contrary, that the system does not account for covariance factors which means, that no correlations between different group of unknows or observations are taken into account.From the availability of a dense, homogenous measurement distribution -basically without data gaps -it can be expected, that the VCE strongly decreases the prior information since no background data is required to solve for the unknown coefficient corrections.This actually happened for the simulation in Sect. 4 where observation-based correlations occured while no prior information was involved.For a more realistic scenario the situation is typically dominated by an inhomogenous, sparse measurement distribution and the system iteratively adapts the prior information weight in order to balance with the observation weight to bridge the data gaps.This shall be kept in mind since possible information coming from the observations may be suppressed in a scenario with a relatively high prior information weight.Especially when thinking on the covariance factors, which are not considered in the stochastic model so far, the relevant observation-based correlation information cannot be retrieved in such case.Needless to say, that the modified normal equations are taken into account in such real data scenario.The resulting correlation matrices will still be denoted by K xx and K tt .For further details on the estimation procedure for coefficients and variance components it shall be referenced to Limberger et al. (2013).The modeling concept presented here is based on the idea to improve existing background information in regions where observations are available and bridge data gaps by prior information whenever it is necessary.The extend of the corrected regions depends on the chosen B-spline levels which are typically adapted to the average measurement density.

1 July 2008, Low solar activity
Starting with the 1 July 2008, the studies are subdivided into a correlation analysis on the B-spline series coefficient stage (K xx ), followed by investigations on the key parameter correlations (K tt ) and finally a passage about the matrix composition structure.

Coefficient correlations, K xx
The correlation matrix of the coefficients K xx , computed for the 1 July 2008 study, is given by Fig. 6.It has been claimed previously in Sect.6, that key parameter covariances cannot be extracted from this system since they are not considered in the stochastical model and observation-based correlations may be suppressed because of a strong prior information.This effect becomes obvious from the correlation matrix K xx in Fig. 6 by the absence of correlations between coefficients related to the different key parameters.The distribution of F3/C measurements is inhomogenous and sparse which consequently leads to an increased priori information weighting and depression of interparameter correlations.It can be vaguely discerned that only correlations originating from B-splines appear close to the diagonal of K xx as a mixture of characteristic positive and negative dependencies due to the B-spline nature.According to the approach in Sect.4, the correlation structure should be amplified by the step from K xx to K tt .

Key parameter correlations, K tt
After first investigations on the coefficient level, the correlation matrix of the key parameters K tt is obtained and illustrated by Fig. 7. K tt is given as a snapshot related to the 1 July 2008 at 12:00 UT here.Again, interparameter correlations cannot be apparent since they are not considered in the stochastical model but instead the individual parameter correlations are intensified.Eye catching blue/red colored patches along the diagonal become visible identifying correlations between nearby key parameters.The width of these patches depends on the resolution of the computed K tt matrix.Here, a spatial resolution of 5 • × 5 • degree has been chosen.The corresponding numerical information for K tt is provided in Tab. 2. Total minimum and maximum correlations are close to ±1 reflecting in particular the linear dependencies between neighbouring key parameters which are highly correlated due to the low B-spline level.This can also be interpreted from the six green colored cells that show significant positive correlations in the [0.3,0.7[ and [0.7,1] intervals of N mF2, hmF2 and HF2.

Matrix composition
The correlation matrix structure has been skipped in Sect. 4 in context of the simulation and shall be explained in detail now.Therefore, the upper triangular of the hmF2 submatrix belonging to Fig. 7 is exemplarily depicted in Fig. 8.The subsequent descriptions also hold for the N mF2 and HF2 matrices, of course.The interior order of the matrix elements, valid for K tt , is following the order of longitude in the innermost and latitude in the outer loop.Figure 7 is given as a snapshot at 12:00 UT and therefore, temporal correlations are not visible.To avoid misunderstandings, font tags have been added in Fig. 8 to describe the arrangement of the di-  agonal elements where each font tag refers to a specific latitude position.Two small triangular shaped areas have been framed to group all elements belonging to a specific latitude position.Consequently, pixels within such a triangle describe the longitude correlations of hmF2.As an example, the correlations between λ = 250 • and λ = 340 • at ϕ = −55 • has been depicted in a zoomed detail representation.Here, every pixel along the diagonal refers to a longitude position.It follows that, due to the 5 • seperation between key parameters, the diagonal consists of 19 elements.It becomes obvious that the longitude correlations are very large for side-byside elements and that interior elements show a large negative correlation indicated by the blue region.The same pattern appears multiple times in Fig. 8 and can be explained best by considering the spline distribution which is depicted in Fig. 9 as an example of latitude.With regard to the inves- tigation area covering 90 • × 90 • , it follows that the distance between neighbouring B-spline peaks is around 20 • (≈ 2000 km) in latitude and longitude direction for the defined levels J λ = J ϕ = 2.This applies at least for the "regular" splines in the interior region and is different for the two first and two last endpoint-interpolating splines.(K(J t ) = 10 splines are distributed in time on the interval of 24 h and spline peaks are placed around every 160 min, but this is of no relevance here.)Figure 9 represents the spline distribution in the latitude direction with ≈ 20 • peak separation.A single spline function, e.g. the red colored spline φ 4 2 in Fig. 9, is again illustrated in Fig. 10a and shows that the non-zero interval for a regular interior spline based on level J ϕ = 2 reaches ≈ 65 • in a total interval of a 90 • .Again, it has to be kept in mind that the boundary splines are exceptions.For this scenario, J λ and J ϕ are chosen equal and consequently the influence zone in the latitude and longitude axis becomes circular (diameter is ≈ 65 • ), indicated by the birdview perspective in Fig. 10b.Under consideration that the two interior splines cover the whole region it is easy to imagine that a negative correlation may appear between them and consequently also between key parameters located in the interior region.This relation can be seen clearly from the longitude correlations in 8.The effect, however, is not so obvious for the latitude correlations.Additionally, two black bands related to ϕ = −35 • and ϕ = 30 • are depicted in relation to the knowledge that a single spline already covers more than two third of the region and following Fig. 10a (or also Fig. 10b) approximately 65 • .The band therefore shows how the correlations evolves between directly neighbouring up to 65 • seperated parameters.A steadily decreasing trend is visible until the correlation converges to zero towards the intersection of both bands where the distance between the parameters reaches 65 • in latitude direction.The appearance of internal correlations between the variables of a specific key parameters resulting from the B-spline modeling approach are expected and not astonishing.The absence of interparameter dependencies in this scenario can be explained by studying the observation and parameter weighting resulting from the VCE.The normal equations are given by Eq. ( 5) where variance components for F3/C and the prior information, i.e. a global variance factor for each group of a priori coefficients with respect to N mF2, hmF2 and HF2, are included.All observations and the prior information are equally weighted, i.e.P and P N mF2 , P hmF2 as well as P HF2 are introduced as identity matrices, while the VCE controls the relative weighting between F3/C and the prior information.Following the investigations in Sect.5, the EDP distribution is very sparse and the adjustment system has to deal with large data gaps.Hence, the ratio between COSMIC and the prior information weighting is rather small -due to a small σ hmF2 (or as well σ N mF2 and σ HF2 ) in comparison to σ F 3/C -and consequently a relatively high prior information weighting is considered.In other words, the diagonal elements of the extended weight matrix W , given by Eq. ( 6), contribute large values which are added upon the diagonal of the normal equations matrix N in Eq. ( 5).The influence of W on the normal equations therefore is very strong and suppresses the occurence of correlations originating from the observations.Besides, the stochastic model introduced here is incomplete and currently does not account for covariances between key parameters due to the fact that only diagonal weighting matrices have been introduced.Stochastic correlations therefore cannot appear.The absence of observationbased and also stochastic key parameter correlations is confirmed by Fig. 6 and Fig. 7.

1 July 2012, Increased solar activity
To proof the general validity of the conclusions obtained for the 1 July 2008 scenario during quiet solar conditions, the same investigation procedure has been accomplished for 1 July 2012, a day with slightly higher solar activity.All settings for the studies of the 1 July 2012 are maintained as for 2008 -the configuration for the data screening, B-spline levels and computation sequence -as explained in Sect. 5 and Sect.7. A detailed description of the matrix composition for the coefficient and key parameter correlation matrices K xx and K tt , respectively, has already been given in Sect.7. The results are comparable to the outcomes of 2008 and have not manifested any differences.Studies on the coefficient level are therefore neglected here and only K tt as well as the corresponding numerical values of the correlation distribution shall be studied at this point.K tt is depicted in Fig. 11 and obviously shows a similar pattern as Fig. 7.In accor- dance to Tab. 2, a numerical expression of the correlations is provided by Tab. 3. Again, the majority of correlations is located within the ]-0.3,0[ and [0,0.3[intervals.Around 9.3% and 4.7% of the correlations for the single parameters have been detected in the [0.3,0.7[ and [0.7,0.1]intervals, respectively.The results here again confirm the absence of correla-tions between key parameters resulting from the combination of the applied model approach with a sparse and inhomogenous measurement distribution.[0,0.3[ [0.3,0.7[ [0.7, [0,0.3[ [0.3,0.7[ [0.7,1] 0.0% 0.0% 31.9%68.1% 0.0% 0.0%

Conclusions
The investigations performed in this paper are provided as an extension of Limberger et al. (2013) and were strongly focused on studies of correlations between F2 Chapman key parameters N mF2, hmF2 and HF2 based on the use of endpoint-interpolating polynomial B-splines in a regional modeling approach.A simulated as well as a real data scenario were therefore taken into account.For the latter one, one day during quiet solar conditions at 1 July 2008 and another one in the solar maximum period at 1 July 2012 have been considered.Beside studies on the dependencies between the B-spline series coefficients referring to a specific key parameter, also interdependencies between F2 Chapman key parameters have been analysed.After comparisons between the simulated scenario (dense, homogenous datagrid) and the real data scenario (sparsely, inhomogenous distributed FORMOSAT-3/COSMIC electron density profiles), it can be concluded that the chosen preliminary stochastic model supports the separability of the different types of series coefficient regarding the use of sparse and inhomogenous distributed input data.It causes effectively a decorrela-tion by suppressing observation-based key parameter correlations and enables the reliable estimation of the key parameters by increasing the prior information weight by means of the VCE as a consequence of the sparse F3/C measurement distribution.However, that there is still enough input of F3/C information to correct the background information has been shown by Limberger et al. (2013) and Liang et al. (2014).Another important point in this context concerns the incomplete stochastic model where no covariances are considered.In combination with other observation techniques it can be assumed, that the estimability benfits from the different sensitivities of the techniques, leading to a (desired) downweighting of the prior information.Consequently, the importance of considering key parameter correlations goes along with the number and distribution of further observation techniques.Since the simulated scenario with coefficient correlations reaching a minimum of -0.62 between N mF2 and HF2 (5.2% of the correlations within the ]-0.7,-0.3]interval) and maximum of 0.63 between hmF2 and HF2 (4.6% of the correlations within the [0.3,0.7[interval) still was solvable, the implementation of an extended stochastic model is not urgent but definitely an important task which has to be discussed in near future.

Fig. 1 .
Fig. 1.Correlation matrix of the coefficients Kxx for the simulated scenario.

Fig. 2 .
Fig. 2. Correlation matrices of the key parameters Ktt for the fixed time moment 12:00 UT of the simulated scenario.Ktt represents correlations for parameters related to a grid with ∆ϕ = 5 • × ∆λ = 5 • resolution.

Fig. 3 .
Fig. 3. Spatial F3/C EDP distribution observed at 1 July 2008 over the study area in South America.
, the spatial 18 • × 18 • cells are visible.Each cell contains a clock symbol, indicated by the central positioned black circle, that represents 12 hours divided in 2 hour segments.Two numerical values are referring to each 2 hour segment, a first one on the inner and a second one on the outer circle.The inner value refers to the time between 00:00 UT and 12:00 UT, numbers on the outer ring are related to a 2 h segment between 12:00 UT and 24:00 UT.Additionally, a number inside the clock on green back-

Fig. 4 .
Fig. 4. F3/C EDP distribution related to spatio-temporal grid cells observed at 1 July 2008 in the South America region.

Fig. 6 .
Fig. 6.Correlation matrix of the coefficients Kxx for the 1 July 2008.Due to the symmetry of the matrices, only those block submatrices belonging to the upper triangular structure have been depicted.

Fig. 8 .
Fig. 8. Upper triangular structure of the hmF2 block submatrix located on the diagonal of Fig. 7 showing the internal hmF2 key parameter correlations for the fixed time moment of 1 July 2008, 12:00 UT.The entries are related to a ∆ϕ = 5 • × ∆λ = 5 • grid.

Fig. 10 .
Fig. 10.Representation of a single B-spline with non-zero interval of ≈ 65 • (a) and corresponding birdview perspective giving the extend of the spline function in latitude and longitude (b).

Table 1 .
Key parameter correlations contained in Kxx for the simulated scenario at 1 July 2008.Minimum and maximum correlations of the whole matrix as well as for the key parameter affiliated submatrices are depicted in the specific headlines (grey background).Percentage values related to different correlation intervals are listed beneath where the dominating correlation intervals are indicated in blue (negative) and red (positive) colored cells.Increased correlations beyond the ] − 0.3, 0.3[ sector are accentuated by green background.

Table 2 .
Key parameter correlations contained in Ktt for 1 July 2008

Table 3 .
Key parameter correlations contained in Ktt for 1 July 2012