Simulation of the Indian summer monsoon onset-phase rainfall using a regional model

This study examines the ability of the Advanced Research WRF (ARW) regional model to simulate Indian summer monsoon (ISM) rainfall climatology in different climate zones during the monsoon onset phase in the decade 2000–2009. The initial and boundary conditions for ARW are provided from the NCEP/NCAR Reanalysis Project (NNRP) global reanalysis. Seasonal onset-phase rainfall is compared with corresponding values from the 0.25° India Meteorological Department (IMD) gridded rainfall and NNRP precipitation data over seven climate zones (perhumid, humid, dry/moist subhumid, dry/moist semiarid and arid) of India to see whether dynamical downscaling using a regional model yields advantages over just using large-scale model predictions. Results show that the model simulates the onset phase in terms of progression and distribution of rainfall in most zones (except over the northeast) with good correlations and low error metrics. The observed mean onset dates and their variability are well reproduced by the regional model over most climate zones. The ARW performs similarly to the reanalysis in most zones and improves the onset time by 1 to 3 days in zones 4 and 7, in which the NNRP shows a delayed onset compared to the actual IMD onset times. The variations in the onset-phase rainfall during the below-normal onset (June negative) and above-normal onset (June positive) phases are well simulated. The slight underestimation of onset-phase rainfall in the northeast zone could be due to failure in resolving the wide extent of topographic variations and the associated multiscale interactions in that zone. Spatial comparisons show improvement of pentad rainfall in both space and quantity in ARW simulations over NNRP data, as evident from a wider eastward distribution of pentad rainfall over the Western Ghats, central and eastern India, as in IMD observations.
While NNRP under-represented the high pentad rainfall over northeast, east and west coast areas, the ARW captured these regional features showing improvement upon NNRP reanalysis, which may be due to the high resolution (30 km) employed. The onset-phase rainfall characteristics during the contrasting ISM of 2003 and 2009 are well simulated in terms of the variations in the strength of low-level jet (LLJ) and outgoing long-wave radiation (OLR).


Introduction
Forecasting of the Indian summer monsoon (ISM) rainfall during its different phases is important for providing agroadvisory guidance to the farming community, as 60 % of Indian agriculture is rain-fed. About 70 % of the Indian subcontinent receives 60-80 % of its annual rainfall during the southwest monsoon season between the months of June and September. Several global parameters, such as El Niño-Southern Oscillation (ENSO) and the Indian Ocean Dipole (IOD), and regional parameters, such as the sea surface temperature, land surface temperature, snow cover and soil moisture, influence the ISM and cause interannual variability (Kelkar, 2009; Kripalani et al., 2004). Information on the timing of the onset phase, advancement and the associated rainfall patterns over various parts of the country is important for initiating agriculture operations in various climatic subregions. Prediction of the rainfall during the onset phase of ISM is important as it impacts not only agricultural farming, but also water resources and power sectors. Circulation features, like the movement of the intertropical convergence zone, the Arabian Sea and Bay of Bengal streams of monsoon current and their spread to the Indian subcontinent, need to be assessed precisely (Hastenrath, 1994; Rao, 1976; Webster et al., 1998).
The onset of ISM occurs suddenly and is an indication of the commencement of the rainy season over the Indian subcontinent. During the onset of the monsoon circulation over India, dramatic changes occur in the large-scale atmospheric structure over the monsoon region (Joseph, 2012). A few of these changes include a rapid increase in daily rain rate, an increase in the columnar moisture and an increase in the strength of the low-level atmospheric flow. As for rainfall, the onset phase is marked by a sharp and sustained increase in rainfall at a cluster of stations along the coast of Kerala in southern India (Ananthakrishnan and Soman, 1989; Ananthakrishnan et al., 1967; Bhaskar Rao et al., 2008; Chakraborty et al., 2006; Goswami, 2012; Rao, 1976; Soman and Kumar, 1993). Remarkable changes in regional atmospheric circulation features occur around the onset time (Ananthakrishnan and Soman, 1988; Joseph et al., 1994; Krishnamurti, 1985; Pearce and Mohanty, 1984). An important phenomenon during the onset phase is a sudden increase of kinetic energy (KE) over the low-level jet (LLJ) region (55-65° E, 5-15° N), which is associated with a subsequent enhancement in precipitation over the monsoon region (70-110° E, 10-30° N) (Goswami, 2012). The interannual variability of the onset phase is related to, among others, sea surface temperatures (SSTs) over the south tropical Indian Ocean and western equatorial Pacific (Flatau et al., 2003; Joseph et al., 1994). The onset of ISM occurs through the progression of its two currents, i.e., the Arabian Sea stream, which strikes the west coast in Kerala (southern tip of India), and the Bay of Bengal stream, which strikes Assam and other northeastern states by 1 June. As per long-term rainfall climatology, the Arabian Sea stream arrives at the southernmost peninsula by 30 May, advances northward and extends over Gujarat by 30 June. On the other hand, the Bay of Bengal stream arrives over northeast India around 2 June and advances westward, covering western Uttar Pradesh by 30 June, with the monsoon establishing over northwest India around 15 July (Hastenrath, 1994; Rao, 1976; Soman and Kumar, 1993; Tyagi et al., 2011; Webster et al., 1998). As the ISM exhibits considerable interannual variability, the actual onset dates also vary about the mean dates.
A number of statistical and synoptic methods have been suggested for the prior estimation of the onset of ISM (Ananthakrishnan and Soman, 1988; Fasullo and Webster, 2003; Joseph et al., 2006; Xavier et al., 2007; Wang et al., 2009) in addition to dynamic model application. In the dynamical method, atmospheric general circulation models (GCMs) are used to simulate the summer monsoonal atmospheric circulation and associated rainfall. However, GCMs are limited in predicting the regional characteristics of the monsoon and its interannual variation, due to coarse resolution (Gadgil and Sajini, 1998; Gadgil et al., 2005; Kang et al., 2002; Krishnamurti et al., 2000; Krishnakumar et al., 2005; Wang et al., 2005). Regional climate models (RCMs) are used for dynamical downscaling of global model analysis/forecasts to study regional climate processes, regional climate change and its variability (Sylla et al., 2010; Gao et al., 2007; Giorgi et al., 2004; Seth et al., 2006). Application of RCMs has been proposed for better simulation of the ISM and its seasonal rainfall patterns (Bhaskaran et al., 1996), as they can effectively represent regional orography and sub-grid-scale physical processes. In recent studies, the Advanced Research Weather Research and Forecasting (ARW) regional model has been used to simulate the ISM and its rainfall climatology (Hari Prasad et al., 2011; Mukhopadhyay et al., 2010; Srinivas et al., 2014; Raju et al., 2013, 2014). Srinivas et al. (2012) made seasonal-scale simulations of the ISM regional climate with ARW for the decade 2000-2009 and reported that the model reproduces the interannual variations in monsoon characteristics in terms of the pressure, temperature, winds and rainfall associated with ENSO phases and that the simulated monsoon rainfall is sensitive to the convective parameterization. The objective of the present study is to assess the performance of the ARW model in simulating the time and distribution of the ISM rainfall during the onset phase (June) through the simulation of 10 continuous ISM seasons, which included deficit, normal and surplus ISM rainfall. Specifically, the timing of the onset and distribution of rainfall over seven specified rainfall zones of India are analyzed from daily and pentad rainfall distributions. The model is run in a controlled condition (i.e., unaffected by a driving GCM bias) with real initial and boundary conditions derived from the National Centers for Environmental Prediction (NCEP) global analysis. While the few studies that focused on the simulation of the onset phase of ISM were limited to 1 or 2 years (contrasting seasons), our study is designed over a period of 10 continuous ISM seasons to study whether downscaling the predictions yields advantages over just using large-scale predictions in surplus, normal and deficit ISM seasons. In the next three sections we provide the details of the model, numerical experiments conducted and the methods of analysis of simulation outputs. Results of the comparison of onset-phase seasonal rainfall over various zones for the 10-year composite 2000-2009 as well as contrasting monsoons are discussed in Sect. 3.2, spatial pentad rainfall distributions are discussed in Sect. 3.3 and the onset characteristics for good and bad monsoon years are discussed in Sect. 3.4. Section 4 provides the summary and main conclusions of the study.

Numerical experiments
The ARW model version 3.5, developed by NCAR, USA, with a Eulerian mass dynamical core, is adapted as a regional atmospheric model and used in the present study for monsoon simulations. ARW comprises primitive equations, non-hydrostatic dynamics and terrain-following mass vertical coordinates. The model can be adapted to different geographical regions of interest with nested domains and multiple physics parameterization schemes (Skamarock et al., 2008). The seasonal-scale monsoon simulations in this study are performed following the methodology given by Bhaskaran et al. (1996) for regional model integration. In this downscaling approach, the large-scale dynamics are dependent on the synoptic-scale boundary conditions provided from either analysis or global model forecasts, while the regional monsoon features, such as the low pressure trough and convective systems, are simulated by the regional model at high resolution. For this study the ARW model is configured with a single domain with a horizontal resolution of 30 km, 28 vertical levels and the model top at 10 hPa. The simulation domain covers the Indian monsoon region from 45° E to 109° E in the west-east direction and from 8° S to 40° N in the north-south direction (Fig. 1b). The selected physics options are the WRF single-moment 3-class (WSM3) explicit microphysics (Hong et al., 2004), the Dudhia scheme (Dudhia, 1989) for shortwave radiation processes, the RRTM (Rapid Radiative Transfer Model) scheme for long-wave radiation processes (Mlawer et al., 1997), the Yonsei University scheme for boundary layer turbulence (Noh et al., 2003; Hong et al., 2006) and the Noah scheme (Chen and Dudhia, 2001) for land surface processes. The Betts-Miller-Janjic (BMJ) scheme (Betts and Miller, 1986; Janjic, 2000) is used for cumulus convection, following the recent sensitivity studies with ARW for ISM rainfall (Hari Prasad et al., 2011; Mukhopadhyay et al., 2010; Srinivas et al., 2012).
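As an illustration only, the configuration described above could be collected in a small Python structure and rendered as WRF namelist text. The authors' actual namelist is not given in the paper; the option numbers below are the values commonly documented for these schemes in WRF and should be checked against the WRF version 3.5 user guide. The helper `render_namelist` is a hypothetical name, and grid dimensions and other required entries are deliberately omitted because the text does not state them.

```python
# Sketch of the stated model setup as WRF namelist groups.
# Assumption: scheme-to-number mapping follows the standard WRF documentation.
WRF_SETTINGS = {
    "time_control": {
        "interval_seconds": 21600,   # 6 h boundary updates from NNRP
    },
    "domains": {
        "max_dom": 1,                # single domain
        "dx": 30000,                 # 30 km grid spacing (m)
        "dy": 30000,
        "e_vert": 28,                # 28 vertical levels
        "p_top_requested": 1000,     # model top at 10 hPa (Pa)
    },
    "physics": {
        "mp_physics": 3,             # WSM3 microphysics
        "ra_lw_physics": 1,          # RRTM long-wave radiation
        "ra_sw_physics": 1,          # Dudhia shortwave radiation
        "bl_pbl_physics": 1,         # Yonsei University boundary layer
        "sf_surface_physics": 2,     # Noah land surface model
        "cu_physics": 2,             # Betts-Miller-Janjic convection
        "sst_update": 1,             # time-varying SST
    },
}


def render_namelist(settings):
    """Render nested dict groups as Fortran-namelist-style text."""
    lines = []
    for group, options in settings.items():
        lines.append(f"&{group}")
        for key, value in options.items():
            lines.append(f" {key} = {value},")
        lines.append("/")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_namelist(WRF_SETTINGS))
```

The rendered text is a fragment, not a complete `namelist.input`; a real run would also need start and end dates, grid dimensions and the auxiliary input settings that go with SST updating.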
The initial atmospheric fields and time-varying boundary conditions are derived from the NCEP/NCAR Reanalysis Project (NNRP) global reanalysis fields (Kalnay et al., 1996), available at 2.5° latitude/longitude resolution and at a 6 h interval. The observations assimilated in this analysis include global surface and upper air radiosonde data; surface marine observations (ships, buoys); aircraft data; surface land synoptic data; satellite sounder data, such as from the Television Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) and the Special Sensor Microwave/Imager (SSM/I); surface wind speeds and satellite-measured cloud motion vectors. The sea surface temperature (SST) is updated at a 6 h interval from the NCEP fields. The model topography (elevation, land cover and soil information) is defined from a 5 arc min (∼ 9 km) resolution USGS global data set. The model is integrated continuously for 3 months, i.e., starting at 00:00 UTC 1 May and simulating up to 00:00 UTC 31 July of each year for the 10-year period (May-July 2000 to May-July 2009), with periodic updating of boundary conditions from global model analysis. In this method the influence of the planetary-scale forcing is supplied via the global model analysis so that the time-averaged Walker circulation and its response to tropical Pacific SST anomalies are represented in the regional model (Bhaskaran et al., 1996; WCRP, 1992), whereas the regional Tropical Convergence Zone (TCZ) is generated internally within the model simulation. The model simulates the features of organized convection over the Indian subcontinent during the monsoon season through physics and dynamics (surface evaporation and large-scale dynamical convergence in the monsoon trough region). The first 11-day period (1-11 May) of each simulation is considered as spin-up time for the model to adjust to the model topography and other surface processes. In seasonal-scale forecasting using regional models where the boundary conditions are updated from global model forecasts, the predictions would be dependent to some extent on the driving GCM bias. As the objective of the present study is confined to assessing the performance of the regional model in simulating the characteristics of the onset-phase rainfall, in terms of magnitude and temporal and spatial variations, the use of real boundary conditions eliminates the bias from the driving global model forcing. Simulated rainfall from 12 May onwards is considered as necessary and sufficient to identify the onset dates. The gridded rainfall data available at 0.25° resolution over the Indian region from the India Meteorological Department (IMD) (Pai et al., 2013) are used for validation of the model-derived rainfall for different climate zones.

Analysis
At first the onset times simulated by the ARW are compared with those obtained from NNRP precipitation to examine whether the model reproduces, degrades or improves upon the onset timing derived from the NNRP data that were used to drive the limited-area model. The actual onset dates for each zone are derived from the IMD gridded rainfall data. Analysis of model-derived rainfall and corresponding NNRP precipitation and IMD gridded daily rainfall data has been performed to derive 24 h daily rainfall (12 May to 15 July for daily rainfall) and pentad rainfall (pentads centered from 16 May to 15 July) at each of the grid points over the Indian subcontinent for the 10 years (2000-2009) under study. These 10-year composites are analyzed to assess the prediction of the onset phase of ISM in terms of the mean and variability of the dates of onset and error metrics for the spatial distribution. As a complement to this assessment, the rainfall during contrasting onset phases has been analyzed. The 3 years with above-normal June rainfall (June positive; hereafter +ve), i.e., 2000, 2003 and 2007, and the 3 years with below-normal June rainfall (June negative; hereafter −ve), i.e., 2004, 2006 and 2009, have been analyzed to assess the model performance during contrasting onset phases (http://www.imd.gov.in/section/nhac/dynamic/Monsoon_frame.htm). The rainfall composites are analyzed to examine whether the simulations could capture the interannual variations in the onset phase of ISM. Further, the results for 2003 and 2009 are analyzed separately as representative of normal and deficit monsoon seasons, respectively. The seasonal rainfall of the ISM in 2003 was 102 % of normal rainfall and was well distributed both in space and timing (IMD, 2004). In the year 2009, the Indian subcontinent experienced a severe monsoon drought, with the country receiving just 77 % of its normal rainfall (Preethi et al., 2011). To assess the monsoon onset rainfall in time and space, a set of seven zones of relatively homogeneous monsoon rainfall with different climatic types (Das, 1968; Rao, 1976; Mandal et al., 1999; Thornthwaite and Mather, 1955) falling in different geographic regions are considered (Fig. 1a). A description of these zones is given in our earlier work (Srinivas et al., 2012). Zone 1 is located in dry-semiarid northern India, zone 2 in the arid northwest, zone 3 in dry-subhumid central India, zone 4 in the moist-subhumid central northeast, zone 5 in the perhumid northeast, zone 6 along the perhumid west coast and zone 7 in the semi-arid southeast peninsula of India. The climatological mean monsoonal rainfall is 700 mm in zone 1 (Thornthwaite moisture index I_m between −33.4 and −83.3), 850 mm in zone 2 (I_m < −66.7), 1025 mm in zone 3 (I_m in the range −33.3 to 0), 1000 mm in zone 4 (0 < I_m < 20), 1450 mm in zone 5 (20 < I_m < 100), 3000 mm in zone 6 (I_m > 100) and 850 mm in zone 7 (−49.9 < I_m < −33.4) (Srinivas et al., 2012). As the onset phase is characterized by a sudden increase in rainfall (Rao, 1976), pentad (5-day running total) rainfall is used to examine the advancement of the onset phase and to analyze the onset time in different zones. In this study, the IMD criterion adopted after Joseph et al. (2006) is used to derive the onset dates for each zone, with the slight modification of using pentad rainfall in place of daily rainfall. As per the IMD criterion, if pentad rainfall in a zone continuously increases after 12 May and reaches 1.5 cm or more, the onset is identified as the second day of that pentad. A close examination of the daily and pentad rainfall time series revealed that pentad rainfall is more appropriate for assessing the onset time of ISM, as it smooths out the short-period fluctuations that make the assessment difficult from the daily time series. The time series of pentad rainfall are obtained from the area average rainfall computed at each time point for each zone. The model validation includes comparisons of model fields of winds, temperature and rainfall with analysis fields for the 10-year (2000-2009), June +ve years (2000, 2003 and 2007) and June −ve years (2004, 2006 and 2009) composites, and time series of daily and running pentad rainfall for the onset phase from 16 May to 15 July for different zones and for the entire monsoon region. In addition to the above composites, two contrasting monsoon seasons, i.e., 2003 (normal monsoon) and 2009 (deficit monsoon), are considered separately to statistically assess the simulation of the onset-phase rainfall characteristics in the normal and deficit monsoons.
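The two processing steps described above, area averaging within a zone and forming running 5-day totals, can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the zone is represented by a boolean mask on the grid, and the average is unweighted (no latitude weighting), which is an assumption the paper does not confirm.

```python
def zone_mean(field, mask):
    """Unweighted mean of the grid values that fall inside a zone.

    field: 2-D list of rainfall values for one day (mm)
    mask:  2-D list of booleans, True where the grid point is in the zone
    """
    values = [
        value
        for field_row, mask_row in zip(field, mask)
        for value, inside in zip(field_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("mask selects no grid points")
    return sum(values) / len(values)


def running_pentad_cm(daily_mm):
    """Running 5-day rainfall totals in cm.

    Element k is the total over days k..k+4 of the daily series (mm),
    so a series starting on 12 May gives pentads 12-16 May, 13-17 May, ...
    """
    return [sum(daily_mm[k:k + 5]) / 10.0 for k in range(len(daily_mm) - 4)]
```

For a zone, `zone_mean` would be applied to each daily field to build the daily series, which is then passed to `running_pentad_cm`.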
Statistical error metrics proposed by Murphy and Winkler (1987), namely the Pearson correlation coefficient (COR), bias, mean absolute error (MAE) and root mean square error (RMSE), are used for quantitative comparison of pentad rainfall derived from simulations and observations. In their standard form these are given by

$$\mathrm{COR} = \frac{\sum_{i=1}^{n} (f_i - \bar{f})(o_i - \bar{o})}{\sqrt{\sum_{i=1}^{n} (f_i - \bar{f})^2 \, \sum_{i=1}^{n} (o_i - \bar{o})^2}}$$

$$\mathrm{Bias} = \frac{1}{n} \sum_{i=1}^{n} (f_i - o_i)$$

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| f_i - o_i \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (f_i - o_i)^2}$$

where o_i is observed pentad rainfall, f_i is simulated pentad rainfall, the overbar represents the spatial and temporal average over all the data and n is the sample size. The above statistics are computed for each zone considered in the analysis for the 10-year (2000-2009) composite, the June +ve years (2000, 2003 and 2007) and June −ve years (2004, 2006 and 2009) composites, the normal monsoon year (2003) and the deficit monsoon year (2009).
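A minimal sketch of these four statistics in plain Python is given below, using the textbook definitions with bias taken as simulated minus observed, so that a positive value means a wet bias. The function name `error_metrics` is hypothetical and the code is not taken from the study.

```python
import math


def error_metrics(obs, sim):
    """Return COR, bias, MAE and RMSE for paired observed/simulated values."""
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("need two series of equal length with n >= 2")

    o_bar = sum(obs) / n
    f_bar = sum(sim) / n
    diffs = [f - o for o, f in zip(obs, sim)]

    bias = sum(diffs) / n                           # mean error (sim - obs)
    mae = sum(abs(d) for d in diffs) / n            # mean absolute error
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    cov = sum((f - f_bar) * (o - o_bar) for o, f in zip(obs, sim))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_f = sum((f - f_bar) ** 2 for f in sim)
    if var_o == 0 or var_f == 0:
        raise ValueError("correlation undefined for a constant series")
    cor = cov / math.sqrt(var_o * var_f)

    return {"COR": cor, "bias": bias, "MAE": mae, "RMSE": rmse}
```

Applied to a zone, `obs` would hold the IMD (or NNRP) pentad rainfall series and `sim` the ARW series over the same pentads.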

Results and discussion
The ARW simulations are compared with the NCEP reanalysis precipitation data to test whether downscaling provides any advantages over the reanalysis with regard to the timing of the onset. Though precipitation is not used as ARW input, it is a part of the NNRP data set that is used to drive the regional model. The relative performance of the model for onset timing with respect to the NNRP reanalysis is assessed using the actual onset dates over different climatic zones, derived from the IMD gridded rainfall, as reference.

Analysis of the 10-year composite
The onset of the southwest monsoon normally occurs on 1 June in Kerala and gradually progresses from southern to northern India, covering the entire country by 15 July (Fig. 1a). Hence, to assess the characteristics of the onset phase, observed and model-simulated rainfall for each year of the 10-year period 2000-2009 are analyzed for the period 16 May to 15 July. The fluctuation of the monsoon onset date about the mean and extremes in onset timing are analyzed. All the error metrics are computed considering the time series starting from one pentad earlier than the onset pentad and ending with the last pentad, i.e., the pentad for the period 11-15 July. The time series of the simulated 10-year (2000-2009) mean daily rainfall over the specified seven zones, along with corresponding NNRP and IMD observed mean daily rainfall, are considered for analysis and validation (Table 1). A sudden increase of rainfall associated with the onset is simulated in most of the zones, similar to the NNRP and IMD rainfall. During the onset phase, rainfall is slightly overestimated in zones 3 and 6, considerably overestimated in zone 7, in close agreement in zone 1 and slightly underestimated in zones 2, 4 and 5, as compared to both NNRP precipitation and IMD observations. Compared with IMD data for the entire sample of 10 years between 2000 and 2009, the simulated daily rainfall during the onset phase has a small wet bias (0.48 to 2.63 mm day⁻¹) in zones 3 and 6 and a small dry bias (−1.04 to −1.68 mm day⁻¹) elsewhere, except for zone 5 in northeast India, which has a relatively large dry bias (−3.47 mm day⁻¹), and zone 7 in southeast India, which has a larger wet bias (7.2 mm day⁻¹) (Table 1). While most zones have moderate rainfall errors (RMSE < 3.0 mm day⁻¹), zones 5, 6 and 7 have relatively large rainfall errors (RMSE ∼ 5 mm day⁻¹), indicating model deficiency in reproducing the onset rainfall in these zones. Except for zone 7, the mean and standard deviation (SD) of the simulated rainfall for most zones are in good agreement with the corresponding values.
It is important to assess the onset dates over different zones to understand the progress of the monsoon over different homogeneous rainfall regions. The time series of the running pentad rainfall, i.e., cumulative rainfall in the preceding 5 days (Ananthakrishnan and Soman, 1988; Fasullo and Webster, 2003; Joseph et al., 2006; Xavier et al., 2007; Wang et al., 2009), is computed for 12-16 May, 13-17 May ... and so on, with the last pentad being 11-15 July. Pentad rainfall is calculated as the area average rainfall within each zone from all three of the data sets (IMD, NNRP, ARW precipitation) (Fig. 2). During the pre-onset phase, some amount of rainfall is simulated and the actual onset is marked by a sudden increase in rainfall after the pre-onset phase. A threshold pentad rainfall of 1.5 cm is chosen to identify the onset time after ignoring the pre-onset rainfall in each zone used in the analysis. Dates of normal onset, mean onset and variation (SD) of onset for the 10-year period (2000-2009) from the simulation, NNRP precipitation and IMD observations in different climate zones are presented in Table 2. The mean onset dates from the simulation agree with those obtained from IMD rainfall as well as NNRP precipitation, with a difference of 1 to 3 days. The difference between the long-term mean onset date and the 10-year mean from all three of the data sets (IMD, NNRP, ARW simulation) is less than 6 days over different zones. The variation in simulated onset dates closely follows that obtained from the NNRP precipitation field over different zones. The mean pentad rainfall for the 10-year period (2000 to 2009) is shown in Fig. 2 over different zones, along with the corresponding pentad rainfall computed from IMD gridded rainfall data as well as NNRP reanalysis. The simulation gives the same onset dates in zones 2 and 6, a 1 to 3-day early onset in zones 1 and 7 and a 1 to 3-day delay of onset in zones 3-5, relative to the onset dates obtained from NNRP data. Similar variation in onset dates is also noticed relative to the IMD-based onset dates; however, the simulated onset dates are closer to those based on NNRP precipitation.
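The threshold-based identification of onset described above can be sketched as follows. This is one possible reading of the criterion, not the authors' implementation: the 1.5 cm threshold and the "second day of the pentad" rule come from the text, while the requirement that the pentad total stays at or above the threshold for `sustain` consecutive running pentads is a hypothetical stand-in for "ignoring the pre-onset rainfall".

```python
from datetime import date, timedelta

THRESHOLD_CM = 1.5  # pentad rainfall threshold from the text
SUSTAIN = 5         # hypothetical: pentads that must stay above threshold


def onset_date(pentad_cm, first_pentad_start,
               threshold=THRESHOLD_CM, sustain=SUSTAIN):
    """Return the onset date, or None if no sustained crossing is found.

    pentad_cm:          running 5-day totals (cm); element k covers the
                        5 days starting at first_pentad_start + k days
    first_pentad_start: date of the first day of the first pentad
    """
    for k in range(1, len(pentad_cm) - sustain + 1):
        # Upward crossing: an early wet spell that starts above the
        # threshold and then decays is not counted as onset.
        crossed = pentad_cm[k - 1] < threshold <= pentad_cm[k]
        held = all(v >= threshold for v in pentad_cm[k:k + sustain])
        if crossed and held:
            # Onset is the second day of the crossing pentad.
            return first_pentad_start + timedelta(days=k + 1)
    return None
```

With a series that begins above 1.5 cm, decays and later rises again, the function skips the initial wet spell and returns a date within the later sustained rise, which mirrors the pre-onset behaviour the text describes for some zones. The choice of `sustain` affects the result and would need tuning against the IMD-based dates.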
The above results of the comparison of model performance with IMD and NNRP rainfall data sets show that the ARW simulation improves the onset times in zones 4 and 7, while giving similar onset timing to NNRP over the rest of India. The 1 to 2-day early onset over south, north and central India, and the 1 to 3-day delay of onset over northeast India, indicate rapid progress of the Arabian Sea stream and gradual progress of the Bay of Bengal stream of ISM in the simulation. For zone 4, model pentad rainfall is above 1.5 cm over a few days from 16 May (Fig. 2); it then decreases till 4 June as in NNRP and IMD data, indicating pre-onset rainfall. Subsequently, pentad rainfall in this zone increases from 6 June and exceeds 1.5 cm around 11 June, indicating the actual onset to be around 11 June, lagging by 2 days relative to both IMD and NNRP. In zone 5, situated in a perhumid climate, the pentad rainfall is underestimated (∼ 3 cm) from 16 May till 3 June and increases continuously after 4 June, indicating onset around 5 June and a delay of 1 and 2 days relative to NNRP and IMD data, respectively. In zone 6, the pentad rainfall from all three data sources, i.e., IMD, NNRP and model results, indicates the end of the pre-onset phase by 30 May and onset around 1 June. The onset in zone 7 is around 7 June, indicating an early onset of 2 to 3 days in this zone. Overall, the mean onset dates simulated by ARW in most zones are in good agreement with the onset time found from NNRP analysis and IMD rainfall data, except in zones 3, 5 and 7, where a difference of ∼ 2 days is found. The pentad rainfall is underestimated in zone 5. Further, the variability (SD) of onset dates ranges from 5.7 to 7.7 days for NNRP and 5.1 to 8.7 days for IMD rainfall, whereas the SD of onset dates varies from 4.5 to 8.5 days for the simulation. The model-simulated variability is slightly larger for zones 1 and 2. The maximum variability found in onset timing is within 8 days in most zones and closely matches that obtained from NNRP. The mean onset dates and their variability indicate a reasonable simulation of the onset phase of the monsoon.
The correlation and root mean square error (RMSE) computed between the time series of pentad rainfall from the simulation and those based on the NNRP and IMD rainfall data sets (for lead times of about 61 days or a season) are presented in Table 3 for different climate zones for the 10-year period (2000-2009). The correlations are significant at the 95 % confidence level. Good correlations (> 0.78 with IMD; > 0.62 with NNRP) are found for pentad rain over all zones except for zone 5, which has a poor correlation of 0.38 with NNRP data. Zone 5 has a wide extent of low orographic heights (∼ 1400 m) compared to zone 6, where the Western Ghats have a larger north-south (∼ 1600 km) and shorter east-west (∼ 100 km) extent and higher peaks (∼ 2700 m). Zone 5 has a highly undulating topography, with the Brahmaputra Valley surrounded by relatively tall mountain ranges. The convection associated with extreme rainfall events over northeast India has been shown to be associated with multiscale (large-scale and mesoscale) circulation interaction with local topography (Goswami et al., 2010). Correlations are improved with NNRP data over most zones except zone 5. The significantly low correlation for onset-phase rainfall over the undulating terrain of zone 5 with NNRP data is attributed to the model's inability to capture the mesoscale convective organization at 30 km resolution. Although zone 6 also has complex topography, the model could simulate the topographic effects on monsoon rainfall, leading to good correlations for rainfall. Next to zone 5, zone 7 has the lowest correlation (0.62 with NNRP; 0.75 with IMD). Zone 7 falls in the rain shadow area of the Western Ghats and records low rainfall during the summer monsoon (Rao, 1971). However, model pentad rainfall has a large wet bias in this zone. The RMSE in pentad rainfall is below 1.7 cm in most zones except zones 5 and 7, which have RMSE of more than 1.5 cm with respect to IMD pentad rainfall. The RMSE of simulated pentad rainfall is reduced by ∼ 40 % with respect to NNRP data. Similar results are found in the mean and SD of pentad rainfall derived from the three sources of data. The mean and SD of simulated pentad rainfall in most zones for the 10-year period as well as June +ve and June −ve years are closer to those of NNRP pentad rainfall (Table 3). In general, the mean and variation (SD) of pentad rainfall during above-normal onset (June +ve) years (Table 3) over different zones agree with those from observations, although the rainfall is underestimated (by ∼ 15-20 %) in zone 5. In below-normal onset (June −ve) years, the model pentad rainfall has slightly smaller correlations, which reflect the small rainfall amounts during the onset phase together with a slight shift in timing. During the above-normal onset phase, all zones have moderate RMSE and strong correlations in pentad rainfall. Further, except for zones 5 and 7, the mean pentad rainfall errors are reasonable (∼ 2 cm) in both the normal year, 2003, and the deficit year, 2009 (Table 4), indicating reasonably good simulation of onset-phase rainfall in both years. Zone 5 is characterized by significant orography and heavy rainfall during the onset phase of the monsoon. In both 2003 and 2009, except for zone 7, the ARW mean pentad rain is closer to that of NNRP data and the mean errors (MAE) in pentad rainfall are reduced when compared with NNRP, relative to IMD observations. Error statistics of zone-wise pentad rainfall indicate that the model simulates the onset rainfall better in zones 1, 2, 3, 4 and 6 than in the high-rainfall hilly region (zone 5) and the semi-arid rain shadow in zone 7. The reason for the slight underestimation of rainfall in zone 5 seems to be the inability of the model to resolve the topographic effects and associated short-time mesoscale dynamics properly at the considered 30 km resolution. The above results show that in a forecast using ARW in the present setup, the onset time is maintained as in NNRP in most zones and improved over NNRP in zones 4 and 7. The onset-phase rainfall analysis presented above shows that the limited-area ARW model improves on the reanalysis in zones 1, 2, 3, 4 and 6; however, it slightly underestimates rainfall in the complex hilly region (zone 5) and overestimates the rainfall over the semi-arid zone 7. Compared with IMD data, though the onset times are not degraded in any of the zones, the errors introduced in onset-phase rainfall amount in zones 5 and 7 over NNRP precipitation data suggest the need for improvements in the model setup, such as an increase in resolution to resolve the topographic effects.
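The mean onset date and its variability (SD) over a set of years, as reported for each zone, can be computed along the following lines. This is a hedged sketch with hypothetical function names: dates are converted to days elapsed since 1 May of their own year so that leap years do not shift the result, and the sample standard deviation is used, although the paper does not say whether the sample or population form was applied.

```python
from datetime import date, timedelta
from statistics import mean, stdev


def onset_statistics(onsets):
    """Return (mean offset, SD) in days since 1 May for a list of dates."""
    offsets = [(d - date(d.year, 5, 1)).days for d in onsets]
    return mean(offsets), stdev(offsets)


def offset_to_day_month(offset, year=2001):
    """Convert a mean offset back to (day, month); year is only a carrier."""
    d = date(year, 5, 1) + timedelta(days=round(offset))
    return d.day, d.month
```

For example, passing the ten yearly onset dates of one zone to `onset_statistics` gives the mean offset, which `offset_to_day_month` turns into a calendar day, and the SD in days.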

Simulation during contrasting weak monsoons
For assessing the model performance [...] (Flatau et al., 2003; Joseph et al., 2006) showed that it was a false monsoon onset associated with propagating tropical intra-seasonal disturbances/cyclones unrelated to the monsoon onset. In contrast, the monsoon in 2009 was weak, characterized by drought across all of India, and its onset occurred earlier by 1 week (onset on 23 May) (Puranik et al., 2013). The onset rainfall characteristics are presented for these two typical years (Figs. 3 and 4) for all the zones to show model performance in simulating the onset timing and progression.
In the year 2002, model pentad rain is slightly overestimated in zones 1, 6 and 7 (Fig. 3). The tendencies of false onset can be noted in zone 4 in northern India and zones 6 and 7 in southern India, where pentad rainfall above 1.5 cm is noted in the first few pentads. This pre-onset rainfall can be attributed to the convection associated with the quadruplet cyclones observed in the Indian Ocean after 9 May 2002 and the subsequent depression over the Bay of Bengal (Flatau et al., 2003). Ignoring this rainfall, the time series of pentad rainfall in 2002 indicates an abrupt and sustained increase in pentad rainfall from about 1.5 cm at around 18 June in zone 1, 15 June in zone 2, 22 June in zone 3, 12 June in zone 4, 9 June in zone 5, 29 May in zone 6 and 6 June in zone 7, indicating onset of the monsoon in different zones. The pentad rainfall from NNRP analysis during the onset phase is less than the corresponding values from IMD rainfall in zones 1-3 and 6. Comparison of onset dates obtained from the ARW simulation, IMD and NNRP precipitation data sets shows that the simulation, while performing similarly to the analysis in most zones, improves the onset time by 5 to 14 days in zones 1-3, in which the NNRP shows a delayed onset compared to the actual IMD onset times.
In the year 2009, though the onset occurred on 23 May over the southernmost parts (Kerala), its advancement was delayed by 1-2 weeks over the central and northern parts of the country (IMD, 2010). As per the northern limit of the monsoon (NLM) reported by IMD (http://www.imd.gov.in/section/nhac/dynamic/Monsoon_frame.htm), the monsoon covered the entire country slightly early in 2009. Subsequent to the onset over Kerala, the monsoon advanced over the northeastern states including West Bengal and Sikkim earlier than the norm. The monsoon advanced over the west coast and covered up to ~17° N by 7 June. A prolonged hiatus in the further advancement occurred during 8-20 June due to the weak cross-equatorial flow. This resulted in a delay of about 2 weeks over Maharashtra, Madhya Pradesh, Uttar Pradesh, Bihar and Orissa, with actual onset dates between 26 and 29 June. Thereafter, onset in the west, northwest and northern parts of the country took place rapidly by 30 June. In the year 2009, the pentad rainfall during the onset phase (Fig. 4) is underestimated in zones 1, 2, 5 and 7. From the pentad rainfall the onset time is indicated as 3 July in zone 1, 30 June in zone 2, 27 June in zone 3, 2 July in zone 4, 4 June in zone 5, 29 May in zone 6 and 19 June in zone 7. The ARW simulation for 2009 shows improvement of onset time by 3-6 days in zones 3 and 7 over those indicated by NNRP. Overall, the identified onset dates from ARW simulations are in good agreement with those from IMD rainfall data for all the zones, except that the pentad rainfall is slightly overestimated in different zones. Thus the model could predict the delay in the onset and normal advancement of the monsoon in 2002 and an early onset and delayed advancement of the monsoon in 2009, as in observations. In both these cases, a maximum difference of about 5 days in the mean onset date is noted between the simulation and IMD onset timing, which shows good skill.
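The onset identification described above (an abrupt and sustained rise of pentad rainfall from about 1.5 cm, ignoring short-lived pre-onset spikes) can be illustrated with a minimal sketch. The 1.5 cm threshold is taken from the text; the function name `find_onset`, the persistence length `sustain` and the rainfall values are hypothetical choices made only for illustration and are not the procedure or data used in the study.

```python
from datetime import date, timedelta


def find_onset(pentad_rain_cm, first_start, threshold_cm=1.5, sustain=5):
    """Return the start date of the first running pentad from which rainfall
    stays at or above threshold_cm for `sustain` consecutive running pentads.

    pentad_rain_cm : running pentad rainfall (cm), one value per start day
    first_start    : date on which the first pentad begins
    sustain        : assumed persistence requirement (not from the paper)
    """
    n = len(pentad_rain_cm)
    for i in range(n - sustain + 1):
        window = pentad_rain_cm[i:i + sustain]
        if all(r >= threshold_cm for r in window):
            return first_start + timedelta(days=i)
    return None  # no sustained exceedance found


# Synthetic series: a brief pre-onset spike followed by a sustained rise.
rain = [0.2, 0.4, 1.8, 1.9, 0.6, 0.3, 0.5, 1.6, 2.1, 2.4, 2.8, 3.0, 3.1]
onset = find_onset(rain, date(2002, 5, 12))
print(onset)
```

Because the spike in the third and fourth pentads does not persist, it is skipped, which mimics how a false onset would be discounted under this assumed persistence rule.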

Spatial pentad rainfall distribution
The spatial distribution of model-simulated rainfall for different pentads from 1-5 June to 26-30 June is compared with corresponding IMD gridded rainfall data and NNRP precipitation over the Indian subcontinent. The comparison is limited to the Indian land region alone to conform to the IMD gridded rainfall data. Although this analysis produces 26 sequential pentads, only six pentads at a 5-day interval (1-5 June, 6-10 June, 11-15 June, 16-20 June, 21-25 June and 26-30 June) are presented (Fig. 5). The pentad 1-5 June shows meager rainfall (< 5 cm) distributed in a few areas along the west coast and in eastern and northeastern parts of India, in agreement with observations, and slightly improved in the northeast and on the west coast compared to NNRP data. Simulated rainfall for the pentad 6-10 June indicates an increase of rainfall (5-10 cm) over the Kerala-Karnataka coasts in the Western Ghats and an increase of rainfall (5-10, 10-20 cm) over northeast India, indicating onset over those regions. The simulation shows improvement of pentad rainfall in both area and quantity over NNRP data, which is evident from a wider eastward distribution of pentad rainfall over the Western Ghats, central and eastern India in both simulation and IMD observations. The simulation has missed a few isolated peaks of high pentad rainfall (10-20 cm) over the west coast and the northeast zone, which is due to underestimation of orographic rainfall in these zones (Srinivas et al., 2012). The simulated rainfall for the period 11-15 June shows a spread of rainfall (2-5 cm) into Maharashtra and Chhattisgarh states in the central peninsula and Orissa, West Bengal and Bihar states in eastern India, as in observations. While NNRP under-represented the high pentad rainfall over the northeast, east and west coast areas, the ARW captured these regional features, showing improvement upon NNRP data, which may be due to the high resolution (30 km) employed. However, ARW simulated some spurious rainfall over southeast India, which falls in the rain shadow area of the Western Ghats during the southwest monsoon.
For the period 16-20 June, simulated rainfall in the range 2-5 cm shows an increase of coverage over central, southeast and northern parts of India in agreement with both IMD and NNRP rainfall data, although Gujarat in the northwest zone (zone 2) is not covered at this time. For the period 21-25 June, the model showed further southward and northwestward extension of rainfall in the range 2-5 cm, indicating monsoon progression in these areas as in observations. The simulated rainfall for the period 26-30 June indicates further progress of the monsoon in northern India as in IMD and NNRP data, but the model failed to simulate the onset rainfall in parts of Jammu and Kashmir and Himachal Pradesh states in northern India (zone 1). Further progression of the monsoon in the northwest and northern India occurred after 1 June (not shown). The gradual increase in the spatial extent of simulated rainfall from 1 June onwards, as in the observed rainfall distribution, clearly shows that the model could reproduce the timing of the onset phase geographically. The representation of pentad rainfall is improved in the ARW simulation in quantity and area coverage over NNRP. The spatial pentads indicate slight underestimation of onset rainfall on the west coast as well as in northeastern parts of the country, which could be due to inadequate representation of orography at the moderate resolution of 30 km. Thus the pentad rainfall results indicate that the model could simulate the progress of the monsoon during the onset phase on a seasonal time scale during 2000-2009 in most homogeneous rainfall zones. Though the onset timing in zones 5 and 6 could be reasonably well simulated, the amount of rainfall is slightly underestimated due to inadequate representation of orography at 30 km resolution. In these zones, a relatively high resolution would be required to resolve the smaller spatial-scale (i.e., mesoscale) convective circulations for improvement of rainfall simulations.
Hence the monsoons of 2003 and 2009 are considered to distinguish the onset-phase rainfall characteristics in contrasting monsoons from the simulation. The onset of the monsoon is associated with a sudden advancement in the Somali Jet/low-level jet (strong low-level winds) over the Arabian Sea and a rapid increase in the rainfall over the Indian subcontinent. These are associated with a simultaneous deepening of the monsoon trough below the 850 hPa level across north/central India and the Bay of Bengal and development of the Tibetan anticyclone in the upper troposphere (Bollasina et al., 2002; Rao, 1976).

Summary and conclusions
In this work, the ARW regional model with 30 km resolution is used to make onset-phase rainfall simulations for the decade 2000-2009. A seasonal-scale (May-July) integration starting on 1 May for each year is performed using the NCEP 2.5° reanalysis fields for initial/boundary conditions. The model-simulated pentad rainfall during the onset phase, between 16 May and 15 July over seven representative homogeneous rainfall zones, is compared with the corresponding data from IMD 25 km gridded rainfall as well as the NNRP precipitation. The correlations and error metrics for the time series of running pentad rainfall from different rainfall zones show that the progression of rainfall during the onset phase of the monsoon could be simulated reasonably well in quantity and distribution. The model could simulate the onset-phase rainfall in most zones except the humid/perhumid high rainfall zone in the northeast. In other zones the timing of onset is simulated within a difference of about 8 days. Comparison of model performance with IMD and NNRP rainfall data sets showed that the ARW simulation improves the onset times in zones 4 and 7 over those obtained from NNRP while giving similar onset timing over the rest of India as in NNRP.
The onset-phase rainfall is reasonably simulated using the Betts-Mellor-Janjic scheme. The reduction in bias and RMSE, the improvement in correlations, and the closer mean and SD of simulated rainfall relative to NNRP rainfall, evaluated against IMD observations, indicated that the limited-area model reproduced the onset-phase precipitation in most zones as in the driving data set, NNRP. From the onset-phase rainfall analysis it has been found that the limited-area ARW model improved upon the reanalysis in zones 1, 2, 3, 4 and 6; however, it slightly underestimated rainfall in the complex hilly region (zone 5) and overestimated the rainfall over the semi-arid rain shadow area (zone 7). The variations in the onset-phase rainfall between above-normal onset (June +ve) and below-normal onset (June −ve) over different rainfall zones could be simulated in good agreement with IMD rainfall data. Further, the model could bring out the variations in the onset-phase rainfall characteristics of normal (2003) and deficit (2009) years, in terms of the variation in rainfall distribution in different zones, its progression as well as the features of the strength of the Somali Jet, OLR magnitudes and atmospheric heating in the northern latitudes. The model simulated a false onset in some years (like 2002), which is due to the tropical disturbances (cyclone/depression) and the associated convection independent of actual monsoon phenomena. This requires a careful analysis in separating the relative effects and inferring the onset from model pentad rainfall. Similarly, in some years the model rainfall is slightly higher during the onset phase (viz., 2009), indicating an onset earlier than observed. Overall, a variation of 3 to 8 days is found in the simulated onset timing from the actual observed onset in different zones for the 10-year period. The variability in onset time is relatively large (~4.5 to 8 days) in zones 1, 2, 3, 4 and 7, located in north, northwest, central and southeast India, where the onset characteristics are influenced by the progression of the monsoon after initial onset over Kerala. Though the simulation of onset times is not degraded in any of the zones, the large bias in onset-phase rainfall simulation in zones 5 and 7 over NNRP precipitation suggests the need for improvements in the model setup, such as an increase in resolution to resolve the topographic effects. Apart from the above, the other parameters that may influence the monsoon onset simulations are surface boundary conditions like topography and sea surface temperature, which play an important role in the generation of instability and moist convection. The model may also require further fine-tuning with a higher resolution of up to 15 km to better resolve the topography, land use, vegetation and soil physical processes that influence the land surface/boundary layer, forcing short-time-scale mesoscale convection and regional-scale rainfall to be triggered in the simulations. It is proposed that further simulations be made with the incorporation of SST data and with the adoption of a higher resolution to reduce the uncertainty in the simulations over the southeast and northwest zones, and to address the slight dry bias in onset-phase rainfall.

Figure 1. (a) Map showing the climatic zones used for rainfall analysis and (b) simulation domain used in the ARW model along with terrain elevation (in m).

Figure 2. Time series of running pentad rainfall (cm) averaged over different rainfall zones in the period 12 May-30 June for the 10-year (2000-2009) composite, along with corresponding pentad rainfall derived from NNRP reanalysis and IMD 50 km gridded rainfall data. The first pentad is 12-16 May, the second pentad is 13-17 May, ... and the last pentad is 26-30 June. Panels (a) to (g) correspond to zone 1 to zone 7.
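The running pentads described in this caption (12-16 May, 13-17 May, ..., 26-30 June) can be reproduced with a short sketch. It assumes that a pentad value is the 5-day total of daily zone-mean rainfall; whether the study totals or averages the five days is not stated here, and the function name `running_pentads` and the placeholder daily values are hypothetical.

```python
from datetime import date, timedelta


def running_pentads(daily_cm, first_day, window=5):
    """Return (start_date, end_date, total_cm) for each running window.

    daily_cm  : daily zone-mean rainfall (cm), one value per day
    first_day : calendar date of the first daily value
    window    : window length in days (5 for a pentad)
    """
    out = []
    for i in range(len(daily_cm) - window + 1):
        start = first_day + timedelta(days=i)
        end = start + timedelta(days=window - 1)
        out.append((start, end, sum(daily_cm[i:i + window])))
    return out


first_day = date(2002, 5, 12)
n_days = (date(2002, 6, 30) - first_day).days + 1  # 12 May to 30 June inclusive
daily = [0.1] * n_days                             # placeholder values, not real data
pentads = running_pentads(daily, first_day)
print(len(pentads), pentads[0][:2], pentads[-1][:2])
```

With a daily step, the 50-day period yields 46 overlapping pentads, the last of which spans 26-30 June, matching the caption.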

Figure 3. Time series of running pentad rainfall (cm) averaged over different rainfall zones in the period 12 May-30 June 2002, along with corresponding pentad rainfall derived from NNRP reanalysis and IMD 50 km gridded rainfall data. The first pentad is 12-16 May, the second pentad is 13-17 May, ... and the last pentad is 26-30 June. Panels (a) to (g) correspond to zone 1 to zone 7.

Figure 4. Time series of running pentad rainfall (cm) averaged over different rainfall zones in the period 12 May-30 June 2009, along with corresponding pentad rainfall derived from NNRP reanalysis and IMD 50 km gridded rainfall data. The first pentad is 12-16 May, the second pentad is 13-17 May, ... and the last pentad is 26-30 June. Panels (a) to (g) correspond to zone 1 to zone 7.
A comparison of the simulated June mean low-level winds at 850 hPa between 2003 and 2009, along with corresponding fields from NCEP Final Analysis (FNL) data available at 1° resolution (NCEP, 2000), is presented in Fig. 6. The simulated low-level circulation is slightly stronger in 2003 relative to 2009 as in FNL analysis, especially over the Arabian Sea, suggesting weaker winds in 2009. The height of the seasonal trough at the 850 hPa level is higher (1490 m) in 2003 than (1460 m) in 2009, indicating simulation of a deeper trough in 2003. A stronger Somali Jet is simulated in 2003 relative to 2009 as in FNL data. The simulated June mean daily precipitation difference for these years, along with the corresponding field from IMD 25 km gridded rainfall and NNRP reanalysis, is presented in Fig. 7. The ARW simulation indicates larger area coverage of monsoon rainfall over the Indian land region in 2003 relative to 2009, with precipitation rates of 4 to 12 mm day⁻¹ over central, northern, eastern, southeastern, west coast and northeastern areas of India in addition to low to moderate rainfall (1 to 4 mm day⁻¹) areas in western, southern and northwestern parts, agreeing with IMD and NNRP rainfall. A rainfall reduction of 1 to 12 mm day⁻¹ in the northern and central peninsula and the northeast, and about 18 mm day⁻¹ along the west coast, is found in the simulation for 2009. This indicates simulation of scanty rainfall in the onset phase of the drought year 2009. The model could simulate the general circulation features of relatively weak low-level winds, a weak Somali Jet and the overall decrease of rainfall during onset of the 2009 monsoon, similar to observations, as well as the rainfall reduction in 2009 in the northwest (by ~2 mm day⁻¹) and on the northern west coast (~15 mm day⁻¹). The simulated 5-day mean precipitation over the Indian land region (8-26° N, 66-86° E) and over different rainfall zones for 2003 and 2009 is compared with the corresponding values from IMD rainfall observations (Fig. 8) to study the relative progression and quantity of rainfall in the above years. For the year 2003 the model shows a gradual increase in daily rainfall from 3 mm to > 40 mm as in IMD and NNRP data and with substantial rainfall amounts of 15 mm by 15 June, indicating a signif-

Figure 6. Mean winds (m s⁻¹) and mean geopotential (m) for June for the years 2003 and 2009. The left panels (a, c) are from the ARW model and the right panels (b, d) are from FNL analysis. The top panels are for the year 2003 and the bottom panels for the year 2009.

Figure 11. Time-longitude cross section at 30° N of the 500-200 hPa layer temperature (in K) for the years 2003 (a, b) and 2009 (c, d). The left panels (a, c) represent ARW simulation and the right panels (b, d) FNL analysis.

Figure 12. Outgoing long-wave radiation (OLR) in W m⁻² from simulation and NNRP data for the years 2003 (a, b) and 2009 (c, d). The left panels (a, c) represent ARW simulation and the right panels (b, d) NNRP reanalysis.

Table 1. Error metrics (correlation, R; standard deviation, SD; bias; and root mean square error, RMSE) between simulated, NNRP precipitation and IMD observed daily rainfall time series from 16 May to 15 July for different zones for the composite 2000-2009.
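The metrics named in this caption can be computed as in the following minimal sketch. The conventions used here (bias as simulated minus observed mean, population standard deviation with division by n, Pearson correlation) are common choices but are assumptions, since the caption does not state the exact definitions; the function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(sim, obs):
    """Correlation (R), standard deviations, bias and RMSE of sim against obs.

    Assumed conventions: bias = mean(sim) - mean(obs); SD divides by n.
    """
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / n)
    return {
        "R": cov / (sd_s * sd_o),
        "SD_sim": sd_s,
        "SD_obs": sd_o,
        "bias": mean_s - mean_o,
        "RMSE": rmse,
    }


# Illustrative values only: the "simulation" is the "observation" plus 1.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
metrics = error_metrics(sim, obs)
print(metrics)
```

A constant offset such as this one gives perfect correlation but a nonzero bias and RMSE, which is why the metrics are reported together rather than relying on R alone.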

Table 2. Statistics of long-term normal onset dates, and mean, earliest and latest onset dates from observations and the simulation for the period 2000-2009.

Table 3. Error metrics (correlation coefficient, R; root mean square error, RMSE) between pentad rainfall time series based on ARW simulation, NNRP precipitation and IMD observed rainfall in different zones from 16 May to 15 July for the 2000-2009, June −ve and June

Table 4. Mean absolute error (MAE) between pentad rainfall time series based on ARW simulation, NNRP precipitation and IMD observed rainfall from 16 May to 15 July for the years 2003 and 2009 for different zones.
The 500-200 hPa layer attains a local maximum heating by 12 June in 2003 and 21 June in 2009. These simulated features are found to agree well with analysis, except that the maximum heating is underestimated by 3 °C in the simulation. The outgoing longwave radiation (OLR) can be taken as a proxy parameter for convection to identify the cloud conditions over the tropical region (from deep convective clouds to a cloud-free zone using a proper threshold). The OLR derived from simulation and NNRP reanalysis data is presented in Fig. 12 for June. The model OLR pattern is in good agreement with the reanalysis data, although slight differences are noted in the magnitude over northwest, southwest and southeastern parts of the domain. Relative to 2009, lower OLR in the eastern, southeastern and northern parts, and higher OLR in northwest India and adjoining western parts, are found in 2003, both in simulation and analysis. The lower OLR in most parts of the country in 2003 indicates deeper convection and higher rainfall in 2003 relative to 2009. It is noted that the OLR is underestimated (by about 25 W m⁻²) by the model in the northwestern parts during both years. A maximum difference of 10 to 40 W m⁻² is simulated over India between 2003 and 2009. The simulated OLR in 2003 is higher than that in 2009 in parts of northern India, northwestern India, Pakistan and Afghanistan, representing reduction of the heat low and hence a reduction in deep convection and rainfall in these regions. Similarly, the simulated OLR is lower in 2003 than in 2009 over the northern, eastern and southeastern parts of the country and parts of the Bay of Bengal as in reanalysis data, indicating enhancement of moist convection processes and increased rainfall in 2003 in these regions. The simulation shows slightly lower OLR in the northwest zone as compared to the analysis in both years. The OLR patterns from simulations as well as reanalysis data indicate that the maximum OLR differences between 2003 and 2009 are concentrated over Pakistan and adjoining regions, indicating a reduction of rainfall in this belt in 2003.
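The use of OLR as a convection proxy and the year-to-year OLR difference discussed above can be sketched as follows. The text mentions "a proper threshold" without giving a value, so the 240 W m⁻² default used here is only an assumed, commonly quoted figure for tropical deep convection; the function names and the small synthetic grids are hypothetical and do not represent model or reanalysis output.

```python
def olr_difference(olr_a, olr_b):
    """Grid-point difference olr_a - olr_b for two equally shaped 2-D grids."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(olr_a, olr_b)]


def deep_convection_fraction(olr, threshold=240.0):
    """Fraction of grid points with OLR below the assumed threshold (W m^-2)."""
    values = [v for row in olr for v in row]
    return sum(v < threshold for v in values) / len(values)


# Synthetic 2 x 3 monthly-mean OLR grids in W m^-2 (illustrative values only).
olr_wet = [[210.0, 225.0, 250.0],
           [230.0, 245.0, 280.0]]
olr_dry = [[235.0, 250.0, 260.0],
           [245.0, 255.0, 270.0]]

diff = olr_difference(olr_wet, olr_dry)        # negative where the wet case is more convective
frac_wet = deep_convection_fraction(olr_wet)
frac_dry = deep_convection_fraction(olr_dry)
print(diff, frac_wet, frac_dry)
```

Negative differences mark grid points where the first field has lower OLR, that is, deeper convection under this proxy, while positive differences mark suppressed convection.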
° E) undergoes a gradual heating till 5 June and a rapid heating thereafter, as in the FNL analysis.