Di Piazza, A. (2011). THE PROBLEM OF MISSING DATA IN HYDROCLIMATIC TIME SERIES. APPLICATION OF SPATIAL INTERPOLATION TECHNIQUES TO CONSTRUCT A COMPREHENSIVE ARCHIVE OF HYDROCLIMATIC DATA IN SICILY, ITALY. (Doctoral thesis, 2011).

THE PROBLEM OF MISSING DATA IN HYDROCLIMATIC TIME SERIES. APPLICATION OF SPATIAL INTERPOLATION TECHNIQUES TO CONSTRUCT A COMPREHENSIVE ARCHIVE OF HYDROCLIMATIC DATA IN SICILY, ITALY

DI PIAZZA, Annalisa
2011-04-01

Abstract

Planning, management and effective control of water resource systems require a considerable amount of hydrological data, such as rainfall, temperature and streamflow. Such data are also required when a hydrological model has to be developed. Very often the hydrological data sequences at a given gauge have gaps, are of poor quality or are not sufficiently long. This can severely affect, for example, the reliability of the design of a hydropower plant or of a dam. Furthermore, missing values are a common obstacle in time series analysis, specifically in the modelling of rainfall, temperature and rainfall-runoff processes. Missing values may have various causes: equipment failure, measurement errors or faults in data acquisition, natural hazards such as landslides, the temporary absence of observers, the cessation of measurement, the absence of observations prior to the commencement of measurement, or limited financial resources. Whatever the reasons, missing values pose a significant problem for water resources applications. Consequently, finding efficient methods to deal with missing values is an important issue in most hydrological analyses. However, hydrological modellers commonly discard the observations with missing values and use only the observations with complete information, which means that much of the information contained in the dataset is lost. Furthermore, this approach is inadequate for analyses that require serially complete data. On the other hand, using a dataset affected by missing data can result in errors that exhibit temporal and spatial patterns (Stooksbury et al., 1999). As an alternative to this listwise deletion procedure, modellers sometimes replace (or fill in) the missing values with, for example, the mean of the observed values.
Such a procedure can, however, seriously distort statistical properties such as the standard deviation, correlations or percentiles. The best alternative to the above-mentioned approaches is to fill the gaps in the rainfall, temperature or streamflow time series by estimating the missing values. In fact, the most common approach in the technical literature is the application of either deterministic or stochastic methods to estimate the missing data. The problem is so important within hydrological research that the scientific community has launched a transnational initiative of international research groups, the "Decade on Prediction in Ungauged Basins (PUB)", a wide research project promoted by the International Association of Hydrological Sciences (Sivapalan et al., 2003). This effort is particularly focused on the reconstruction of serially incomplete data records in basins with short streamflow records or in ungauged river basins. In this scenario, identifying and applying the most suitable methods for the accurate estimation of hydrological variables, in order to fill in incomplete time series, is of paramount importance and represents the most promising approach to the problem of missing data. In particular, once the hydrological variable and its specific characteristics are defined, the estimation methods must be chosen and compared in order to obtain the best reconstruction of the dataset of that variable.
The issue of gaps in climatic variables has been the subject of a large number of scientific works, in which numerous techniques for estimating missing values have been implemented and compared. Among these are temporal methods, which take into account the temporal dependence of the considered variables, and, among the more advanced methods, space-time models, i.e. models that handle spatial and temporal dependence simultaneously. Also widely used in the literature for missing data estimation are spatial models, which represent the spatial distribution of variables over a specific duration. Many papers have compared deterministic and stochastic approaches to reconstructing data records, and their results suggest that geostatistical techniques often improve the results, since they can capture the pattern of spatial dependence observed for climatic variables. In the particular case of runoff, some works have highlighted that treating runoff as an areal process, i.e. accounting for its strong dependence on basin area, considerably improves the estimates. Another finding common to many works is that algorithms which incorporate ancillary (geographical and morphological) information into the spatial estimation of climatic variables improve the estimates.
The aim of this thesis is to investigate methods for the optimal estimation of missing data in time series of hydrological variables, with reference to Sicily (Italy). In particular, the hydrological variables studied are precipitation, temperature and runoff. In this thesis only the spatial dependence structure of the rainfall, temperature and runoff data is used to reconstruct missing data, and the spatio-temporal dependence is neglected. For each of these variables, different estimation methods have been considered, described and applied to solve the problem of missing data.
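As a small illustration of why mean substitution distorts statistical properties, the following Python sketch fills the gaps of a made-up monthly rainfall series with the mean of the observed values and compares the standard deviation before and after. The series and the variable names are hypothetical and are not taken from the thesis data.

```python
import statistics

# Hypothetical monthly rainfall series (mm); None marks a missing value.
series = [50.0, None, 90.0, 30.0, None, 70.0]

# Keep only the months that were actually observed.
observed = [v for v in series if v is not None]
mean_value = statistics.mean(observed)

# Mean substitution: every gap receives the same value.
filled = [mean_value if v is None else v for v in series]

# Population standard deviation before and after filling.
std_observed = statistics.pstdev(observed)
std_filled = statistics.pstdev(filled)

print(f"mean kept at {statistics.mean(filled):.1f} mm")
print(f"std of observed values: {std_observed:.2f} mm")
print(f"std after mean filling: {std_filled:.2f} mm")
```

Because every gap receives the same central value, the filled series keeps the mean but has a smaller standard deviation than the observed values alone.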
For precipitation and temperature, which can be represented as point processes, the following spatial interpolation algorithms are applied: inverse distance weighting, radial basis functions with thin plate splines, simple linear regression, multiple regression, geographically weighted regression, artificial neural networks, ordinary kriging and residual ordinary kriging. Applying these methods yields serially complete monthly and annual datasets. For runoff, on the other hand, the proposed investigation stems from the consideration that it can be described as an areal process. Under this assumption, a more accurate estimation of the variable can be obtained. This approach has very few examples in the scientific literature but appears to be very promising in this field. For this reason, the estimation method chosen for runoff is a stochastic method that derives gridded maps at progressively finer resolution using a geostatistical approach. It is, in particular, a stochastic interpolation system comparable to a kriging system, with explicit consideration of runoff as an areal process. Applying this method gives estimates of annual runoff for the stations that were out of operation during the chosen time window of input runoff data and whose datasets are therefore affected by missing data. Moreover, annual runoff can also be estimated for areas of the basins not provided with gauging stations; these values are obtained from the gridded map at a given resolution. It is important to highlight that previous applications of this approach were carried out in homogeneous climatic contexts, with flow regime conditions favourable to the procedure. Here, by contrast, the method is applied for the first time in the Sicilian context, where both the climatic and morphological profiles are strongly inhomogeneous.
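The simplest of the point interpolators listed above, inverse distance weighting, can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: the station coordinates, rainfall values, function name and the power parameter of 2 are all assumptions chosen for the example.

```python
import math


def idw_estimate(target_xy, stations, power=2.0):
    """Estimate a value at target_xy from (x, y, value) stations
    by inverse distance weighting. Stations whose value is None
    (themselves missing) are skipped."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        if value is None:
            continue
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            # Target coincides with a station: return its observation.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no neighbouring observations available")
    return weighted_sum / weight_total


# Hypothetical gauges: (x in km, y in km, monthly rainfall in mm).
neighbours = [(0.0, 0.0, 80.0), (10.0, 0.0, 100.0), (0.0, 10.0, 60.0)]

# Fill the missing month at a gauge located at (5, 0).
estimate = idw_estimate((5.0, 0.0), neighbours)
print(f"estimated rainfall: {estimate:.1f} mm")
```

The two gauges 5 km away dominate the estimate, while the more distant gauge contributes little. A larger power would make the weighting more local still.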
HYDROCLIMATIC;
Files in this record: Tesi_dottorato_dipiazza.pdf (open access, Adobe PDF, 22.48 MB)

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/95492