Investigation of probabilistic, deterministic and hybrid models for improved characterisation of hydrological extremes across varying temporal scales

Research output: Thesis › Doctoral Thesis

Published

Bibtex

@phdthesis{68b951b618874577a25f4c61690136b1,
title = "Investigation of probabilistic, deterministic and hybrid models for improved characterisation of hydrological extremes across varying temporal scales",
abstract = "Peak water flow events increase the risk of flooding, which can have severe negative impacts on human lives and ecosystem services. Moreover, high water run-off from agricultural land increases sediment and nutrient losses that can result in soil degradation and watercourse pollution. In this thesis, peak flow events were modelled using statistical and machine learning approaches, process-based models (PBM) and a combination of the two. In the first thesis study, high-flow data measured over a period of 6 years (2012-2018) at the North Wyke Farm Platform, an agricultural research facility in south-west England, were characterised by the Generalised Pareto distribution (GPD). Based on the analysis of the effects of GPD parameter estimators, sample size and different temporal resolutions (15 min, hourly, 6-hourly and daily), an automated threshold selection method based on stability plots was proposed to define peak flow events. This method was evaluated using diagnostic indices and quantile-quantile plots, and its advantages were demonstrated. For the second study, an existing PBM (SPACSYS) was used to simulate flow at four temporal resolutions: (i) daily, the resolution at which it was first developed to run, (ii) 15 min, (iii) hourly and (iv) 6-hourly. The simulated flow was compared to the measured values at each of the four data resolutions and also after aggregation to the coarsest (daily) scale. Model performance graphics and calculated accuracy statistics showed that simulating at finer resolutions and then upscaling to the daily scale provided a more accurate representation of the number and magnitude of peak flow events. The third study focused on improving daily PBM simulations of peak flow events by using a hybrid modelling framework in which the same PBM was combined with a statistical model that stems from Extreme Value Theory (Conditional Extreme Model) and a data-driven machine learning model (Extreme Learning Machine). Assessed by goodness-of-fit indices, such as the mean absolute error (MAE), the normalised root mean square error (NRMSE), the percentage bias (PBIAS), the Nash-Sutcliffe efficiency (NSE), the index of agreement (d) and the Kling-Gupta efficiency (KGE), the proposed hybrid approach was better able to capture the dynamics of the peak flow events and increase the accuracy of the predictions. For the first three studies, all methods were largely evaluated from a prediction viewpoint using the error and agreement indices described above. The fourth and final thesis study explored the use of variograms and wavelets to assess the performance of the proposed models in terms of capturing measured flow variation at different temporal scales, and in the context of peak flow events. It built on the findings from the previous studies, as the hybrid model was also applied to hourly PBM simulations aggregated to the daily scale. The use of soil moisture as a covariate was also investigated. A change point analysis found that the magnitude of the local wavelet variance was related to the frequency of peak flow events and the days before they occurred. As a whole, this thesis provides clear advances, via a series of linked studies, for improved identification and characterisation of modelled peak water flows across different temporal scales.",
keywords = "Generalised Pareto distribution (GPD), Peaks over threshold, Threshold selection, Flood frequency analysis, Scale effects, Grassland agriculture, SPACSYS, Extreme flows, North Wyke Farm Platform, Grassland, Peak flows, Conditional extreme model, Extreme learning machine (ELM), Process-based model, Hybrid, Variogram analysis, Wavelet analysis, Process scale, Hydrology",
author = "Stelian Curceac",
year = "2022",
doi = "10.17635/lancaster/thesis/1529",
language = "English",
publisher = "Lancaster University",
school = "Lancaster University",

}

RIS

TY - THES

T1 - Investigation of probabilistic, deterministic and hybrid models for improved characterisation of hydrological extremes across varying temporal scales

AU - Curceac, Stelian

PY - 2022

Y1 - 2022

N2 - Peak water flow events increase the risk of flooding, which can have severe negative impacts on human lives and ecosystem services. Moreover, high water run-off from agricultural land increases sediment and nutrient losses that can result in soil degradation and watercourse pollution. In this thesis, peak flow events were modelled using statistical and machine learning approaches, process-based models (PBM) and a combination of the two. In the first thesis study, high-flow data measured over a period of 6 years (2012-2018) at the North Wyke Farm Platform, an agricultural research facility in south-west England, were characterised by the Generalised Pareto distribution (GPD). Based on the analysis of the effects of GPD parameter estimators, sample size and different temporal resolutions (15 min, hourly, 6-hourly and daily), an automated threshold selection method based on stability plots was proposed to define peak flow events. This method was evaluated using diagnostic indices and quantile-quantile plots, and its advantages were demonstrated. For the second study, an existing PBM (SPACSYS) was used to simulate flow at four temporal resolutions: (i) daily, the resolution at which it was first developed to run, (ii) 15 min, (iii) hourly and (iv) 6-hourly. The simulated flow was compared to the measured values at each of the four data resolutions and also after aggregation to the coarsest (daily) scale. Model performance graphics and calculated accuracy statistics showed that simulating at finer resolutions and then upscaling to the daily scale provided a more accurate representation of the number and magnitude of peak flow events. The third study focused on improving daily PBM simulations of peak flow events by using a hybrid modelling framework in which the same PBM was combined with a statistical model that stems from Extreme Value Theory (Conditional Extreme Model) and a data-driven machine learning model (Extreme Learning Machine). Assessed by goodness-of-fit indices, such as the mean absolute error (MAE), the normalised root mean square error (NRMSE), the percentage bias (PBIAS), the Nash-Sutcliffe efficiency (NSE), the index of agreement (d) and the Kling-Gupta efficiency (KGE), the proposed hybrid approach was better able to capture the dynamics of the peak flow events and increase the accuracy of the predictions. For the first three studies, all methods were largely evaluated from a prediction viewpoint using the error and agreement indices described above. The fourth and final thesis study explored the use of variograms and wavelets to assess the performance of the proposed models in terms of capturing measured flow variation at different temporal scales, and in the context of peak flow events. It built on the findings from the previous studies, as the hybrid model was also applied to hourly PBM simulations aggregated to the daily scale. The use of soil moisture as a covariate was also investigated. A change point analysis found that the magnitude of the local wavelet variance was related to the frequency of peak flow events and the days before they occurred. As a whole, this thesis provides clear advances, via a series of linked studies, for improved identification and characterisation of modelled peak water flows across different temporal scales.

KW - Generalised Pareto distribution (GPD)

KW - Peaks over threshold

KW - Threshold selection

KW - Flood frequency analysis

KW - Scale effects

KW - Grassland agriculture

KW - SPACSYS

KW - Extreme flows

KW - North Wyke Farm Platform

KW - Grassland

KW - Peak flows

KW - Conditional extreme model

KW - Extreme learning machine (ELM)

KW - Process-based model

KW - Hybrid

KW - Variogram analysis

KW - Wavelet analysis

KW - Process scale

KW - Hydrology

U2 - 10.17635/lancaster/thesis/1529

DO - 10.17635/lancaster/thesis/1529

M3 - Doctoral Thesis

PB - Lancaster University

ER -
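
Illustrative note: the abstract names several goodness-of-fit indices (MAE, PBIAS, NSE, KGE) used to compare simulated and measured flow. The sketch below shows how these four are commonly computed. It is not the thesis code. The function names, the PBIAS sign convention (positive means overestimation) and the 2009 form of KGE are assumptions made for illustration, and the thesis may define them differently.

```python
import math


def mae(obs, sim):
    """Mean absolute error between observed and simulated flow."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def pbias(obs, sim):
    """Percentage bias; assumed convention: positive = overestimation."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form: correlation, variability, bias)."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (sd_obs * sd_sim)   # linear correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mean_sim / mean_obs    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical flow values, for demonstration only.
observed = [1.0, 2.0, 4.0, 8.0]
simulated = [2.0 * q for q in observed]  # a simulation that doubles every flow

print(mae(observed, simulated), pbias(observed, simulated))
print(nse(observed, simulated), kge(observed, simulated))
```

A simulation that doubles every flow keeps perfect correlation but is penalised by NSE and KGE, which is why such indices are usually reported together rather than singly.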