Final published version
Licence: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Sampling properties and empirical estimates of extreme events
AU - Mackay, E.
AU - Jonathan, P.
PY - 2021/11/1
Y1 - 2021/11/1
N2 - The statistical characteristics of the largest observations in a sample are highly uncertain. In this work we consider the problem of how to define empirical estimates of exceedance probabilities and return periods associated with an ordered sample of observations. Understanding the sampling properties of these quantities is important for assessing the fit of a statistical model and also for placing confidence bounds on estimates of extreme events from Monte Carlo simulations. The empirical distribution function (EDF) is often defined as the expected non-exceedance probability (NEP) associated with sample order statistics. Yet, due to the non-linearity of the relations between return periods, quantiles and NEP, the return period (or quantile) associated with the expected NEP is not equal to the expected return period (or quantile), leading to ambiguity. However, the sampling distributions of exceedance probabilities, return periods and quantiles are, in fact, linked by a simple relation. From this relation, it follows that defining the EDF in terms of the median NEP of the order statistics gives a consistent framework for defining empirical estimates of all three quantities. We demonstrate that the median value of the return period of the largest observation is 44% larger than the return period calculated using the common definition of the EDF in terms of the expected NEP of the order statistics. We also derive some new results about the size of the confidence intervals for exceedance probabilities and return periods.
AB - The statistical characteristics of the largest observations in a sample are highly uncertain. In this work we consider the problem of how to define empirical estimates of exceedance probabilities and return periods associated with an ordered sample of observations. Understanding the sampling properties of these quantities is important for assessing the fit of a statistical model and also for placing confidence bounds on estimates of extreme events from Monte Carlo simulations. The empirical distribution function (EDF) is often defined as the expected non-exceedance probability (NEP) associated with sample order statistics. Yet, due to the non-linearity of the relations between return periods, quantiles and NEP, the return period (or quantile) associated with the expected NEP is not equal to the expected return period (or quantile), leading to ambiguity. However, the sampling distributions of exceedance probabilities, return periods and quantiles are, in fact, linked by a simple relation. From this relation, it follows that defining the EDF in terms of the median NEP of the order statistics gives a consistent framework for defining empirical estimates of all three quantities. We demonstrate that the median value of the return period of the largest observation is 44% larger than the return period calculated using the common definition of the EDF in terms of the expected NEP of the order statistics. We also derive some new results about the size of the confidence intervals for exceedance probabilities and return periods.
KW - Confidence interval
KW - Empirical distribution function
KW - Model diagnostics
KW - Plotting position
KW - Return period
KW - Return value
KW - Sampling variability
U2 - 10.1016/j.oceaneng.2021.109791
DO - 10.1016/j.oceaneng.2021.109791
M3 - Journal article
VL - 239
JO - Ocean Engineering
JF - Ocean Engineering
SN - 0029-8018
M1 - 109791
ER -
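
The 44% figure quoted in the abstract can be checked with a short calculation. The sketch below is not taken from the paper; it assumes n independent, identically distributed observations, so that the NEP of the largest observation follows a Beta(n, 1) distribution, and it assumes the return period is T = 1/(1 - NEP), measured in units of the sampling interval. The function names are illustrative only. Under these assumptions the expected NEP is n/(n + 1), which gives T = n + 1, while the median NEP is 0.5^(1/n). Because T increases monotonically with NEP, the median return period is the return period at the median NEP.

```python
import math


def return_period_from_expected_nep(n: int) -> float:
    """Return period of the largest of n observations using the expected NEP.

    Assumption: NEP ~ Beta(n, 1), so E[NEP] = n / (n + 1) and T = 1 / (1 - NEP).
    """
    expected_nep = n / (n + 1)
    return 1.0 / (1.0 - expected_nep)


def median_return_period(n: int) -> float:
    """Median return period of the largest of n observations.

    Assumption: the Beta(n, 1) CDF is p**n, so the median NEP is 0.5 ** (1 / n).
    expm1 is used to evaluate 1 - 0.5 ** (1 / n) accurately for large n.
    """
    exceedance_at_median = -math.expm1(-math.log(2.0) / n)
    return 1.0 / exceedance_at_median


def ratio(n: int) -> float:
    """Ratio of the median return period to the expected-NEP return period."""
    return median_return_period(n) / return_period_from_expected_nep(n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10**6):
        print(n, round(ratio(n), 4))
    # Large-n limit of the ratio under the assumptions above
    print("1 / ln 2 =", round(1.0 / math.log(2.0), 4))
```

For large n the ratio tends to 1/ln 2, about 1.44, which appears consistent with the 44% difference stated in the abstract. For small samples the ratio is somewhat lower.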