Introduction
------------
This data repository contains data for the multivariate distributions
used in the paper "Scenario generation for single-period portfolio selection problems with tail risk measures: coping with high dimensions and integer variables" by Jamie Fairbrother, Amanda Turner and Stein Wallace. The distributions were fitted
from monthly return data for stocks in the FTSE100 index.
The data is contained in the file "distribution_data.h5" which uses the binary HDF5 file format.
Structure
---------
The data file has a hierarchical format.
The top level of the file contains five groups:
- normal
- t-dist
- skew t-dist
- moments
- case study
The first four of these groups, which we call the
distribution groups, have their own
hierarchy of sub-groups, and together contain all the data
for distributions used in Section 6 of the paper.
The moments group does not specify distributions per
se, but specifies the parameters needed to construct
scenario sets via the moment matching algorithm.
The final group, case study, contains the parameters of the skew
t-distribution used in Section 7 of the paper.
For each of the distribution groups, the level below
contains sub-groups which sort the distributions
by dimension:
- dim - 5
- dim - 10
- dim - 20
- dim - 30
Each dimension group in turn contains five groups,
each corresponding to a different distribution
of that dimension:
- dist 1
- dist 2
- dist 3
- dist 4
- dist 5
The exact contents of these groups depend
on the distribution.
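The three-level hierarchy described above can be sketched as a set of paths. This is a minimal illustration only: it assumes the group names inside the file are spelled exactly as in the lists above (for example "dim - 5" and "dist 1"), which should be checked against the file itself. The function names are hypothetical.

```python
from itertools import product

# Names copied from the lists above; their exact spelling inside the
# HDF5 file is an assumption and should be verified.
DISTRIBUTION_GROUPS = ["normal", "t-dist", "skew t-dist", "moments"]
DIMENSIONS = [5, 10, 20, 30]
INSTANCES = [1, 2, 3, 4, 5]


def instance_path(group, dim, index):
    """Build the path of one instance, e.g. 'normal/dim - 5/dist 1'."""
    return f"{group}/dim - {dim}/dist {index}"


def all_instance_paths():
    """List every path implied by the group/dimension/instance hierarchy."""
    return [
        instance_path(g, d, i)
        for g, d, i in product(DISTRIBUTION_GROUPS, DIMENSIONS, INSTANCES)
    ]
```

With the h5py package, indexing an open file with one of these paths, as in h5py.File("distribution_data.h5", "r")[path], should return the corresponding group if the names match. Calling list(f.keys()) on the open file shows the actual top-level names.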
Distribution data
-----------------
A Normal distribution has
two pieces of data:
- mu: mean vector
- Sigma: covariance matrix
A t-distribution has three pieces
of data:
- mu: mean vector
- Sigma: matrix specifying dependency structure
- dof: value of degrees of freedom parameter
A skew t-distribution has four pieces of data:
- xi: location vector
- Omega: matrix specifying dependency structure
- alpha: the skewness vector
- dof: value of degrees of freedom parameter
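The parameter lists above can be turned into a small consistency check. The sketch below assumes that each piece of data is stored as an HDF5 dataset under the name given above, and that group names are spelled as in the Structure section. Neither assumption is confirmed here. The helper names are hypothetical, and read_instance needs the third-party h5py package.

```python
# Dataset names expected for each distribution type, as listed above.
EXPECTED = {
    "normal": {"mu", "Sigma"},
    "t-dist": {"mu", "Sigma", "dof"},
    "skew t-dist": {"xi", "Omega", "alpha", "dof"},
}


def missing_datasets(group, kind):
    """Return the expected names that are absent from a mapping-like group."""
    return sorted(EXPECTED[kind] - set(group.keys()))


def read_instance(filename, kind, dim, index):
    """Read one instance into a dict of arrays (requires h5py)."""
    import h5py  # imported lazily so the helpers above have no dependencies

    # Assumed spelling of the group names; verify against the file.
    path = f"{kind}/dim - {dim}/dist {index}"
    with h5py.File(filename, "r") as f:
        group = f[path]
        # dataset[()] reads the full contents of a dataset into memory.
        return {name: group[name][()] for name in group.keys()}
```

For example, read_instance("distribution_data.h5", "t-dist", 10, 1) would be expected to return a dict with the keys mu, Sigma and dof. Passing that dict to missing_datasets with kind "t-dist" should then give an empty list.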
Each instance of moments has two pieces of
data:
- moments: a d x 4 matrix whose columns specify the first
four marginal moments to be matched, where d is the dimension. The first column
contains the mean, the second the variance, the third the
skewness, and the fourth the excess kurtosis.
- correlations: a d x d matrix specifying pairwise correlations
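To show how the two pieces of data relate, the sketch below combines the variance column of moments with correlations to form a covariance matrix, using cov[i][j] = corr[i][j] * sd[i] * sd[j]. It is not the moment matching algorithm from the paper, only an illustration of the column layout described above. The function name is hypothetical.

```python
import math


def covariance_from_moments(moments, correlations):
    """Build a d x d covariance matrix from variances and correlations."""
    # The second column (index 1) holds the variances, per the layout above.
    std = [math.sqrt(row[1]) for row in moments]
    d = len(std)
    return [
        [correlations[i][j] * std[i] * std[j] for j in range(d)]
        for i in range(d)
    ]
```

The function accepts nested lists, and it should also accept two-dimensional NumPy arrays, since it only relies on row and element indexing.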