Final published version, 13.7 MB, PDF document
Research output: Thesis › Doctoral Thesis
}
TY - BOOK
T1 - Bayesian Modelling and Inference for Multiple Network Data
AU - Mantziou, Anastasia
PY - 2022
Y1 - 2022
N2 - There is a growing need for analysing network data due to their prevalence in applications arising from various scientific fields. A broad literature has been developed for the statistical analysis of networks as single observations, while the formulation of statistical frameworks for modelling multiple network data has only recently been considered by researchers. This thesis contributes to the statistical analysis of multiple network data sets, where now each observation in the data comprises a network rather than a scalar quantity. Our first contribution is the development of a Bayesian model-based approach for clustering multiple network data with respect to similarities detected in the connectivity patterns among the networks' nodes. Our model-based approach allows us to interpret the clusters with respect to a parameterisation, notably, through a network representative for each cluster. Our framework can also be formulated to detect networks in a population that are different from a majority group of networks. Extensive simulation studies show our model performs well in both clustering multiple network data and inferring the model parameters. We further apply our model on two real-world multiple network data sets resulting from the fields of Computing (Human Tracking Systems) and Neuroscience. Our second contribution is twofold. First, we introduce a new network distance metric that measures dissimilarities between networks with respect to their cycles, motivated by an ecological application. Second, we propose a new Markov Chain Monte Carlo (MCMC) scheme for inferring the parameters of the intractable Spherical Network Family (SNF) model for multiple network data. Specifically, we introduce an Importance Sampling (IS) step within a Metropolis-Hastings (MH) algorithm that allows the approximation of the intractable normalising constant of the SNF model within the MH ratio. We explore the behaviour of the newly proposed distance metric and the performance of our MCMC scheme through simulation studies, and apply our algorithm on a real-world ecological application.
AB - There is a growing need for analysing network data due to their prevalence in applications arising from various scientific fields. A broad literature has been developed for the statistical analysis of networks as single observations, while the formulation of statistical frameworks for modelling multiple network data has only recently been considered by researchers. This thesis contributes to the statistical analysis of multiple network data sets, where now each observation in the data comprises a network rather than a scalar quantity. Our first contribution is the development of a Bayesian model-based approach for clustering multiple network data with respect to similarities detected in the connectivity patterns among the networks' nodes. Our model-based approach allows us to interpret the clusters with respect to a parameterisation, notably, through a network representative for each cluster. Our framework can also be formulated to detect networks in a population that are different from a majority group of networks. Extensive simulation studies show our model performs well in both clustering multiple network data and inferring the model parameters. We further apply our model on two real-world multiple network data sets resulting from the fields of Computing (Human Tracking Systems) and Neuroscience. Our second contribution is twofold. First, we introduce a new network distance metric that measures dissimilarities between networks with respect to their cycles, motivated by an ecological application. Second, we propose a new Markov Chain Monte Carlo (MCMC) scheme for inferring the parameters of the intractable Spherical Network Family (SNF) model for multiple network data. Specifically, we introduce an Importance Sampling (IS) step within a Metropolis-Hastings (MH) algorithm that allows the approximation of the intractable normalising constant of the SNF model within the MH ratio. We explore the behaviour of the newly proposed distance metric and the performance of our MCMC scheme through simulation studies, and apply our algorithm on a real-world ecological application.
U2 - 10.17635/lancaster/thesis/1657
DO - 10.17635/lancaster/thesis/1657
M3 - Doctoral Thesis
PB - Lancaster University
ER -