A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance

Research output: Contribution to Journal/Magazine, Journal article, peer-review

Published

Standard

A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance. / Sweeny, Amy R.; Lemon, Hannah; Ibrahim, Anan et al.
In: mSystems, Vol. 8, No. 4, e0004023, 31.08.2023.

Harvard

Sweeny, AR, Lemon, H, Ibrahim, A, Watt, KA, Wilson, K, Childs, DZ, Nussey, DH, Free, A & McNally, L 2023, 'A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance', mSystems, vol. 8, no. 4, e0004023. https://doi.org/10.1128/msystems.00040-23

APA

Sweeny, A. R., Lemon, H., Ibrahim, A., Watt, K. A., Wilson, K., Childs, D. Z., Nussey, D. H., Free, A., & McNally, L. (2023). A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance. mSystems, 8(4), Article e0004023. https://doi.org/10.1128/msystems.00040-23

Vancouver

Sweeny AR, Lemon H, Ibrahim A, Watt KA, Wilson K, Childs DZ et al. A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance. mSystems. 2023 Aug 31;8(4):e0004023. Epub 2023 Jul 25. doi: 10.1128/msystems.00040-23

Author

Sweeny, Amy R. ; Lemon, Hannah ; Ibrahim, Anan et al. / A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance. In: mSystems. 2023 ; Vol. 8, No. 4.

Bibtex

@article{b0e3d1302b7a4ec68e6af519b9564e7e,
title = "A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance",
abstract = "Next-generation sequencing (NGS) and metabarcoding approaches are increasingly applied to wild animal populations, but there is a disconnect between the widely applied generalized linear mixed model (GLMM) approaches commonly used to study phenotypic variation and the statistical toolkit from community ecology typically applied to metabarcoding data. Here, we describe the suitability of a novel GLMM-based approach for analyzing the taxon-specific sequence read counts derived from standard metabarcoding data. This approach allows decomposition of the contribution of different drivers to variation in community composition (e.g., age, season, individual) via interaction terms in the model random-effects structure. We provide guidance to implementing this approach and show how these models can identify how responsible specific taxonomic groups are for the effects attributed to different drivers. We applied this approach to two cross-sectional data sets from the Soay sheep population of St. Kilda. GLMMs showed agreement with dissimilarity-based approaches highlighting the substantial contribution of age and minimal contribution of season to microbiota community compositions, and simultaneously estimated the contribution of other technical and biological factors. We further used model predictions to show that age effects were principally due to increases in taxa of the phylum Bacteroidetes and declines in taxa of the phylum Firmicutes. This approach offers a powerful means for understanding the influence of drivers of community structure derived from metabarcoding data. We discuss how our approach could be readily adapted to allow researchers to estimate contributions of additional factors such as host or microbe phylogeny to answer emerging questions surrounding the ecological and evolutionary roles of within-host communities. IMPORTANCE NGS and fecal metabarcoding methods have provided powerful opportunities to study the wild gut microbiome. 
A wealth of data is, therefore, amassing across wild systems, generating the need for analytical approaches that can appropriately investigate simultaneous factors at the host and environmental scale that determine the composition of these communities. Here, we describe a generalized linear mixed-effects model (GLMM) approach to analyze read count data from metabarcoding of the gut microbiota, allowing us to quantify the contributions of multiple host and environmental factors to within-host community structure. Our approach provides outputs that are familiar to a majority of field ecologists and can be run using any standard mixed-effects modeling packages. We illustrate this approach using two metabarcoding data sets from the Soay sheep population of St. Kilda investigating age and season effects as worked examples.",
keywords = "Computer Science Applications, Genetics, Molecular Biology, Modeling and Simulation, Ecology, Evolution, Behavior and Systematics, Biochemistry, Physiology, Microbiology",
author = "Sweeny, {Amy R.} and Hannah Lemon and Anan Ibrahim and Watt, {Kathryn A.} and Kenneth Wilson and Childs, {Dylan Z.} and Nussey, {Daniel H.} and Andrew Free and Luke McNally",
year = "2023",
month = aug,
day = "31",
doi = "10.1128/msystems.00040-23",
language = "English",
volume = "8",
journal = "mSystems",
issn = "2379-5077",
publisher = "American Society for Microbiology",
number = "4",
pages = "e0004023",
}

RIS

TY - JOUR

T1 - A mixed-model approach for estimating drivers of microbiota community composition and differential taxonomic abundance

AU - Sweeny, Amy R.

AU - Lemon, Hannah

AU - Ibrahim, Anan

AU - Watt, Kathryn A.

AU - Wilson, Kenneth

AU - Childs, Dylan Z.

AU - Nussey, Daniel H.

AU - Free, Andrew

AU - McNally, Luke

PY - 2023/8/31

Y1 - 2023/8/31

AB - Next-generation sequencing (NGS) and metabarcoding approaches are increasingly applied to wild animal populations, but there is a disconnect between the widely applied generalized linear mixed model (GLMM) approaches commonly used to study phenotypic variation and the statistical toolkit from community ecology typically applied to metabarcoding data. Here, we describe the suitability of a novel GLMM-based approach for analyzing the taxon-specific sequence read counts derived from standard metabarcoding data. This approach allows decomposition of the contribution of different drivers to variation in community composition (e.g., age, season, individual) via interaction terms in the model random-effects structure. We provide guidance to implementing this approach and show how these models can identify how responsible specific taxonomic groups are for the effects attributed to different drivers. We applied this approach to two cross-sectional data sets from the Soay sheep population of St. Kilda. GLMMs showed agreement with dissimilarity-based approaches highlighting the substantial contribution of age and minimal contribution of season to microbiota community compositions, and simultaneously estimated the contribution of other technical and biological factors. We further used model predictions to show that age effects were principally due to increases in taxa of the phylum Bacteroidetes and declines in taxa of the phylum Firmicutes. This approach offers a powerful means for understanding the influence of drivers of community structure derived from metabarcoding data. We discuss how our approach could be readily adapted to allow researchers to estimate contributions of additional factors such as host or microbe phylogeny to answer emerging questions surrounding the ecological and evolutionary roles of within-host communities. IMPORTANCE NGS and fecal metabarcoding methods have provided powerful opportunities to study the wild gut microbiome. 
A wealth of data is, therefore, amassing across wild systems, generating the need for analytical approaches that can appropriately investigate simultaneous factors at the host and environmental scale that determine the composition of these communities. Here, we describe a generalized linear mixed-effects model (GLMM) approach to analyze read count data from metabarcoding of the gut microbiota, allowing us to quantify the contributions of multiple host and environmental factors to within-host community structure. Our approach provides outputs that are familiar to a majority of field ecologists and can be run using any standard mixed-effects modeling packages. We illustrate this approach using two metabarcoding data sets from the Soay sheep population of St. Kilda investigating age and season effects as worked examples.

KW - Computer Science Applications

KW - Genetics

KW - Molecular Biology

KW - Modeling and Simulation

KW - Ecology, Evolution, Behavior and Systematics

KW - Biochemistry

KW - Physiology

KW - Microbiology

U2 - 10.1128/msystems.00040-23

DO - 10.1128/msystems.00040-23

M3 - Journal article

C2 - 37489890

VL - 8

JO - mSystems

JF - mSystems

SN - 2379-5077

IS - 4

M1 - e0004023

ER -