eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring. / Diana, Alex; Matechou, Eleni; Griffin, Jim et al.
In: Journal of the American Statistical Association, Vol. 120, No. 549, 02.01.2025, p. 120-134.

Harvard

Diana, A, Matechou, E, Griffin, J, Yu, DW, Luo, M, Tosa, M, Bush, A & Griffiths, R 2025, 'eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring', Journal of the American Statistical Association, vol. 120, no. 549, pp. 120-134. https://doi.org/10.1080/01621459.2024.2412362

APA

Diana, A., Matechou, E., Griffin, J., Yu, D. W., Luo, M., Tosa, M., Bush, A., & Griffiths, R. (2025). eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring. Journal of the American Statistical Association, 120(549), 120-134. https://doi.org/10.1080/01621459.2024.2412362

Vancouver

Diana A, Matechou E, Griffin J, Yu DW, Luo M, Tosa M et al. eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring. Journal of the American Statistical Association. 2025 Jan 2;120(549):120-134. Epub 2024 Oct 18. doi: 10.1080/01621459.2024.2412362

Author

Diana, Alex ; Matechou, Eleni ; Griffin, Jim et al. / eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring. In: Journal of the American Statistical Association. 2025 ; Vol. 120, No. 549. pp. 120-134.

Bibtex

@article{7088f473242d402587736754750be3aa,
title = "eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring",
abstract = "DNA-based biodiversity surveys, which involve collecting physical samples from survey sites and assaying them in the laboratory to detect species via their diagnostic DNA sequences, are increasingly being adopted for biodiversity monitoring and decision-making. The most commonly employed method, metabarcoding, combines PCR with high-throughput DNA sequencing to amplify and read “DNA barcode” sequences, generating count data indicating the number of times each DNA barcode was read. However, DNA-based data are noisy and error-prone, with several sources of variation, and cannot alone estimate the species-specific amount of DNA present at a surveyed site (DNA biomass). In this article, we present a unifying modeling framework for DNA-based survey data that allows estimation of changes in DNA biomass within species, across sites and their links to environmental covariates, while for the first time simultaneously accounting for key sources of variation, error and noise in the data-generating process, and for between-species and between-sites correlation. Bayesian inference is performed using MCMC with Laplace approximations. We describe a re-parameterization scheme for crossed-effects models designed to improve mixing, and an adaptive approach for updating latent variables, which reduces computation time. Theoretical and simulation results are used to guide study design, including the level of replication at different survey stages and the use of quality control methods. Finally, we demonstrate our new framework on a dataset of Malaise-trap samples, quantifying the effects of elevation and distance-to-road on each species, and produce maps identifying areas of high biodiversity and species DNA biomass. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.",
keywords = "Crossed-effects model, Environmental DNA, Joint species distribution modeling, Observation error, Occupancy modeling",
author = "Alex Diana and Eleni Matechou and Jim Griffin and Yu, {Douglas W.} and Mingjie Luo and Marie Tosa and Alex Bush and Richard Griffiths",
year = "2025",
month = jan,
day = "2",
doi = "10.1080/01621459.2024.2412362",
language = "English",
volume = "120",
pages = "120--134",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "549",
}

RIS

TY - JOUR

T1 - eDNAPlus: A Unifying Modeling Framework for DNA-based Biodiversity Monitoring

AU - Diana, Alex

AU - Matechou, Eleni

AU - Griffin, Jim

AU - Yu, Douglas W.

AU - Luo, Mingjie

AU - Tosa, Marie

AU - Bush, Alex

AU - Griffiths, Richard

PY - 2025/1/2

Y1 - 2025/1/2

AB - DNA-based biodiversity surveys, which involve collecting physical samples from survey sites and assaying them in the laboratory to detect species via their diagnostic DNA sequences, are increasingly being adopted for biodiversity monitoring and decision-making. The most commonly employed method, metabarcoding, combines PCR with high-throughput DNA sequencing to amplify and read “DNA barcode” sequences, generating count data indicating the number of times each DNA barcode was read. However, DNA-based data are noisy and error-prone, with several sources of variation, and cannot alone estimate the species-specific amount of DNA present at a surveyed site (DNA biomass). In this article, we present a unifying modeling framework for DNA-based survey data that allows estimation of changes in DNA biomass within species, across sites and their links to environmental covariates, while for the first time simultaneously accounting for key sources of variation, error and noise in the data-generating process, and for between-species and between-sites correlation. Bayesian inference is performed using MCMC with Laplace approximations. We describe a re-parameterization scheme for crossed-effects models designed to improve mixing, and an adaptive approach for updating latent variables, which reduces computation time. Theoretical and simulation results are used to guide study design, including the level of replication at different survey stages and the use of quality control methods. Finally, we demonstrate our new framework on a dataset of Malaise-trap samples, quantifying the effects of elevation and distance-to-road on each species, and produce maps identifying areas of high biodiversity and species DNA biomass. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.

KW - Crossed-effects model

KW - Environmental DNA

KW - Joint species distribution modeling

KW - Observation error

KW - Occupancy modeling

UR - http://www.scopus.com/inward/record.url?scp=85212852541&partnerID=8YFLogxK

U2 - 10.1080/01621459.2024.2412362

DO - 10.1080/01621459.2024.2412362

M3 - Journal article

VL - 120

SP - 120

EP - 134

JO - Journal of the American Statistical Association

JF - Journal of the American Statistical Association

SN - 0162-1459

IS - 549

ER -