Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Chapter

Published

Standard

Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities. / Compson, Zacchaeus G.; Monk, Wendy A.; Curry, Colin J.; Gravel, Dominique; Bush, Alex; Baker, Christopher J.O.; Al Manir, Mohammad Sadnan; Riazanov, Alexandre; Hajibabaei, Mehrdad; Shokralla, Shadi; Gibson, Joel F.; Stefani, Sonja; Wright, Michael T.G.; Baird, Donald J.

Advances in Ecological Research. ed. / David A. Bohan; Alex J. Dumbrell; Guy Woodward; Michelle Jackson. Elsevier, 2018. p. 33-74 (Advances in Ecological Research; Vol. 59).

Harvard

Compson, ZG, Monk, WA, Curry, CJ, Gravel, D, Bush, A, Baker, CJO, Al Manir, MS, Riazanov, A, Hajibabaei, M, Shokralla, S, Gibson, JF, Stefani, S, Wright, MTG & Baird, DJ 2018, Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities. in DA Bohan, AJ Dumbrell, G Woodward & M Jackson (eds), Advances in Ecological Research. Advances in Ecological Research, vol. 59, Elsevier, pp. 33-74. https://doi.org/10.1016/bs.aecr.2018.09.001

APA

Compson, Z. G., Monk, W. A., Curry, C. J., Gravel, D., Bush, A., Baker, C. J. O., Al Manir, M. S., Riazanov, A., Hajibabaei, M., Shokralla, S., Gibson, J. F., Stefani, S., Wright, M. T. G., & Baird, D. J. (2018). Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities. In D. A. Bohan, A. J. Dumbrell, G. Woodward, & M. Jackson (Eds.), Advances in Ecological Research (pp. 33-74). (Advances in Ecological Research; Vol. 59). Elsevier. https://doi.org/10.1016/bs.aecr.2018.09.001

Vancouver

Compson ZG, Monk WA, Curry CJ, Gravel D, Bush A, Baker CJO et al. Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities. In Bohan DA, Dumbrell AJ, Woodward G, Jackson M, editors, Advances in Ecological Research. Elsevier. 2018. p. 33-74. (Advances in Ecological Research). https://doi.org/10.1016/bs.aecr.2018.09.001

Author

Compson, Zacchaeus G.; Monk, Wendy A.; Curry, Colin J.; Gravel, Dominique; Bush, Alex; Baker, Christopher J.O.; Al Manir, Mohammad Sadnan; Riazanov, Alexandre; Hajibabaei, Mehrdad; Shokralla, Shadi; Gibson, Joel F.; Stefani, Sonja; Wright, Michael T.G.; Baird, Donald J. / Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities. Advances in Ecological Research. editor / David A. Bohan; Alex J. Dumbrell; Guy Woodward; Michelle Jackson. Elsevier, 2018. pp. 33-74 (Advances in Ecological Research).

Bibtex

@inbook{56ab98b90721452b83498df2a4467554,
title = "Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools: A Case Study on Boreal Wetland Macroinvertebrate Communities",
abstract = "Ecological networks are powerful tools for visualizing biodiversity data and assessing ecosystem health and function. Constructing these networks requires considerable empirical efforts, and this remains highly challenging due to sampling limitations and the laborious and notoriously limited, error-prone process of traditional taxonomic identification. Recent advancements in high-throughput gene sequencing and high-performance computing provide new ways to address these challenges. DNA metabarcoding, a method of bulk taxonomic identification from DNA extracted from environmental samples, can generate detailed biodiversity information through a standardizable analytical pipeline for species detection. When this biodiversity information is annotated with prior knowledge on taxon interactions, body size, and trophic position, it is possible to generate trait-based networks, which we call “heuristic food webs”. Although curating trait matrices for constructing heuristic food webs is a laborious, often intractable process using manual literature surveys, it can be greatly accelerated via text mining, allowing knowledge of relevant traits to be gathered across large databases. To explore this possibility, we employed a General Architecture for Text Engineering (GATE) system to create a hybrid text-mining pipeline combining rule-based and machine-learning modules. This pipeline was then used to query online repositories of published papers for missing data on a key trait, body size, that could not be gathered from existing trophic link libraries of freshwater benthic macroinvertebrates. Combining text-mined body size information with feeding information from existing sources allowed us to generate a database of over 20,000 pairwise trophic interactions. Next, we developed a pipeline that uses taxa lists generated from DNA metabarcoding and annotates this matrix with trophic information from existing databases and text-mined body size data. 
In this way, we generated heuristic food webs for wetland sites within a large delta complex formed by the confluence of the Peace and Athabasca rivers in northern Alberta: the Peace–Athabasca delta. Finally, we used these putative food webs and their network properties to resolve spatial and temporal differences between the benthic subwebs of wetlands in the Peace and Athabasca sectors of the delta complex. Specifically, we asked two questions. (1) How do food web properties (e.g. number of links, linkage density, trophic height) differ between the wetlands of the Peace and Athabasca deltas? (2) How do food web properties change temporally in wetlands of the two deltas? We discuss using DNA-generated, trait-based food webs as a powerful tool for rapid bioassessment, assess the limitations of our current approach, and outline a path forward to make this powerful tool more widely available for land managers and conservation biologists.",
keywords = "Benthic macroinvertebrates, Bioassessment, Body size, DNA metabarcoding, Ecological network, Food web, Freshwater, Text mining, Traits, Trophic links",
author = "Compson, {Zacchaeus G.} and Monk, {Wendy A.} and Curry, {Colin J.} and Dominique Gravel and Alex Bush and Baker, {Christopher J.O.} and {Al Manir}, {Mohammad Sadnan} and Alexandre Riazanov and Mehrdad Hajibabaei and Shadi Shokralla and Gibson, {Joel F.} and Sonja Stefani and Wright, {Michael T.G.} and Baird, {Donald J.}",
year = "2018",
doi = "10.1016/bs.aecr.2018.09.001",
language = "English",
isbn = "9780128143179",
series = "Advances in Ecological Research",
publisher = "Elsevier",
pages = "33--74",
editor = "Bohan, {David A.} and Dumbrell, {Alex J.} and Guy Woodward and Michelle Jackson",
booktitle = "Advances in Ecological Research",

}

RIS

TY - CHAP

T1 - Linking DNA Metabarcoding and Text Mining to Create Network-Based Biomonitoring Tools

T2 - A Case Study on Boreal Wetland Macroinvertebrate Communities

AU - Compson, Zacchaeus G.

AU - Monk, Wendy A.

AU - Curry, Colin J.

AU - Gravel, Dominique

AU - Bush, Alex

AU - Baker, Christopher J.O.

AU - Al Manir, Mohammad Sadnan

AU - Riazanov, Alexandre

AU - Hajibabaei, Mehrdad

AU - Shokralla, Shadi

AU - Gibson, Joel F.

AU - Stefani, Sonja

AU - Wright, Michael T.G.

AU - Baird, Donald J.

PY - 2018

Y1 - 2018

N2 - Ecological networks are powerful tools for visualizing biodiversity data and assessing ecosystem health and function. Constructing these networks requires considerable empirical efforts, and this remains highly challenging due to sampling limitations and the laborious and notoriously limited, error-prone process of traditional taxonomic identification. Recent advancements in high-throughput gene sequencing and high-performance computing provide new ways to address these challenges. DNA metabarcoding, a method of bulk taxonomic identification from DNA extracted from environmental samples, can generate detailed biodiversity information through a standardizable analytical pipeline for species detection. When this biodiversity information is annotated with prior knowledge on taxon interactions, body size, and trophic position, it is possible to generate trait-based networks, which we call “heuristic food webs”. Although curating trait matrices for constructing heuristic food webs is a laborious, often intractable process using manual literature surveys, it can be greatly accelerated via text mining, allowing knowledge of relevant traits to be gathered across large databases. To explore this possibility, we employed a General Architecture for Text Engineering (GATE) system to create a hybrid text-mining pipeline combining rule-based and machine-learning modules. This pipeline was then used to query online repositories of published papers for missing data on a key trait, body size, that could not be gathered from existing trophic link libraries of freshwater benthic macroinvertebrates. Combining text-mined body size information with feeding information from existing sources allowed us to generate a database of over 20,000 pairwise trophic interactions. Next, we developed a pipeline that uses taxa lists generated from DNA metabarcoding and annotates this matrix with trophic information from existing databases and text-mined body size data. 
In this way, we generated heuristic food webs for wetland sites within a large delta complex formed by the confluence of the Peace and Athabasca rivers in northern Alberta: the Peace–Athabasca delta. Finally, we used these putative food webs and their network properties to resolve spatial and temporal differences between the benthic subwebs of wetlands in the Peace and Athabasca sectors of the delta complex. Specifically, we asked two questions. (1) How do food web properties (e.g. number of links, linkage density, trophic height) differ between the wetlands of the Peace and Athabasca deltas? (2) How do food web properties change temporally in wetlands of the two deltas? We discuss using DNA-generated, trait-based food webs as a powerful tool for rapid bioassessment, assess the limitations of our current approach, and outline a path forward to make this powerful tool more widely available for land managers and conservation biologists.

KW - Benthic macroinvertebrates

KW - Bioassessment

KW - Body size

KW - DNA metabarcoding

KW - Ecological network

KW - Food web

KW - Freshwater

KW - Text mining

KW - Traits

KW - Trophic links

U2 - 10.1016/bs.aecr.2018.09.001

DO - 10.1016/bs.aecr.2018.09.001

M3 - Chapter

AN - SCOPUS:85054746447

SN - 9780128143179

T3 - Advances in Ecological Research

SP - 33

EP - 74

BT - Advances in Ecological Research

A2 - Bohan, David A.

A2 - Dumbrell, Alex J.

A2 - Woodward, Guy

A2 - Jackson, Michelle

PB - Elsevier

ER -