Electronic data

  • 061697.full

    Submitted manuscript, 541 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V)

Research output: Contribution to Journal/Magazine, Journal article

Published

Standard

Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V). / Carter, Alice R.; Gatherer, Derek.
In: bioRxiv, Vol. 2016, 04.07.2016.

Bibtex

@article{814d70aa531a447cac6751919d42a54f,
title = "Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V)",
abstract = "The presence of data in the collection_date field of a GenBank sequence record is of great assistance in the use of that sequence for Bayesian phylogenetics using tip-dating. We present Tempus et Locus (TeL), a tool for extracting such sequences from a GenBank-formatted sequence database. TeL shows that 60% of viral sequences in GenBank have collection date fields, but that this varies considerably between species. Primate erythroparvovirus 1 (human parvovirus B19 or B19V) has only 40% of its sequences dated, of which only 112 are of more than 4 kb. 100 of these are from B19V sub-genotype 1a and were collected from a mere 6 studies conducted in 5 countries between 2002 and 2013. Nevertheless, Bayesian phylogenetic analysis of this limited set gives a date for the common ancestor of sub-genotype 1a in 1990 (95% HPD 1981-1996) which is in reasonable agreement with estimates of previous studies where collection dates have been assembled by more laborious methods of literature search and direct enquiries to sequence submitters. We conclude that although collection dates should become standard for all future GenBank submissions of virus sequences, accurate dating of ancestors is possible with even a small number of sequences if sampling information is high quality.",
keywords = "primate erythroparvovirus, parvovirus B19, Parvoviridae, phylogenetics, Tempus et Locus, TeL, virus, evolution, bioinformatics",
author = "Carter, Alice R. and Gatherer, Derek",
year = "2016",
month = jul,
day = "4",
doi = "10.1101/061697",
language = "English",
volume = "2016",
journal = "bioRxiv",
publisher = "Cold Spring Harbor Laboratory Press",
}

RIS

TY - JOUR

T1 - Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V)

AU - Carter, Alice R.

AU - Gatherer, Derek

PY - 2016/7/4

Y1 - 2016/7/4

N2 - The presence of data in the collection_date field of a GenBank sequence record is of great assistance in the use of that sequence for Bayesian phylogenetics using tip-dating. We present Tempus et Locus (TeL), a tool for extracting such sequences from a GenBank-formatted sequence database. TeL shows that 60% of viral sequences in GenBank have collection date fields, but that this varies considerably between species. Primate erythroparvovirus 1 (human parvovirus B19 or B19V) has only 40% of its sequences dated, of which only 112 are of more than 4 kb. 100 of these are from B19V sub-genotype 1a and were collected from a mere 6 studies conducted in 5 countries between 2002 and 2013. Nevertheless, Bayesian phylogenetic analysis of this limited set gives a date for the common ancestor of sub-genotype 1a in 1990 (95% HPD 1981-1996) which is in reasonable agreement with estimates of previous studies where collection dates have been assembled by more laborious methods of literature search and direct enquiries to sequence submitters. We conclude that although collection dates should become standard for all future GenBank submissions of virus sequences, accurate dating of ancestors is possible with even a small number of sequences if sampling information is high quality.

KW - primate erythroparvovirus

KW - parvovirus B19

KW - Parvoviridae

KW - phylogenetics

KW - Tempus et Locus

KW - TeL

KW - virus

KW - evolution

KW - bioinformatics

U2 - 10.1101/061697

DO - 10.1101/061697

M3 - Journal article

VL - 2016

JO - bioRxiv

JF - bioRxiv

ER -