Electronic data

  • 061697.full

    Submitted manuscript, 541 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Tempus et Locus: a tool for extracting precisely dated viral sequences from GenBank, and its application to the phylogenetics of primate erythroparvovirus 1 (B19V)

Research output: Contribution to journal > Journal article

Journal publication date: 4/07/2016
Journal: bioRxiv
Volume: 2016
Number of pages: 13
Publication status: Published
Original language: English

Abstract

The presence of data in the collection_date field of a GenBank sequence record greatly assists the use of that sequence in Bayesian phylogenetics with tip-dating. We present Tempus et Locus (TeL), a tool for extracting such sequences from a GenBank-formatted sequence database. TeL shows that 60% of viral sequences in GenBank have collection date fields, but that this proportion varies considerably between species. Primate erythroparvovirus 1 (human parvovirus B19 or B19V) has only 40% of its sequences dated, of which only 112 are longer than 4 kb. Of these, 100 are from B19V sub-genotype 1a and were collected in a mere 6 studies conducted in 5 countries between 2002 and 2013. Nevertheless, Bayesian phylogenetic analysis of this limited set dates the common ancestor of sub-genotype 1a to 1990 (95% HPD 1981-1996), which is in reasonable agreement with the estimates of previous studies in which collection dates were assembled by more laborious methods of literature search and direct enquiries to sequence submitters. We conclude that although collection dates should become standard for all future GenBank submissions of virus sequences, accurate dating of ancestors is possible with even a small number of sequences if the sampling information is of high quality.
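
To illustrate the kind of filtering the abstract describes, the following is a minimal Python sketch that keeps only those records of a GenBank flat file that carry a /collection_date qualifier. It is not TeL's implementation, which this record does not describe. The function names, the sample records, and the accessions XX000001 and XX000002 are hypothetical, and the sketch assumes each record ends with a "//" line and that the qualifier value sits on a single line.

```python
import re

# Matches the GenBank source-feature qualifier /collection_date="..."
COLLECTION_DATE = re.compile(r'/collection_date="([^"]+)"')
# Matches the first accession on the ACCESSION line of a record
ACCESSION = re.compile(r'^ACCESSION\s+(\S+)', re.MULTILINE)


def split_records(flatfile_text):
    """Split GenBank flat-file text into records; each record ends with a '//' line."""
    records, current = [], []
    for line in flatfile_text.splitlines():
        if line.strip() == "//":
            if any(l.strip() for l in current):
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep a trailing record even if its '//' terminator is missing
    if any(l.strip() for l in current):
        records.append("\n".join(current))
    return records


def dated_records(flatfile_text):
    """Return (accession, collection_date) pairs for records that have a collection date."""
    dated = []
    for record in split_records(flatfile_text):
        date = COLLECTION_DATE.search(record)
        if date is None:
            continue  # undated record: not usable for tip-dating as it stands
        accession = ACCESSION.search(record)
        dated.append((accession.group(1) if accession else None, date.group(1)))
    return dated


# Hypothetical two-record sample: only the first record is dated.
SAMPLE = """\
LOCUS       XX000001                  20 bp    DNA     linear   VRL 01-JAN-2016
ACCESSION   XX000001
FEATURES             Location/Qualifiers
     source          1..20
                     /organism="Primate erythroparvovirus 1"
                     /collection_date="12-Mar-2009"
ORIGIN
        1 acgtacgtac gtacgtacgt
//
LOCUS       XX000002                  20 bp    DNA     linear   VRL 01-JAN-2016
ACCESSION   XX000002
FEATURES             Location/Qualifiers
     source          1..20
                     /organism="Primate erythroparvovirus 1"
ORIGIN
        1 acgtacgtac gtacgtacgt
//
"""

if __name__ == "__main__":
    total = len(split_records(SAMPLE))
    dated = dated_records(SAMPLE)
    print(f"{len(dated)} of {total} records have a collection date")
    for accession, date in dated:
        print(accession, date)
```

The ratio of dated to total records is the kind of per-species proportion the abstract reports. A production tool would also need to handle the varied date formats and precisions found in the collection_date field, which this sketch leaves untouched.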