Electronic data

  • Biosocieties paper

    Rights statement: This is a post-peer-review, pre-copyedit version of an article published in BioSocieties. The definitive publisher-authenticated version, "Post-archival genomics and the bulk logistics of DNA sequences" by Adrian Mackenzie, Ruth McNally, Richard Mills and Stuart Sharples, is available online at: http://www.palgrave-journals.com/biosoc/journal/v11/n1/full/biosoc201522a.html

    Accepted author manuscript, 602 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

  • acceptance

    Submitted manuscript, 36 KB, PDF document

Post-archival genomics and the bulk logistics of DNA sequences

Research output: Contribution to journal, Journal article

Published
Journal publication date: 03/2016
Journal: BioSocieties
Issue number: 1
Volume: 11
Number of pages: 24
Pages (from-to): 82-105
Publication status: Published
Early online date: 29/06/15
Original language: English

Abstract

DNA sequence data are currently viewed as a ‘bedrock’ or ‘backbone’ of modern biological science. This article traces DNA sequence data produced by so-called ‘next generation sequencing’ (NGS) platforms as they move into a biological data infrastructure called the Sequence Read Archive (SRA). Since 2007, the SRA has been the leading repository for NGS-produced nucleotide (DNA and RNA) sequences. The way sequence data move into the SRA, we suggest, is symptomatic of a decisive shift towards post-archival genomics. This term refers to the increasing importance of the logistics rather than the biology of sequence data. In the SRA, logistical concerns with the bulk movements of sequence data somewhat supplant the emphasis in previous genomic and biological databases on contextualising particular sequences and cross-linking between different forms of biological data. At the same time, post-archival logistics do not necessarily flatten genomic research into global genomic homogeneity. Rather, the SRA provides evidence of an increasingly polymorphous flow of sequence data deriving from an expansion and diversification of sequencing techniques and instruments. The patterns of movement of data in and around the SRA suggest that sequence data are proliferating in various overlapping and sometimes disparate forms. By mapping differences in content across the SRA, by tracking patterns of absence or ‘missingness’ in metadata, and by following how changes in file formats highlight uncertainties in the definitions of seemingly obvious DNA-related artefacts such as a sequencer ‘run’, we highlight the growing lability of nucleotide sequence data. The movements of data in the SRA attest to a decisive mutation in sequences from biological bedrock to an increasingly expandable material whose epistemic and technological value remains open to reinvention.
