Post-archival genomics and the bulk logistics of DNA sequences

Electronic data

Biosocieties paper
Rights statement: This is a post-peer-review, pre-copyedit version of an article published in BioSocieties. The definitive publisher-authenticated version, ‘Post-archival genomics and the bulk logistics of DNA sequences’ by Adrian Mackenzie, Ruth McNally, Richard Mills and Stuart Sharples, is available online at: http://www.palgrave-journals.com/biosoc/journal/v11/n1/full/biosoc201522a.html
Accepted author manuscript, 602 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
acceptance
Submitted manuscript, 36 KB, PDF document

Text available via DOI:

https://doi.org/10.1057/biosoc.2015.22
Final published version

Keywords

DNA sequencing, genomics, data infrastructures, scale, value, archive

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Journal publication date	03/2016
Journal	BioSocieties
Issue number	1
Volume	11
Number of pages	24
Pages (from-to)	82-105
Publication Status	Published
Early online date	29/06/15
Original language	English

Abstract

DNA sequence data are currently viewed as a ‘bedrock’ or ‘backbone’ of modern biological science. This article traces DNA sequence data produced by so-called ‘next generation sequencing’ (NGS) platforms as they move into a biological data infrastructure called the Sequence Read Archive (SRA). Since 2007, the SRA has been the leading repository for NGS-produced nucleotide (DNA and RNA) sequences. The way sequence data move into the SRA, we suggest, is symptomatic of a decisive shift towards post-archival genomics. This term refers to the increasing importance of the logistics rather than the biology of sequence data. In the SRA, logistical concerns with the bulk movements of sequence data somewhat supplant the emphasis in previous genomic and biological databases on contextualising particular sequences and cross-linking between different forms of biological data. At the same time, post-archival logistics do not necessarily flatten genomic research into global genomic homogeneity. Rather, the SRA provides evidence of an increasingly polymorphous flow of sequence data deriving from an expansion and diversification of sequencing techniques and instruments. The patterns of movement of data in and around the SRA suggest that sequence data are proliferating in various overlapping and sometimes disparate forms. By mapping differences in content across the SRA, by tracking patterns of absence or ‘missingness’ in metadata, and by following how changes in file formats highlight uncertainties in the definitions of seemingly obvious DNA-related artefacts such as a sequencer ‘run’, we highlight the growing lability of nucleotide sequence data. The movements of data in the SRA attest to a decisive mutation in sequences from biological bedrock to an increasingly expandable material whose epistemic and technological value remains open to reinvention.

