Electronic data

  • Biosocieties paper

    Rights statement: This is a post-peer-review, pre-copyedit version of an article published in BioSocieties. The definitive publisher-authenticated version Post-archival genomics and the bulk logistics of DNA sequences Adrian Mackenzie, Ruth McNally, Richard Mills and Stuart Sharples is available online at: http://www.palgrave-journals.com/biosoc/journal/v11/n1/full/biosoc201522a.html

    Accepted author manuscript, 602 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

  • acceptance

    Submitted manuscript, 36 KB, PDF document

Links

Text available via DOI: 10.1057/biosoc.2015.22

Post-archival genomics and the bulk logistics of DNA sequences

Research output: Contribution to journal > Journal article

Published

Standard

Post-archival genomics and the bulk logistics of DNA sequences. / Mackenzie, Adrian; McNally, Ruth; Mills, Richard; Sharples, Stuart.

In: BioSocieties, Vol. 11, No. 1, 03.2016, p. 82-105.

Bibtex

@article{ff6ca8d0712f4b92a66324c541866ee2,
title = "Post-archival genomics and the bulk logistics of DNA sequences",
abstract = "DNA sequence data are currently viewed as a {\textquoteleft}bedrock{\textquoteright} or {\textquoteleft}backbone{\textquoteright} of modern biological science. This article traces DNA sequence data produced by so-called {\textquoteleft}next generation sequencing{\textquoteright} (NGS) platforms as it moves into a biological data infrastructure called the Sequence Read Archive (SRA). Since 2007, the SRA has been the leading repository for NGS-produced nucleotide (DNA and RNA) sequences. The way sequence data move into the SRA, we suggest, is symptomatic of a decisive shift towards post-archival genomics. This term refers to the increasing importance of the logistics rather than the biology of sequence data. In the SRA, logistical concerns with the bulk movements of sequence data somewhat supplant the emphasis in previous genomic and biological databases on contextualising particular sequences and cross-linking between different forms of biological data. At the same time, post-archival logistics do not necessarily flatten genomic research into global genomic homogeneity. Rather, the SRA provides evidence of an increasingly polymorphous flow of sequence data deriving from an expansion and diversification of sequencing techniques and instruments. The patterns of movement of data in and around the SRA suggest that sequence data are proliferating in various overlapping and sometimes disparate forms. By mapping differences in content across the SRA, by tracking patterns of absence or {\textquoteleft}missingness{\textquoteright} in metadata, and by following how changes in file formats highlight uncertainties in the definitions of seemingly obvious DNA-related artefacts such as a sequencer {\textquoteleft}run{\textquoteright}, we highlight the growing lability of nucleotide sequence data. 
The movements of data in the SRA attest to a decisive mutation in sequences from biological bedrock to an increasingly expandable material whose epistemic and technological value remains open to reinvention.",
keywords = "DNA sequencing, genomics, data infrastructures, scale, value, archive",
author = "Adrian Mackenzie and Ruth McNally and Richard Mills and Stuart Sharples",
note = "This is a post-peer-review, pre-copyedit version of an article published in BioSocieties. The definitive publisher-authenticated version Post-archival genomics and the bulk logistics of DNA sequences Adrian Mackenzie, Ruth McNally, Richard Mills and Stuart Sharples is available online at: http://www.palgrave-journals.com/biosoc/journal/v11/n1/full/biosoc201522a.html",
year = "2016",
month = mar,
doi = "10.1057/biosoc.2015.22",
language = "English",
volume = "11",
pages = "82--105",
journal = "BioSocieties",
issn = "1745-8552",
publisher = "Palgrave Macmillan Ltd.",
number = "1",

}

RIS

TY - JOUR

T1 - Post-archival genomics and the bulk logistics of DNA sequences

AU - Mackenzie, Adrian

AU - McNally, Ruth

AU - Mills, Richard

AU - Sharples, Stuart

N1 - This is a post-peer-review, pre-copyedit version of an article published in BioSocieties. The definitive publisher-authenticated version Post-archival genomics and the bulk logistics of DNA sequences Adrian Mackenzie, Ruth McNally, Richard Mills and Stuart Sharples is available online at: http://www.palgrave-journals.com/biosoc/journal/v11/n1/full/biosoc201522a.html

PY - 2016/3

Y1 - 2016/3

AB - DNA sequence data are currently viewed as a ‘bedrock’ or ‘backbone’ of modern biological science. This article traces DNA sequence data produced by so-called ‘next generation sequencing’ (NGS) platforms as it moves into a biological data infrastructure called the Sequence Read Archive (SRA). Since 2007, the SRA has been the leading repository for NGS-produced nucleotide (DNA and RNA) sequences. The way sequence data move into the SRA, we suggest, is symptomatic of a decisive shift towards post-archival genomics. This term refers to the increasing importance of the logistics rather than the biology of sequence data. In the SRA, logistical concerns with the bulk movements of sequence data somewhat supplant the emphasis in previous genomic and biological databases on contextualising particular sequences and cross-linking between different forms of biological data. At the same time, post-archival logistics do not necessarily flatten genomic research into global genomic homogeneity. Rather, the SRA provides evidence of an increasingly polymorphous flow of sequence data deriving from an expansion and diversification of sequencing techniques and instruments. The patterns of movement of data in and around the SRA suggest that sequence data are proliferating in various overlapping and sometimes disparate forms. By mapping differences in content across the SRA, by tracking patterns of absence or ‘missingness’ in metadata, and by following how changes in file formats highlight uncertainties in the definitions of seemingly obvious DNA-related artefacts such as a sequencer ‘run’, we highlight the growing lability of nucleotide sequence data. The movements of data in the SRA attest to a decisive mutation in sequences from biological bedrock to an increasingly expandable material whose epistemic and technological value remains open to reinvention.

KW - DNA sequencing

KW - genomics

KW - data infrastructures

KW - scale

KW - value

KW - archive

U2 - 10.1057/biosoc.2015.22

DO - 10.1057/biosoc.2015.22

M3 - Journal article

VL - 11

SP - 82

EP - 105

JO - BioSocieties

JF - BioSocieties

SN - 1745-8552

IS - 1

ER -