
Electronic data

  • Mead et al accepted version

    Rights statement: This is the author’s version of a work that was accepted for publication in Journal of Molecular Graphics and Modelling. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Molecular Graphics and Modelling, 92, 2019 DOI: 10.1016/j.jmgm.2019.07.014

    Accepted author manuscript, 1.62 MB, PDF document

    Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling

Research output: Contribution to Journal/Magazine, Journal article, peer-review

Published

Standard

Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling. / Mead, Dylan J T; Lunagomez, Simón; Gatherer, Derek.
In: Journal of Molecular Graphics and Modelling, Vol. 92, 01.11.2019, p. 180-191.

Vancouver

Mead DJT, Lunagomez S, Gatherer D. Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling. Journal of Molecular Graphics and Modelling. 2019 Nov 1;92:180-191. Epub 2019 Jul 26. doi: 10.1016/j.jmgm.2019.07.014

Bibtex

@article{b41b463f36954c4fbfe2e118195af9cb,
title = "Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling",
abstract = "The protein sequence-structure gap results from the contrast between rapid, low-cost deep sequencing, and slow, expensive experimental structure determination techniques. Comparative homology modelling may have the potential to close this gap by predicting protein structure in target sequences using existing experimentally solved structures as templates. This paper presents the first use of force-directed graphs for the visualization of sequence space in two dimensions, and applies them to the choice of suitable RNA-dependent RNA polymerase (RdRP) target-template pairs within human-infective RNA virus genera. Measures of centrality in protein sequence space for each genus were also derived and used to identify centroid nearest-neighbour sequences (CNNs) potentially useful for production of homology models most representative of their genera. Homology modelling was then carried out for target-template pairs in different species, different genera and different families, and model quality assessed using several metrics. Reconstructed ancestral RdRP sequences for individual genera were also used as templates for the production of ancestral RdRP homology models. High quality ancestral RdRP models were consistently produced, as were good quality models for target-template pairs in the same genus. Homology modelling between genera in the same family produced mixed results and inter-family modelling was unreliable. We present a protocol for the production of optimal RdRP homology models for use in further experiments, e.g. docking to discover novel anti-viral compounds.",
keywords = "force-directed graphs, Fruchterman-Reingold, RNA-dependent RNA polymerase, virus, RNA polymerase, structural biology, structural bioinformatics, virology, bioinformatics",
author = "Mead, {Dylan J T} and Sim{\'o}n Lunagomez and Derek Gatherer",
note = "This is the author{\textquoteright}s version of a work that was accepted for publication in Journal of Molecular Graphics and Modelling. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Molecular Graphics and Modelling, 92, 2019 DOI: 10.1016/j.jmgm.2019.07.014",
year = "2019",
month = nov,
day = "1",
doi = "10.1016/j.jmgm.2019.07.014",
language = "English",
volume = "92",
pages = "180--191",
journal = "Journal of Molecular Graphics and Modelling",
issn = "1093-3263",
publisher = "Elsevier Inc.",
}

RIS

TY - JOUR

T1 - Visualization of protein sequence space with force-directed graphs, and their application to the choice of target-template pairs for homology modelling

AU - Mead, Dylan J T

AU - Lunagomez, Simón

AU - Gatherer, Derek

N1 - This is the author’s version of a work that was accepted for publication in Journal of Molecular Graphics and Modelling. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Journal of Molecular Graphics and Modelling, 92, 2019 DOI: 10.1016/j.jmgm.2019.07.014

PY - 2019/11/1

Y1 - 2019/11/1

N2 - The protein sequence-structure gap results from the contrast between rapid, low-cost deep sequencing, and slow, expensive experimental structure determination techniques. Comparative homology modelling may have the potential to close this gap by predicting protein structure in target sequences using existing experimentally solved structures as templates. This paper presents the first use of force-directed graphs for the visualization of sequence space in two dimensions, and applies them to the choice of suitable RNA-dependent RNA polymerase (RdRP) target-template pairs within human-infective RNA virus genera. Measures of centrality in protein sequence space for each genus were also derived and used to identify centroid nearest-neighbour sequences (CNNs) potentially useful for production of homology models most representative of their genera. Homology modelling was then carried out for target-template pairs in different species, different genera and different families, and model quality assessed using several metrics. Reconstructed ancestral RdRP sequences for individual genera were also used as templates for the production of ancestral RdRP homology models. High quality ancestral RdRP models were consistently produced, as were good quality models for target-template pairs in the same genus. Homology modelling between genera in the same family produced mixed results and inter-family modelling was unreliable. We present a protocol for the production of optimal RdRP homology models for use in further experiments, e.g. docking to discover novel anti-viral compounds.

AB - The protein sequence-structure gap results from the contrast between rapid, low-cost deep sequencing, and slow, expensive experimental structure determination techniques. Comparative homology modelling may have the potential to close this gap by predicting protein structure in target sequences using existing experimentally solved structures as templates. This paper presents the first use of force-directed graphs for the visualization of sequence space in two dimensions, and applies them to the choice of suitable RNA-dependent RNA polymerase (RdRP) target-template pairs within human-infective RNA virus genera. Measures of centrality in protein sequence space for each genus were also derived and used to identify centroid nearest-neighbour sequences (CNNs) potentially useful for production of homology models most representative of their genera. Homology modelling was then carried out for target-template pairs in different species, different genera and different families, and model quality assessed using several metrics. Reconstructed ancestral RdRP sequences for individual genera were also used as templates for the production of ancestral RdRP homology models. High quality ancestral RdRP models were consistently produced, as were good quality models for target-template pairs in the same genus. Homology modelling between genera in the same family produced mixed results and inter-family modelling was unreliable. We present a protocol for the production of optimal RdRP homology models for use in further experiments, e.g. docking to discover novel anti-viral compounds.

KW - force-directed graphs

KW - Fruchterman-Reingold

KW - RNA-dependent RNA polymerase

KW - virus

KW - RNA polymerase

KW - structural biology

KW - structural bioinformatics

KW - virology

KW - bioinformatics

U2 - 10.1016/j.jmgm.2019.07.014

DO - 10.1016/j.jmgm.2019.07.014

M3 - Journal article

C2 - 31377535

VL - 92

SP - 180

EP - 191

JO - Journal of Molecular Graphics and Modelling

JF - Journal of Molecular Graphics and Modelling

SN - 1093-3263

ER -