Analysis of sequence periodicity in E-coli proteins: empirical investigation of the "duplication and divergence" theory of protein evolution

Research output: Contribution to Journal/Magazine, Journal article, peer-review

Published

Standard

Analysis of sequence periodicity in E-coli proteins: empirical investigation of the "duplication and divergence" theory of protein evolution. / Gatherer, Derek; McEwan, Neil R.
In: Journal of Molecular Evolution, Vol. 57, No. 2, 08.2003, p. 149-158.

Vancouver

Gatherer D, McEwan NR. Analysis of sequence periodicity in E-coli proteins: empirical investigation of the "duplication and divergence" theory of protein evolution. Journal of Molecular Evolution. 2003 Aug;57(2):149-158. doi: 10.1007/s00239-002-2462-1

Bibtex

@article{d4ad4379d00749e5b42cb9780fe49d73,
title = "Analysis of sequence periodicity in E-coli proteins: empirical investigation of the {"}duplication and divergence{"} theory of protein evolution",
abstract = "Periodicity was quantified in 4289 Escherichia coli K12 confirmed and putative protein sequences, using a simple chi-square technique previously shown to reveal triplet period periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same compositional content, the E. coli proteome does not contain a significant excess of periodic proteins. However, 60 proteins do appear to be significantly periodic in at least one alphabetic representation, after Bonferroni correction, at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the presence of a proteome-wide quasi-periodicity as predicted by the {"}duplication and divergence{"} model of protein evolution and that the major periodicity detected is a consequence of the repetitive tendencies within alpha-helices. However, it is not possible to explain all sequence periodicities in terms of observable secondary structure, as in cases where sequence periodicity can be compared to solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.",
keywords = "periodicity, proteome, E. coli, gene duplication, STATISTICAL-ANALYSIS, CODING SEQUENCES",
author = "Gatherer, Derek and McEwan, {Neil R.}",
year = "2003",
month = aug,
doi = "10.1007/s00239-002-2462-1",
language = "English",
volume = "57",
pages = "149--158",
journal = "Journal of Molecular Evolution",
issn = "0022-2844",
publisher = "Springer New York",
number = "2",

}

RIS

TY - JOUR

T1 - Analysis of sequence periodicity in E-coli proteins

T2 - empirical investigation of the "duplication and divergence" theory of protein evolution

AU - Gatherer, Derek

AU - McEwan, Neil R.

PY - 2003/8

Y1 - 2003/8

N2 - Periodicity was quantified in 4289 Escherichia coli K12 confirmed and putative protein sequences, using a simple chi-square technique previously shown to reveal triplet period periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same compositional content, the E. coli proteome does not contain a significant excess of periodic proteins. However, 60 proteins do appear to be significantly periodic in at least one alphabetic representation, after Bonferroni correction, at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the presence of a proteome-wide quasi-periodicity as predicted by the "duplication and divergence" model of protein evolution and that the major periodicity detected is a consequence of the repetitive tendencies within alpha-helices. However, it is not possible to explain all sequence periodicities in terms of observable secondary structure, as in cases where sequence periodicity can be compared to solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.

AB - Periodicity was quantified in 4289 Escherichia coli K12 confirmed and putative protein sequences, using a simple chi-square technique previously shown to reveal triplet period periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same compositional content, the E. coli proteome does not contain a significant excess of periodic proteins. However, 60 proteins do appear to be significantly periodic in at least one alphabetic representation, after Bonferroni correction, at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the presence of a proteome-wide quasi-periodicity as predicted by the "duplication and divergence" model of protein evolution and that the major periodicity detected is a consequence of the repetitive tendencies within alpha-helices. However, it is not possible to explain all sequence periodicities in terms of observable secondary structure, as in cases where sequence periodicity can be compared to solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.

KW - periodicity

KW - proteome

KW - E. coli

KW - gene duplication

KW - STATISTICAL-ANALYSIS

KW - CODING SEQUENCES

U2 - 10.1007/s00239-002-2462-1

DO - 10.1007/s00239-002-2462-1

M3 - Journal article

VL - 57

SP - 149

EP - 158

JO - Journal of Molecular Evolution

JF - Journal of Molecular Evolution

SN - 0022-2844

IS - 2

ER -