
Analysis of sequence periodicity in E-coli proteins: empirical investigation of the "duplication and divergence" theory of protein evolution

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Journal publication date: 08/2003
Journal: Journal of Molecular Evolution
Issue number: 2
Number of pages: 10
Pages (from-to): 149-158
Publication Status: Published
Original language: English


Periodicity was quantified in 4289 Escherichia coli K12 confirmed and putative protein sequences, using a simple chi-square technique previously shown to reveal triplet periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same compositional content, the E. coli proteome does not contain a significant excess of periodic proteins. However, 60 proteins do appear to be significantly periodic in at least one alphabetic representation, after Bonferroni correction, at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the presence of a proteome-wide quasi-periodicity as predicted by the "duplication and divergence" model of protein evolution, and that the major periodicity detected is a consequence of the repetitive tendencies within alpha-helices. However, it is not possible to explain all sequence periodicities in terms of observable secondary structure: in cases where sequence periodicity can be compared to solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.
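
The abstract does not spell out the chi-square procedure, so the following is only a minimal sketch of one common way such a test could be set up, not the authors' implementation. It assumes the statistic is a standard contingency-table chi-square between sequence position modulo the period and symbol identity. The function names `periodicity_chi2` and `bonferroni_threshold`, and the test count of 49 periods times 9 alphabets, are illustrative assumptions.

```python
from collections import Counter


def periodicity_chi2(seq, period):
    """Chi-square statistic for association between symbol identity and
    position modulo `period` (assumed form of the test, see lead-in)."""
    if period < 2 or len(seq) < period:
        raise ValueError("period must be >= 2 and no longer than the sequence")
    total = len(seq)
    symbol_counts = Counter(seq)
    phase_counts = Counter(i % period for i in range(total))
    observed = Counter((i % period, s) for i, s in enumerate(seq))

    chi2 = 0.0
    for phase, n_phase in phase_counts.items():
        for symbol, n_symbol in symbol_counts.items():
            # Expected count if symbols were independent of phase.
            expected = n_phase * n_symbol / total
            chi2 += (observed[(phase, symbol)] - expected) ** 2 / expected
    df = (period - 1) * (len(symbol_counts) - 1)
    return chi2, df


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# A perfectly alternating toy sequence is maximally periodic at period 2.
chi2, df = periodicity_chi2("AB" * 10, 2)  # chi2 == 20.0, df == 1

# Hypothetical test count: periods 2..50 (49 values) in 9 alphabets.
per_test_alpha = bonferroni_threshold(0.01, 49 * 9)
```

Converting the statistic to a p-value would require a chi-square survival function (for example from a statistics library), and short sequences with small expected counts would need more careful treatment than this sketch provides.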