Spectral density ratio based clustering methods for the binary segmentation of protein sequences: A comparative study

Research output: Contribution to journal (Journal article)

Journal publication date: 05/2010
Issue number: 2
Number of pages: 12
Pages (from-to): 132-143
Publication status: Published
Early online date: 4/03/10
Original language: English


We compare several spectral domain based clustering methods for partitioning protein sequence data. The main instrument for this exercise is the spectral density ratio model, which specifies that the logarithmic ratio of two or more unknown spectral density functions is a parametric linear combination of cosines. Maximum likelihood inference is worked out in detail, and it is shown that its output yields several distance measures among independent stationary time series. These similarity indices are suitable for clustering time series data based on their second order properties. Other spectral domain based distances are investigated as well, and we compare all methods and distances on the problem of producing segmentations of bacterial outer membrane proteins consistent with their transmembrane topology. Protein sequences are transformed to time series data by employing numerical scales of physicochemical parameters. We also present results on the prediction of transmembrane β-strands, based on the clustering outcome, for a representative set of bacterial outer membrane proteins with given three-dimensional structure.
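The pipeline described in the abstract (sequence to time series, then a log spectral ratio expressed through cosines, then a distance) can be illustrated with a rough sketch. Everything specific here is an assumption rather than the paper's method: the Kyte-Doolittle hydropathy scale stands in for the unspecified physicochemical scales, an ordinary least-squares fit on log periodogram ratios replaces the maximum likelihood inference, the coefficient-norm distance is a hypothetical choice, and the two segments are toy strings, not real proteins.

```python
import numpy as np

# Kyte-Doolittle hydropathy values. Assumption: the abstract does not name
# the physicochemical scales that were actually used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_series(seq):
    """Map an amino-acid string to a numeric time series."""
    return np.array([KD[aa] for aa in seq], dtype=float)


def periodogram(x):
    """Periodogram at the nonzero Fourier frequencies in (0, pi]."""
    x = x - x.mean()
    n = len(x)
    dft = np.fft.rfft(x)
    freqs = 2.0 * np.pi * np.arange(len(dft)) / n
    power = np.abs(dft) ** 2 / (2.0 * np.pi * n)
    return freqs[1:], power[1:]  # drop the zero frequency


def log_ratio_cosine_fit(x, y, order=2, eps=1e-12):
    """Least-squares fit of log(I_x / I_y) on 1, cos(w), ..., cos(order * w).

    Assumption: least squares is a stand-in for the maximum likelihood
    inference mentioned in the abstract.
    """
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    freqs, px = periodogram(x)
    _, py = periodogram(y)
    target = np.log(px + eps) - np.log(py + eps)
    # Column k is cos(k * w); k = 0 gives the intercept column of ones.
    design = np.column_stack([np.cos(k * freqs) for k in range(order + 1)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def distance(x, y, order=2):
    """Hypothetical distance: Euclidean norm of the fitted coefficients."""
    return float(np.linalg.norm(log_ratio_cosine_fit(x, y, order)))


# Toy segments of equal length (illustrative only).
seg_a = "AVLIVGALAVLI"
seg_b = "SKDNTGRQESKD"
a, b = to_series(seg_a), to_series(seg_b)
d_ab = distance(a, b)
d_aa = distance(a, a)
```

Under this sketch, two segments with identical second-order structure give a fitted log ratio of zero and hence zero distance, and a matrix of such pairwise distances between windows of a sequence could be passed to any standard clustering routine to produce a binary segmentation.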