Electronic data

  • Divisive_Clustering_of_High_Dimensional_Data_Streams

    Rights statement: The final publication is available at Springer via http://dx.doi.org/10.1007/s11222-015-9597-y

    Accepted author manuscript, 643 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Divisive clustering of high dimensional data streams

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published
Journal publication date: 09/2016
Journal: Statistics and Computing
Issue number: 5
Volume: 26
Number of pages: 20
Pages (from-to): 1101–1120
Publication status: Published
Early online date: 31/07/15
Original language: English

Abstract

Clustering streaming data is gaining importance as automatic data acquisition technologies are deployed in diverse applications. We propose a fully incremental projected divisive clustering method for high-dimensional data streams that is motivated by high density clustering. The method is capable of identifying clusters in arbitrary subspaces, estimating the number of clusters, and detecting changes in the data distribution which necessitate a revision of the model. The empirical evaluation of the proposed method on numerous real and simulated datasets shows that it is scalable in dimension and number of clusters, is robust to noisy and irrelevant features, and is capable of handling a variety of types of non-stationarity.

Bibliographic note

Publication is available at: http://link.springer.com/article/10.1007%2Fs11222-015-9597-y