Electronic data

  • CYBCONF2015_CODAS

    Rights statement: ©2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    Accepted author manuscript, 646 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

A new online clustering approach for data in arbitrary shaped clusters

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Published

Standard

A new online clustering approach for data in arbitrary shaped clusters. / Hyde, Richard; Angelov, Plamen.
Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on. IEEE, 2015. p. 228-233, Article 78.

Harvard

Hyde, R & Angelov, P 2015, A new online clustering approach for data in arbitrary shaped clusters. in Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on., 78, IEEE, pp. 228-233, CYBCONF, Gdynia, Poland, 24/06/15. https://doi.org/10.1109/CYBConf.2015.7175937

APA

Hyde, R., & Angelov, P. (2015). A new online clustering approach for data in arbitrary shaped clusters. In Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on (pp. 228-233). Article 78. IEEE. https://doi.org/10.1109/CYBConf.2015.7175937

Vancouver

Hyde R, Angelov P. A new online clustering approach for data in arbitrary shaped clusters. In Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on. IEEE. 2015. p. 228-233. 78. doi: 10.1109/CYBConf.2015.7175937

Author

Hyde, Richard; Angelov, Plamen. / A new online clustering approach for data in arbitrary shaped clusters. Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on. IEEE, 2015. pp. 228-233

Bibtex

@inproceedings{f764eab8eaf647caa31673562d0b129c,
title = "A new online clustering approach for data in arbitrary shaped clusters",
abstract = "In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process that uses a simple local density to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters to achieve a micro-cluster joining technique that is dimensionally independent for speed. The micro-clusters divide the data space into sub-spaces with a core region and a non-core region. Core regions which intersect define the clusters. A threshold value is used to identify outlier micro-clusters separately from small clusters of unusual data. The cluster information is fully maintained online. In this paper we compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results but in a fully online and dimensionally scalable manner.",
keywords = "clustering, CODAS, online, data streams, big data, arbitrary shape, micro-cluster",
author = "Richard Hyde and Plamen Angelov",
note = "{\textcopyright}2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.; CYBCONF ; Conference date: 24-06-2015 Through 26-06-2015",
year = "2015",
month = jun,
day = "24",
doi = "10.1109/CYBConf.2015.7175937",
language = "English",
isbn = "9781479983209",
pages = "228--233",
booktitle = "Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on",
publisher = "IEEE",

}

RIS

TY - GEN

T1 - A new online clustering approach for data in arbitrary shaped clusters

AU - Hyde, Richard

AU - Angelov, Plamen

N1 - ©2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PY - 2015/6/24

Y1 - 2015/6/24

AB - In this paper we demonstrate a new density-based clustering technique, CODAS, for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process that uses a simple local density to initiate micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters to achieve a micro-cluster joining technique that is dimensionally independent for speed. The micro-clusters divide the data space into sub-spaces with a core region and a non-core region. Core regions which intersect define the clusters. A threshold value is used to identify outlier micro-clusters separately from small clusters of unusual data. The cluster information is fully maintained online. In this paper we compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results but in a fully online and dimensionally scalable manner.

KW - clustering

KW - CODAS

KW - online

KW - data streams

KW - big data

KW - arbitrary shape

KW - micro-cluster

U2 - 10.1109/CYBConf.2015.7175937

DO - 10.1109/CYBConf.2015.7175937

M3 - Conference contribution/Paper

SN - 9781479983209

SP - 228

EP - 233

BT - Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on

PB - IEEE

T2 - CYBCONF

Y2 - 24 June 2015 through 26 June 2015

ER -
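
Illustrative sketch

The two-stage idea summarised in the abstract (micro-clusters formed online from local density, then joined where their core regions intersect) can be sketched roughly as follows. This is not the authors' implementation: the class name `MicroClusterSketch`, the parameters `radius`, `min_count` and `core_fraction`, the running-mean centre update and the exact intersection rule are all assumptions made for illustration. The sketch keeps only a centre and a count per micro-cluster, so no data points are stored, and joining relies on centre-to-centre distances only.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    centre: list      # current centre of the hyper-sphere
    count: int = 1    # number of points absorbed so far


class MicroClusterSketch:
    """Hypothetical, simplified micro-cluster model (not the published CODAS code)."""

    def __init__(self, radius, min_count=3, core_fraction=0.75):
        self.radius = radius                        # assumed fixed micro-cluster radius
        self.min_count = min_count                  # assumed outlier threshold on point count
        self.core_radius = core_fraction * radius   # assumed size of the core region
        self.micro = []

    def add(self, point):
        """Stage 1: absorb the point into the nearest micro-cluster or start a new one."""
        point = list(point)
        nearest, best = None, math.inf
        for mc in self.micro:
            d = math.dist(mc.centre, point)
            if d < best:
                nearest, best = mc, d
        if nearest is None or best > self.radius:
            self.micro.append(MicroCluster(point))
            return
        nearest.count += 1
        if best <= self.core_radius:
            # Only points in the core region move the centre (running mean).
            n = nearest.count
            nearest.centre = [c + (p - c) / n for c, p in zip(nearest.centre, point)]

    def clusters(self):
        """Stage 2: join dense micro-clusters whose core regions intersect."""
        dense = [mc for mc in self.micro if mc.count >= self.min_count]
        parent = list(range(len(dense)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i in range(len(dense)):
            for j in range(i + 1, len(dense)):
                # Two core spheres intersect when the centres are at most 2 * core_radius apart.
                if math.dist(dense[i].centre, dense[j].centre) <= 2 * self.core_radius:
                    parent[find(i)] = find(j)

        groups = {}
        for i, mc in enumerate(dense):
            groups.setdefault(find(i), []).append(mc)
        return list(groups.values())


# Made-up stream: two nearby dense spots, one distant dense spot, one stray point.
stream = [(0, 0)] * 3 + [(1.2, 0)] * 3 + [(10, 0)] * 3 + [(5, 5)]
model = MicroClusterSketch(radius=1.0)
for p in stream:
    model.add(p)
print(sorted(len(c) for c in model.clusters()))  # [1, 2]
```

In this toy stream the stray point at (5, 5) forms a micro-cluster with a single member, which falls below the assumed `min_count` and is therefore left out of the clusters rather than being reported as a small cluster of its own.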