Electronic data

  • Paper_Information_Sciences_Revised auto-cloud 2020

    Rights statement: This is the author’s version of a work that was accepted for publication in Information Sciences. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information Sciences, 518, 2020 DOI: 10.1016/j.ins.2019.12.022

    Accepted author manuscript, 3.58 MB, PDF document

    Available under license: CC BY-NC-ND

An evolving approach to data streams clustering based on typicality and eccentricity data analytics

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

An evolving approach to data streams clustering based on typicality and eccentricity data analytics. / Bezerra, C.G.; Costa, B.S.J.; Guedes, L.A. et al.
In: Information Sciences, Vol. 518, 31.05.2020, p. 13-28.

Vancouver

Bezerra CG, Costa BSJ, Guedes LA, Angelov PP. An evolving approach to data streams clustering based on typicality and eccentricity data analytics. Information Sciences. 2020 May 31;518:13-28. Epub 2020 Jan 2. doi: 10.1016/j.ins.2019.12.022

Author

Bezerra, C.G. ; Costa, B.S.J. ; Guedes, L.A. et al. / An evolving approach to data streams clustering based on typicality and eccentricity data analytics. In: Information Sciences. 2020 ; Vol. 518. pp. 13-28.

Bibtex

@article{4a2876f167264be985809c28971838b4,
title = "An evolving approach to data streams clustering based on typicality and eccentricity data analytics",
abstract = "In this paper we propose an algorithm for online clustering of data streams. This algorithm is called AutoCloud and is based on the recently introduced concept of Typicality and Eccentricity Data Analytics, mainly used for anomaly detection tasks. AutoCloud is an evolving, online and recursive technique that does not need training or prior knowledge about the data set. Thus, AutoCloud is fully online, requiring no offline processing. It allows creation and merging of clusters autonomously as new data observations become available. The clusters created by AutoCloud are called data clouds, which are structures without pre-defined shape or boundaries. AutoCloud allows each data sample to belong to multiple data clouds simultaneously using fuzzy concepts. AutoCloud is also able to handle concept drift and concept evolution, which are problems that are inherent in data streams in general. Since the algorithm is recursive and online, it is suitable for applications that require a real-time response. We validate our proposal with applications to multiple well known data sets in the literature.",
keywords = "Online clustering, Data stream, Eccentricity, Typicality, Anomaly detection",
author = "C.G. Bezerra and B.S.J. Costa and L.A. Guedes and P.P. Angelov",
note = "This is the author{\textquoteright}s version of a work that was accepted for publication in Information Sciences. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information Sciences, 518, 2020 DOI: 10.1016/j.ins.2019.12.022",
year = "2020",
month = may,
day = "31",
doi = "10.1016/j.ins.2019.12.022",
language = "English",
volume = "518",
pages = "13--28",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",

}

RIS

TY  - JOUR
T1  - An evolving approach to data streams clustering based on typicality and eccentricity data analytics
AU  - Bezerra, C.G.
AU  - Costa, B.S.J.
AU  - Guedes, L.A.
AU  - Angelov, P.P.
N1  - This is the author’s version of a work that was accepted for publication in Information Sciences. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information Sciences, 518, 2020 DOI: 10.1016/j.ins.2019.12.022
PY  - 2020/5/31
Y1  - 2020/5/31
N2  - In this paper we propose an algorithm for online clustering of data streams. This algorithm is called AutoCloud and is based on the recently introduced concept of Typicality and Eccentricity Data Analytics, mainly used for anomaly detection tasks. AutoCloud is an evolving, online and recursive technique that does not need training or prior knowledge about the data set. Thus, AutoCloud is fully online, requiring no offline processing. It allows creation and merging of clusters autonomously as new data observations become available. The clusters created by AutoCloud are called data clouds, which are structures without pre-defined shape or boundaries. AutoCloud allows each data sample to belong to multiple data clouds simultaneously using fuzzy concepts. AutoCloud is also able to handle concept drift and concept evolution, which are problems that are inherent in data streams in general. Since the algorithm is recursive and online, it is suitable for applications that require a real-time response. We validate our proposal with applications to multiple well known data sets in the literature.
AB  - In this paper we propose an algorithm for online clustering of data streams. This algorithm is called AutoCloud and is based on the recently introduced concept of Typicality and Eccentricity Data Analytics, mainly used for anomaly detection tasks. AutoCloud is an evolving, online and recursive technique that does not need training or prior knowledge about the data set. Thus, AutoCloud is fully online, requiring no offline processing. It allows creation and merging of clusters autonomously as new data observations become available. The clusters created by AutoCloud are called data clouds, which are structures without pre-defined shape or boundaries. AutoCloud allows each data sample to belong to multiple data clouds simultaneously using fuzzy concepts. AutoCloud is also able to handle concept drift and concept evolution, which are problems that are inherent in data streams in general. Since the algorithm is recursive and online, it is suitable for applications that require a real-time response. We validate our proposal with applications to multiple well known data sets in the literature.
KW  - Online clustering
KW  - Data stream
KW  - Eccentricity
KW  - Typicality
KW  - Anomaly detection
U2  - 10.1016/j.ins.2019.12.022
DO  - 10.1016/j.ins.2019.12.022
M3  - Journal article
VL  - 518
SP  - 13
EP  - 28
JO  - Information Sciences
JF  - Information Sciences
SN  - 0020-0255
ER  - 