Parallel computing TEDA for high frequency streaming data clustering

Computing and Communications

Associated organisational units

Electronic data

BigData16
Rights statement: The final publication is available at Springer via http://dx.doi.org/[insert DOI]
Accepted author manuscript, 4.77 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Text available via DOI:

https://doi.org/10.1007/978-3-319-47898-2_25
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Parallel computing TEDA for high frequency streaming data clustering. / Gu, Xiaowei; Angelov, Plamen Parvanov; Gutierrez, German et al.
Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. ed. / Plamen Angelov; Yannis Manolopoulos; Lazaros Iliadis; Asim Roy; Marley Vellasco. Cham: Springer, 2016. p. 238-253.

Harvard

Gu, X, Angelov, PP, Gutierrez, G, Iglesias, JA & Sanchi, A 2016, Parallel computing TEDA for high frequency streaming data clustering. in P Angelov, Y Manolopoulos, L Iliadis, A Roy & M Vellasco (eds), Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. Springer, Cham, pp. 238-253, 2nd International Neural Network Society Conference on Big Data, INNS 2016, Thessaloniki, Greece, 23/10/16. https://doi.org/10.1007/978-3-319-47898-2_25

APA

Gu, X., Angelov, P. P., Gutierrez, G., Iglesias, J. A., & Sanchi, A. (2016). Parallel computing TEDA for high frequency streaming data clustering. In P. Angelov, Y. Manolopoulos, L. Iliadis, A. Roy, & M. Vellasco (Eds.), Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece (pp. 238-253). Springer. https://doi.org/10.1007/978-3-319-47898-2_25

Vancouver

Gu X, Angelov PP, Gutierrez G, Iglesias JA, Sanchi A. Parallel computing TEDA for high frequency streaming data clustering. In Angelov P, Manolopoulos Y, Iliadis L, Roy A, Vellasco M, editors, Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. Cham: Springer. 2016. p. 238-253. doi: 10.1007/978-3-319-47898-2_25

Author

Gu, Xiaowei ; Angelov, Plamen Parvanov ; Gutierrez, German et al. / Parallel computing TEDA for high frequency streaming data clustering. Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. editor / Plamen Angelov ; Yannis Manolopoulos ; Lazaros Iliadis ; Asim Roy ; Marley Vellasco. Cham : Springer, 2016. pp. 238-253

Bibtex

@inproceedings{435cc049019a4284964c3ecaaff36030,

title = "Parallel computing TEDA for high frequency streaming data clustering",

abstract = "In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.",

author = "Xiaowei Gu and Angelov, {Plamen Parvanov} and German Gutierrez and Iglesias, {Jose Antonio} and Araceli Sanchi",

year = "2016",

month = oct,

day = "23",

doi = "10.1007/978-3-319-47898-2_25",

language = "English",

isbn = "9783319478975",

pages = "238--253",

editor = "Plamen Angelov and Yannis Manolopoulos and Lazaros Iliadis and Asim Roy and Marley Vellasco",

booktitle = "Advances in Big Data",

publisher = "Springer",

note = "2nd International Neural Network Society Conference on Big Data, INNS 2016 ; Conference date: 23-10-2016 Through 25-10-2016",

}

RIS

TY - GEN

T1 - Parallel computing TEDA for high frequency streaming data clustering

AU - Gu, Xiaowei

AU - Angelov, Plamen Parvanov

AU - Gutierrez, German

AU - Iglesias, Jose Antonio

AU - Sanchi, Araceli

PY - 2016/10/23

Y1 - 2016/10/23

N2 - In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.

AB - In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.

U2 - 10.1007/978-3-319-47898-2_25

DO - 10.1007/978-3-319-47898-2_25

M3 - Conference contribution/Paper

SN - 9783319478975

SP - 238

EP - 253

BT - Advances in Big Data

A2 - Angelov, Plamen

A2 - Manolopoulos, Yannis

A2 - Iliadis, Lazaros

A2 - Roy, Asim

A2 - Vellasco, Marley

PB - Springer

CY - Cham

T2 - 2nd International Neural Network Society Conference on Big Data, INNS 2016

Y2 - 23 October 2016 through 25 October 2016

ER -
