Electronic data

  • BigData16

    Rights statement: The final publication is available at Springer via http://dx.doi.org/[insert DOI]

    Accepted author manuscript, 4.77 MB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Parallel computing TEDA for high frequency streaming data clustering

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper

Published

Standard

Parallel computing TEDA for high frequency streaming data clustering. / Gu, Xiaowei; Angelov, Plamen Parvanov; Gutierrez, German; Iglesias, Jose Antonio; Sanchi, Araceli.

Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. ed. / Plamen Angelov; Yannis Manolopoulos; Lazaros Iliadis; Asim Roy; Marley Vellasco. Cham : Springer, 2016. p. 238-253.

Harvard

Gu, X, Angelov, PP, Gutierrez, G, Iglesias, JA & Sanchi, A 2016, Parallel computing TEDA for high frequency streaming data clustering. in P Angelov, Y Manolopoulos, L Iliadis, A Roy & M Vellasco (eds), Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. Springer, Cham, pp. 238-253, 2nd International Neural Network Society Conference on Big Data, INNS 2016, Thessaloniki, Greece, 23/10/16. https://doi.org/10.1007/978-3-319-47898-2_25

APA

Gu, X., Angelov, P. P., Gutierrez, G., Iglesias, J. A., & Sanchi, A. (2016). Parallel computing TEDA for high frequency streaming data clustering. In P. Angelov, Y. Manolopoulos, L. Iliadis, A. Roy, & M. Vellasco (Eds.), Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece (pp. 238-253). Springer. https://doi.org/10.1007/978-3-319-47898-2_25

Vancouver

Gu X, Angelov PP, Gutierrez G, Iglesias JA, Sanchi A. Parallel computing TEDA for high frequency streaming data clustering. In Angelov P, Manolopoulos Y, Iliadis L, Roy A, Vellasco M, editors, Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. Cham: Springer. 2016. p. 238-253 https://doi.org/10.1007/978-3-319-47898-2_25

Author

Gu, Xiaowei ; Angelov, Plamen Parvanov ; Gutierrez, German ; Iglesias, Jose Antonio ; Sanchi, Araceli. / Parallel computing TEDA for high frequency streaming data clustering. Advances in Big Data: Proceedings of the 2nd INNS Conference on Big Data, October 23-25, 2016, Thessaloniki, Greece. editor / Plamen Angelov ; Yannis Manolopoulos ; Lazaros Iliadis ; Asim Roy ; Marley Vellasco. Cham : Springer, 2016. pp. 238-253

Bibtex

@inproceedings{435cc049019a4284964c3ecaaff36030,
title = "Parallel computing TEDA for high frequency streaming data clustering",
abstract = "In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.",
author = "Xiaowei Gu and Angelov, {Plamen Parvanov} and German Gutierrez and Iglesias, {Jose Antonio} and Araceli Sanchi",
year = "2016",
month = oct,
day = "23",
doi = "10.1007/978-3-319-47898-2_25",
language = "English",
isbn = "9783319478975",
pages = "238--253",
editor = "Plamen Angelov and Yannis Manolopoulos and Lazaros Iliadis and Asim Roy and Marley Vellasco",
booktitle = "Advances in Big Data",
publisher = "Springer",
note = "2nd International Neural Network Society Conference on Big Data, INNS 2016 ; Conference date: 23-10-2016 Through 25-10-2016",

}

RIS

TY - GEN

T1 - Parallel computing TEDA for high frequency streaming data clustering

AU - Gu, Xiaowei

AU - Angelov, Plamen Parvanov

AU - Gutierrez, German

AU - Iglesias, Jose Antonio

AU - Sanchi, Araceli

PY - 2016/10/23

Y1 - 2016/10/23

N2 - In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.

AB - In this paper, a novel online clustering approach called Parallel_TEDA is introduced for processing high frequency streaming data. This newly proposed approach is developed within the recently introduced TEDA theory and inherits all advantages from it. In the proposed approach, a number of data stream processors are involved, which collaborate with each other efficiently to achieve parallel computation as well as a much higher processing speed. A fusion center is involved to gather the key information from the processors which work on chunks of the whole data stream and generate the overall output. The quality of the generated clusters is being monitored within the data processors all the time and stale clusters are being removed to ensure the correctness and timeliness of the overall clustering results. This, in turn, gives the proposed approach a stronger ability of handling shifts/drifts that may take place in live data streams. The numerical experiments performed with the proposed new approach Parallel_TEDA on benchmark datasets present higher performance and faster processing speed when compared with the alternative well-known approaches. The processing speed has been demonstrated to fall exponentially with more data processors involved. This new online clustering approach is very suitable and promising for real-time high frequency streaming processing and data analytics.

U2 - 10.1007/978-3-319-47898-2_25

DO - 10.1007/978-3-319-47898-2_25

M3 - Conference contribution/Paper

SN - 9783319478975

SP - 238

EP - 253

BT - Advances in Big Data

A2 - Angelov, Plamen

A2 - Manolopoulos, Yannis

A2 - Iliadis, Lazaros

A2 - Roy, Asim

A2 - Vellasco, Marley

PB - Springer

CY - Cham

T2 - 2nd International Neural Network Society Conference on Big Data, INNS 2016

Y2 - 23 October 2016 through 25 October 2016

ER -