Predictive topology refinements in distributed stream processing system

Computing and Communications

Text available via DOI:

https://doi.org/10.1371/journal.pone.0240424
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

article, feedback system, human, prediction, workload

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Predictive topology refinements in distributed stream processing system. / Hanif, M.; Lee, C.; Helal, S.
In: PLoS ONE, Vol. 15, No. 11, e0240424, 05.11.2020.

Harvard

Hanif, M, Lee, C & Helal, S 2020, 'Predictive topology refinements in distributed stream processing system', PLoS ONE, vol. 15, no. 11, e0240424. https://doi.org/10.1371/journal.pone.0240424

APA

Hanif, M., Lee, C., & Helal, S. (2020). Predictive topology refinements in distributed stream processing system. PLoS ONE, 15(11), Article e0240424. https://doi.org/10.1371/journal.pone.0240424

Vancouver

Hanif M, Lee C, Helal S. Predictive topology refinements in distributed stream processing system. PLoS ONE. 2020 Nov 5;15(11):e0240424. doi: 10.1371/journal.pone.0240424

Author

Hanif, M. ; Lee, C. ; Helal, S. / Predictive topology refinements in distributed stream processing system. In: PLoS ONE. 2020 ; Vol. 15, No. 11.

Bibtex

@article{a00623f4e5e1432c95677261ee752646,

title = "Predictive topology refinements in distributed stream processing system",

abstract = "Cloud computing has evolved the big data technologies to a consolidated paradigm with SPaaS (Streaming processing-as-a-service). With a number of enterprises offering cloud-based solutions to end-users and other small enterprises, there has been a boom in the volume of data, creating interest of both industry and academia in big data analytics, streaming applications, and social networking applications. With the companies shifting to cloud-based solutions as a service paradigm, the competition grows in the market. Good quality of service (QoS) is a must for the enterprises, as they strive to survive in a competitive environment. However, achieving reasonable QoS goals to meet SLA agreement cost-effectively is challenging due to variation in workload over time. This problem can be solved if the system has the ability to predict the workload for the near future. In this paper, we present a novel topology-refining scheme based on a workload prediction mechanism. Predictions are made through a model based on a combination of SVR, autoregressive, and moving average model with a feedback mechanism. Our streaming system is designed to increase the overall performance by making the topology refining robust to the incoming workload on the fly, while still being able to achieve QoS goals of SLA constraints. Apache Flink distributed processing engine is used as a testbed in the paper. The result shows that the prediction scheme works well for both workloads, i.e., synthetic as well as real traces of data.",

keywords = "article, feedback system, human, prediction, workload",

author = "M. Hanif and C. Lee and S. Helal",

year = "2020",

month = nov,

day = "5",

doi = "10.1371/journal.pone.0240424",

language = "English",

volume = "15",

journal = "PLoS ONE",

issn = "1932-6203",

publisher = "Public Library of Science",

number = "11",

}

RIS

TY - JOUR

T1 - Predictive topology refinements in distributed stream processing system

AU - Hanif, M.

AU - Lee, C.

AU - Helal, S.

PY - 2020/11/5

Y1 - 2020/11/5

AB - Cloud computing has evolved the big data technologies to a consolidated paradigm with SPaaS (Streaming processing-as-a-service). With a number of enterprises offering cloud-based solutions to end-users and other small enterprises, there has been a boom in the volume of data, creating interest of both industry and academia in big data analytics, streaming applications, and social networking applications. With the companies shifting to cloud-based solutions as a service paradigm, the competition grows in the market. Good quality of service (QoS) is a must for the enterprises, as they strive to survive in a competitive environment. However, achieving reasonable QoS goals to meet SLA agreement cost-effectively is challenging due to variation in workload over time. This problem can be solved if the system has the ability to predict the workload for the near future. In this paper, we present a novel topology-refining scheme based on a workload prediction mechanism. Predictions are made through a model based on a combination of SVR, autoregressive, and moving average model with a feedback mechanism. Our streaming system is designed to increase the overall performance by making the topology refining robust to the incoming workload on the fly, while still being able to achieve QoS goals of SLA constraints. Apache Flink distributed processing engine is used as a testbed in the paper. The result shows that the prediction scheme works well for both workloads, i.e., synthetic as well as real traces of data.

KW - article

KW - feedback system

KW - human

KW - prediction

KW - workload

U2 - 10.1371/journal.pone.0240424

DO - 10.1371/journal.pone.0240424

M3 - Journal article

VL - 15

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 11

M1 - e0240424

ER -
