Electronic data

  • Integrating Clustering and Regression for Workload Estimation in the Cloud

    Rights statement: This is the peer reviewed version of the following article: Yu, Y, Jindal, V, Yen, I‐L, Bastani, F, Xu, J, Garraghan, P. Integrating clustering and regression for workload estimation in the cloud. Concurrency Computat Pract Exper. 2020; e5931. https://doi.org/10.1002/cpe.5931 which has been published in final form at https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.5931 This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.

    Accepted author manuscript, 2.72 MB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Integrating Clustering and Regression for Workload Estimation in the Cloud

Research output: Contribution to Journal/Magazine, Journal article, peer-review

Published

Standard

Integrating Clustering and Regression for Workload Estimation in the Cloud. / Yu, Yongjia; Jindal, Vasu; Yen, I-Ling et al.
In: Concurrency and Computation Practice and Experience, Vol. 32, No. 23, e5931, 10.12.2020.

Harvard

Yu, Y, Jindal, V, Yen, I-L, Bastani, F, Xu, J & Garraghan, P 2020, 'Integrating Clustering and Regression for Workload Estimation in the Cloud', Concurrency and Computation Practice and Experience, vol. 32, no. 23, e5931. https://doi.org/10.1002/cpe.5931

APA

Yu, Y., Jindal, V., Yen, I-L., Bastani, F., Xu, J., & Garraghan, P. (2020). Integrating Clustering and Regression for Workload Estimation in the Cloud. Concurrency and Computation Practice and Experience, 32(23), Article e5931. https://doi.org/10.1002/cpe.5931

Vancouver

Yu Y, Jindal V, Yen I-L, Bastani F, Xu J, Garraghan P. Integrating Clustering and Regression for Workload Estimation in the Cloud. Concurrency and Computation Practice and Experience. 2020 Dec 10;32(23):e5931. Epub 2020 Jul 20. doi: 10.1002/cpe.5931

Author

Yu, Yongjia ; Jindal, Vasu ; Yen, I-Ling et al. / Integrating Clustering and Regression for Workload Estimation in the Cloud. In: Concurrency and Computation Practice and Experience. 2020 ; Vol. 32, No. 23.

Bibtex

@article{e60410b9c3444b0aabc341d67a98bd73,
title = "Integrating Clustering and Regression for Workload Estimation in the Cloud",
abstract = "Workload prediction has been widely researched in the literature. However, existing techniques are per‐job based and useful for service‐like tasks whose workloads exhibit seasonality and trend. But cloud jobs have many different workload patterns and some do not exhibit recurring workload patterns. We consider job‐pool‐based workload estimation, which analyzes the characteristics of existing tasks' workloads to estimate the currently running tasks' workload. First cluster existing tasks based on their workloads. For a new task J, collect the initial workload of J and determine which cluster J may belong to, then use the cluster's characteristics to estimate J's workload. Based on the Google dataset, the algorithm is experimentally evaluated and its effectiveness is confirmed. However, the workload patterns of some tasks do have seasonality and trend, and conventional per‐job‐based regression methods may yield better workload prediction results. Also, in some cases, some new tasks may not follow the workload patterns of existing tasks in the pool. Thus, develop an integrated scheme which combines clustering and regression and utilize the best of them for workload prediction. Experimental study shows that the combined approach can further improve the accuracy of workload prediction.",
author = "Yongjia Yu and Vasu Jindal and I-Ling Yen and Farokh Bastani and Jie Xu and Peter Garraghan",
note = "This is the peer reviewed version of the following article: Yu, Y, Jindal, V, Yen, I‐L, Bastani, F, Xu, J, Garraghan, P. Integrating clustering and regression for workload estimation in the cloud. Concurrency Computat Pract Exper. 2020; e5931. https://doi.org/10.1002/cpe.5931 which has been published in final form at https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.5931 This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.",
year = "2020",
month = dec,
day = "10",
doi = "10.1002/cpe.5931",
language = "English",
volume = "32",
journal = "Concurrency and Computation Practice and Experience",
issn = "1532-0626",
publisher = "John Wiley and Sons Ltd",
number = "23",
pages = "e5931",
}

RIS

TY - JOUR

T1 - Integrating Clustering and Regression for Workload Estimation in the Cloud

AU - Yu, Yongjia

AU - Jindal, Vasu

AU - Yen, I-Ling

AU - Bastani, Farokh

AU - Xu, Jie

AU - Garraghan, Peter

N1 - This is the peer reviewed version of the following article: Yu, Y, Jindal, V, Yen, I‐L, Bastani, F, Xu, J, Garraghan, P. Integrating clustering and regression for workload estimation in the cloud. Concurrency Computat Pract Exper. 2020; e5931. https://doi.org/10.1002/cpe.5931 which has been published in final form at https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.5931 This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.

PY - 2020/12/10

Y1 - 2020/12/10

AB - Workload prediction has been widely researched in the literature. However, existing techniques are per‐job based and useful for service‐like tasks whose workloads exhibit seasonality and trend. But cloud jobs have many different workload patterns and some do not exhibit recurring workload patterns. We consider job‐pool‐based workload estimation, which analyzes the characteristics of existing tasks' workloads to estimate the currently running tasks' workload. First cluster existing tasks based on their workloads. For a new task J, collect the initial workload of J and determine which cluster J may belong to, then use the cluster's characteristics to estimate J's workload. Based on the Google dataset, the algorithm is experimentally evaluated and its effectiveness is confirmed. However, the workload patterns of some tasks do have seasonality and trend, and conventional per‐job‐based regression methods may yield better workload prediction results. Also, in some cases, some new tasks may not follow the workload patterns of existing tasks in the pool. Thus, develop an integrated scheme which combines clustering and regression and utilize the best of them for workload prediction. Experimental study shows that the combined approach can further improve the accuracy of workload prediction.

U2 - 10.1002/cpe.5931

DO - 10.1002/cpe.5931

M3 - Journal article

VL - 32

JO - Concurrency and Computation Practice and Experience

JF - Concurrency and Computation Practice and Experience

SN - 1532-0626

IS - 23

M1 - e5931

ER -