CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Published

Standard

CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting. / Su, Ya; Zhao, Youjian; Xia, Wentao et al.
2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). IEEE, 2020.

Harvard

Su, Y, Zhao, Y, Xia, W, Liu, R, Bu, J, Zhu, J, Cao, Y, Li, H, Niu, C, Zhang, Y, Wang, Z & Pei, D 2020, CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting. in 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). IEEE. <https://ieeexplore.ieee.org/document/9068623>

APA

Su, Y., Zhao, Y., Xia, W., Liu, R., Bu, J., Zhu, J., Cao, Y., Li, H., Niu, C., Zhang, Y., Wang, Z., & Pei, D. (2020). CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting. In 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). IEEE. https://ieeexplore.ieee.org/document/9068623

Vancouver

Su Y, Zhao Y, Xia W, Liu R, Bu J, Zhu J et al. CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting. In 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). IEEE. 2020

Author

Su, Ya; Zhao, Youjian; Xia, Wentao et al. / CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting. 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). IEEE, 2020.

Bibtex

@inproceedings{5caa3aee8f784fdfa986e80f8de854a8,
title = "CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting",
abstract = "Internet-based service companies monitor a large number of KPIs (Key Performance Indicators) to ensure their service quality and reliability. Correlating KPIs by fluctuations reveals interactions between KPIs under anomalous situations and can be extremely useful for service troubleshooting. However, such a KPI flux-correlation has been little studied so far in the domain of Internet service operations management. A major challenge is how to automatically and accurately separate fluctuations from normal variations in KPIs with different structural characteristics (such as seasonal, trend and stationary) for a large number of KPIs. In this paper, we propose CoFlux, an unsupervised approach, to automatically (without manual selection of algorithm fitting and parameter tuning) determine whether two KPIs are correlated by fluctuations, in what temporal order they fluctuate, and whether they fluctuate in the same direction. CoFlux's robust feature engineering and robust correlation score computation enable it to work well against the diverse KPI characteristics. Our extensive experiments have demonstrated that CoFlux achieves the best F1-Scores of 0.84 (0.90), 0.92 (0.95), 0.95 (0.99), in answering these three questions, in the two real datasets from a top global Internet company, respectively. Moreover, we showed that CoFlux is effective in assisting service troubleshooting through the applications of alert compression, recommending Top N causes, and constructing fluctuation propagation chains.",
author = "Ya Su and Youjian Zhao and Wentao Xia and Rong Liu and Jiahao Bu and Jing Zhu and Yuanpu Cao and Haibin Li and Chenhao Niu and Yiyin Zhang and Zhaogang Wang and Dan Pei",
note = "@INPROCEEDINGS{9068623, author={Su, Ya and Zhao, Youjian and Xia, Wentao and Liu, Rong and Bu, Jiahao and Zhu, Jing and Cao, Yuanpu and Li, Haibin and Niu, Chenhao and Zhang, Yiyin and Wang, Zhaogang and Pei, Dan}, booktitle={2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS)}, title={CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting}, year={2019}, volume={}, number={}, pages={1-10}, doi={10.1145/3326285.3329048}}",
year = "2020",
month = apr,
day = "16",
language = "English",
booktitle = "2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS)",
publisher = "IEEE",

}

RIS

TY - GEN

T1 - CoFlux

T2 - Robustly Correlating KPIs by Fluctuations for Service Troubleshooting

AU - Su, Ya

AU - Zhao, Youjian

AU - Xia, Wentao

AU - Liu, Rong

AU - Bu, Jiahao

AU - Zhu, Jing

AU - Cao, Yuanpu

AU - Li, Haibin

AU - Niu, Chenhao

AU - Zhang, Yiyin

AU - Wang, Zhaogang

AU - Pei, Dan

N1 - @INPROCEEDINGS{9068623, author={Su, Ya and Zhao, Youjian and Xia, Wentao and Liu, Rong and Bu, Jiahao and Zhu, Jing and Cao, Yuanpu and Li, Haibin and Niu, Chenhao and Zhang, Yiyin and Wang, Zhaogang and Pei, Dan}, booktitle={2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS)}, title={CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting}, year={2019}, volume={}, number={}, pages={1-10}, doi={10.1145/3326285.3329048}}

PY - 2020/4/16

Y1 - 2020/4/16

AB - Internet-based service companies monitor a large number of KPIs (Key Performance Indicators) to ensure their service quality and reliability. Correlating KPIs by fluctuations reveals interactions between KPIs under anomalous situations and can be extremely useful for service troubleshooting. However, such a KPI flux-correlation has been little studied so far in the domain of Internet service operations management. A major challenge is how to automatically and accurately separate fluctuations from normal variations in KPIs with different structural characteristics (such as seasonal, trend and stationary) for a large number of KPIs. In this paper, we propose CoFlux, an unsupervised approach, to automatically (without manual selection of algorithm fitting and parameter tuning) determine whether two KPIs are correlated by fluctuations, in what temporal order they fluctuate, and whether they fluctuate in the same direction. CoFlux's robust feature engineering and robust correlation score computation enable it to work well against the diverse KPI characteristics. Our extensive experiments have demonstrated that CoFlux achieves the best F1-Scores of 0.84 (0.90), 0.92 (0.95), 0.95 (0.99), in answering these three questions, in the two real datasets from a top global Internet company, respectively. Moreover, we showed that CoFlux is effective in assisting service troubleshooting through the applications of alert compression, recommending Top N causes, and constructing fluctuation propagation chains.

M3 - Conference contribution/Paper

BT - 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS)

PB - IEEE

ER -