Final published version
Licence: CC BY
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - An analysis of failure-related energy waste in a large-scale cloud environment
AU - Garraghan, Peter
AU - Moreno, Ismael Solis
AU - Townend, Paul
AU - Xu, Jie
PY - 2014/6/30
Y1 - 2014/6/30
N2 - Cloud computing providers are under great pressure to reduce operational costs through improved energy utilization while provisioning dependable service to customers; it is therefore extremely important to understand and quantify the explicit impact of failures within a system in terms of energy costs. This paper presents the first comprehensive analysis of the impact of failures on energy consumption in a real-world large-scale cloud system (comprising over 12 500 servers), including the study of failure and energy trends of the spatial and temporal environmental characteristics. Our results show that 88% of task failure events occur in lower priority tasks producing 13% of total energy waste, and 1% of failure events occur in higher priority tasks due to server failures producing 8% of total energy waste. These results highlight an unintuitive but significant impact on energy consumption due to failures, providing a strong foundation for research into dependable energy-aware cloud computing.
AB - Cloud computing providers are under great pressure to reduce operational costs through improved energy utilization while provisioning dependable service to customers; it is therefore extremely important to understand and quantify the explicit impact of failures within a system in terms of energy costs. This paper presents the first comprehensive analysis of the impact of failures on energy consumption in a real-world large-scale cloud system (comprising over 12 500 servers), including the study of failure and energy trends of the spatial and temporal environmental characteristics. Our results show that 88% of task failure events occur in lower priority tasks producing 13% of total energy waste, and 1% of failure events occur in higher priority tasks due to server failures producing 8% of total energy waste. These results highlight an unintuitive but significant impact on energy consumption due to failures, providing a strong foundation for research into dependable energy-aware cloud computing.
U2 - 10.1109/TETC.2014.2304500
DO - 10.1109/TETC.2014.2304500
M3 - Journal article
VL - 2
SP - 166
EP - 180
JO - IEEE Transactions on Emerging Topics in Computing
JF - IEEE Transactions on Emerging Topics in Computing
SN - 2168-6750
IS - 2
ER -