


Holistic energy and failure aware workload scheduling in Cloud datacenters

Research output: Contribution to journal › Journal article

Journal publication date: 01/2018
Journal: Future Generation Computer Systems
Issue number: 3
Number of pages: 14
Pages (from-to): 887-900
Early online date: 22/07/17
Original language: English


The global uptake of Cloud computing has attracted increased interest within both academia and industry, resulting in the formation of large-scale and complex distributed systems. This has led to increased failure occurrence within computing systems, which has a substantial negative impact on the system performance and task reliability perceived by users. Such systems also consume vast quantities of power, resulting in significant operational costs for providers. Virtualization, a commonly deployed technology within Cloud datacenters, can enable flexible scheduling of virtual machines to maximize system reliability and energy efficiency. However, existing work addresses these two objectives separately, providing limited understanding of the explicit trade-offs involved in building dependable and energy-efficient compute infrastructure. In this paper, we propose two failure-aware energy-efficient scheduling algorithms that exploit the holistic operational characteristics of the Cloud datacenter, comprising the cooling unit, computing infrastructure and server failures. Built on comprehensive models of the power and failure profiles of a Cloud datacenter, the two workload scheduling algorithms, Ella-W and Ella-B, are capable of reducing cooling and compute energy while minimizing the impact of system failures. A novel overall metric is proposed that combines energy efficiency and reliability to characterize the performance of various algorithms. We evaluate our algorithms against Random, MaxUtil, TASA, MTTE and OBFIT under various system conditions of failure prediction accuracy and workload intensity. Evaluation results demonstrate that Ella-W can reduce energy usage by 29.5% and improve task completion rate by 3.6%, while Ella-B reduces energy usage by 32.7% with no degradation to task completion rate.

Bibliographic note

This is the author’s version of a work that was accepted for publication in Future Generation Computer Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Future Generation Computer Systems, 78, 3, 2017 DOI: 10.1016/j.future.2017.07.044