Performance Optimization on big.LITTLE Architectures - Research Portal

Computing and Communications

Associated organisational unit

Distributed Systems

Electronic data

lctes
Rights statement: © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in LCTES '20: The 21st ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems, (2020) https://dl.acm.org/doi/10.1145/3372799.3394370
Accepted author manuscript, 0.98 MB, PDF document
Available under license: Unspecified

Text available via DOI:

https://doi.org/10.1145/3372799.3394370
Final published version

Performance Optimization on big.LITTLE Architectures: A Memory-latency Aware Approach

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Publication date	1/06/2020
Host publication	LCTES '20: The 21st ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems
Place of Publication	New York
Publisher	ACM
Pages	51–61
Number of pages	11
ISBN (print)	9781450370943
Original language	English

Abstract

The energy demands of modern mobile devices have driven a trend towards heterogeneous multi-core systems which include various types of core tuned for performance or energy efficiency, offering a rich optimization space for software. On such systems, data coherency between cores is automatically ensured by an interconnect between processors. On some chip designs the performance of this interconnect, and by extension of the entire CPU cluster, is highly dependent on the software's memory access characteristics and on the set of frequencies of each CPU core. Existing frequency scaling mechanisms in operating systems use a simple load-based heuristic to tune CPU frequencies, and so fail to achieve a holistically good configuration across such diverse clusters. We propose a new adaptive governor to solve this problem, which uses a simple trained hardware model of cache interconnect characteristics, along with real-time hardware monitors, to continually adjust core frequencies to maximize system performance. We evaluate our governor on the Exynos5422 SoC, as used in the Samsung Galaxy S5, across a range of standard benchmarks. This shows that our approach achieves a speedup of up to 40%, and a 70% energy saving, including a 30% speedup in common mobile applications such as video decoding and web browsing.
