Electronic data

  • Trimmer - Borowiec (CLOUD 22)

    Rights statement: ©2022 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

    Accepted author manuscript, 1.04 MB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Links

Text available via DOI: https://doi.org/10.1109/CLOUD55607.2022.00061

Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters. / Borowiec, Damian; Yeung, Ging-Fung; Friday, Adrian et al.
Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022. IEEE, 2022. p. 374-384.

Harvard

Borowiec, D, Yeung, G-F, Friday, A, Harper, RHR & Garraghan, P 2022, Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters. in Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022. IEEE, pp. 374-384, 15th IEEE International Conference on Cloud Computing, CLOUD 2022, Barcelona, Spain, 10/07/22. https://doi.org/10.1109/CLOUD55607.2022.00061

Vancouver

Borowiec D, Yeung GF, Friday A, Harper RHR, Garraghan P. Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters. In Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022. IEEE. 2022. p. 374-384 Epub 2022 Jul 10. doi: 10.1109/CLOUD55607.2022.00061

Author

Borowiec, Damian ; Yeung, Ging-Fung ; Friday, Adrian et al. / Trimmer : Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters. Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022. IEEE, 2022. pp. 374-384

Bibtex

@inproceedings{f5467d84dbb5413aaf87b85a22ec830c,
title = "Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters",
abstract = "Provisioning high-performance Machine Learning-as-a-Service (MLaaS) at reduced resource cost in Cloud datacenters is achieved via auto-tuning: the automated optimization of a Deep Learning model's tensor programs to minimize inference latency on a given hardware device. However, given the extensive heterogeneity of Deep Learning models, libraries, and hardware devices, performing auto-tuning within Cloud datacenters incurs significant time, compute resource, and energy costs that state-of-the-art auto-tuners are not designed to mitigate. In this paper we propose Trimmer, a high-performance and cost-efficient Deep Learning auto-tuning framework for Cloud datacenters. Trimmer maximizes DL model performance and tensor program cost-efficiency by preempting tensor program implementations that exhibit poor optimization improvement, and by applying an ML-based filtering method to replace expensive, low-performing tensor programs, increasing the likelihood of selecting low-latency tensor programs. Through an empirical study exploring the cost of DL model optimization techniques, our analysis indicates that 26-43% of total energy is expended on measuring tensor program implementations that do not positively contribute towards auto-tuning. Experiment results show that Trimmer achieves high auto-tuning cost-efficiency across different DL models, reducing auto-tuning energy use by 21.8-40.9% for Cloud clusters whilst achieving DL model latency equivalent to state-of-the-art techniques.",
keywords = "Deep Learning, Cloud datacenter, MLaaS, Machine Learning systems, Energy, Sustainable AI",
author = "Damian Borowiec and Ging-Fung Yeung and Adrian Friday and R.H.R. Harper and Peter Garraghan",
note = "{\textcopyright}2022 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. ; 15th IEEE International Conference on Cloud Computing, CLOUD 2022 ; Conference date: 10-07-2022 Through 16-07-2022",
year = "2022",
month = aug,
day = "24",
doi = "10.1109/CLOUD55607.2022.00061",
language = "English",
isbn = "9781665481380",
pages = "374--384",
booktitle = "Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022",
publisher = "IEEE",

}

RIS

TY - GEN

T1 - Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters

T2 - 15th IEEE International Conference on Cloud Computing, CLOUD 2022

AU - Borowiec, Damian

AU - Yeung, Ging-Fung

AU - Friday, Adrian

AU - Harper, R.H.R.

AU - Garraghan, Peter

N1 - ©2022 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PY - 2022/8/24

Y1 - 2022/8/24

N2 - Provisioning high-performance Machine Learning-as-a-Service (MLaaS) at reduced resource cost in Cloud datacenters is achieved via auto-tuning: the automated optimization of a Deep Learning model's tensor programs to minimize inference latency on a given hardware device. However, given the extensive heterogeneity of Deep Learning models, libraries, and hardware devices, performing auto-tuning within Cloud datacenters incurs significant time, compute resource, and energy costs that state-of-the-art auto-tuners are not designed to mitigate. In this paper we propose Trimmer, a high-performance and cost-efficient Deep Learning auto-tuning framework for Cloud datacenters. Trimmer maximizes DL model performance and tensor program cost-efficiency by preempting tensor program implementations that exhibit poor optimization improvement, and by applying an ML-based filtering method to replace expensive, low-performing tensor programs, increasing the likelihood of selecting low-latency tensor programs. Through an empirical study exploring the cost of DL model optimization techniques, our analysis indicates that 26-43% of total energy is expended on measuring tensor program implementations that do not positively contribute towards auto-tuning. Experiment results show that Trimmer achieves high auto-tuning cost-efficiency across different DL models, reducing auto-tuning energy use by 21.8-40.9% for Cloud clusters whilst achieving DL model latency equivalent to state-of-the-art techniques.

AB - Provisioning high-performance Machine Learning-as-a-Service (MLaaS) at reduced resource cost in Cloud datacenters is achieved via auto-tuning: the automated optimization of a Deep Learning model's tensor programs to minimize inference latency on a given hardware device. However, given the extensive heterogeneity of Deep Learning models, libraries, and hardware devices, performing auto-tuning within Cloud datacenters incurs significant time, compute resource, and energy costs that state-of-the-art auto-tuners are not designed to mitigate. In this paper we propose Trimmer, a high-performance and cost-efficient Deep Learning auto-tuning framework for Cloud datacenters. Trimmer maximizes DL model performance and tensor program cost-efficiency by preempting tensor program implementations that exhibit poor optimization improvement, and by applying an ML-based filtering method to replace expensive, low-performing tensor programs, increasing the likelihood of selecting low-latency tensor programs. Through an empirical study exploring the cost of DL model optimization techniques, our analysis indicates that 26-43% of total energy is expended on measuring tensor program implementations that do not positively contribute towards auto-tuning. Experiment results show that Trimmer achieves high auto-tuning cost-efficiency across different DL models, reducing auto-tuning energy use by 21.8-40.9% for Cloud clusters whilst achieving DL model latency equivalent to state-of-the-art techniques.

KW - Deep Learning

KW - Cloud datacenter

KW - MLaaS

KW - Machine Learning systems

KW - Energy

KW - Sustainable AI

U2 - 10.1109/CLOUD55607.2022.00061

DO - 10.1109/CLOUD55607.2022.00061

M3 - Conference contribution/Paper

SN - 9781665481380

SP - 374

EP - 384

BT - Proceedings - 2022 IEEE 15th International Conference on Cloud Computing, CLOUD 2022

PB - IEEE

Y2 - 10 July 2022 through 16 July 2022

ER -
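The abstract describes preempting tensor program measurements that are unlikely to improve auto-tuning results. The sketch below is a hypothetical illustration of that general idea, not Trimmer's actual implementation: a predictive filter (here a noisy toy cost model standing in for an ML-based predictor) screens candidate tensor programs, and candidates predicted to be well above the best latency found so far are preempted without being measured on hardware.

```python
import random

def predicted_latency(candidate):
    # Hypothetical stand-in for an ML cost model: a noisy estimate of true cost.
    return candidate["true_latency"] * random.uniform(0.8, 1.2)

def measure(candidate):
    # Hypothetical stand-in for running the tensor program on real hardware,
    # which is the expensive, energy-consuming step auto-tuning tries to avoid.
    return candidate["true_latency"]

def tune(candidates, budget, slack=1.5):
    """Measure only candidates predicted to be within `slack` of the best
    latency seen so far; preempt the rest without measuring them."""
    best = float("inf")
    measured = 0
    for cand in candidates:
        if measured >= budget:
            break
        if predicted_latency(cand) > best * slack:
            continue  # preempted: unlikely to improve on the best program
        latency = measure(cand)
        measured += 1
        best = min(best, latency)
    return best, measured

candidates = [{"true_latency": t} for t in [5.0, 3.0, 4.0, 2.0, 6.0]]
best, n_measured = tune(candidates, budget=10)
```

With these toy numbers the clearly slow candidate (6.0) is always preempted, so the loop finds the best latency (2.0) while performing fewer hardware measurements than an exhaustive search; the `slack` threshold trades off measurement cost against the risk of preempting a genuinely good candidate.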