
A workload-aware mapping approach for data-parallel programs

Research output: Contribution in Book/Report/Proceedings with ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

A workload-aware mapping approach for data-parallel programs. / Grewe, Dominik; Wang, Zheng; O'Boyle, Michael.
HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers. New York: ACM, 2011. p. 117-126.


Harvard

Grewe, D, Wang, Z & O'Boyle, M 2011, A workload-aware mapping approach for data-parallel programs. in HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers. ACM, New York, pp. 117-126. https://doi.org/10.1145/1944862.1944881

APA

Grewe, D., Wang, Z., & O'Boyle, M. (2011). A workload-aware mapping approach for data-parallel programs. In HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers (pp. 117-126). ACM. https://doi.org/10.1145/1944862.1944881

Vancouver

Grewe D, Wang Z, O'Boyle M. A workload-aware mapping approach for data-parallel programs. In HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers. New York: ACM. 2011. p. 117-126. doi: 10.1145/1944862.1944881

Author

Grewe, Dominik; Wang, Zheng; O'Boyle, Michael. / A workload-aware mapping approach for data-parallel programs. HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers. New York: ACM, 2011. pp. 117-126

Bibtex

@inproceedings{19e1da7d369e4a9186000d73a31ca7f0,
title = "A workload-aware mapping approach for data-parallel programs",
abstract = "Much compiler-orientated work in the area of mapping parallel programs to parallel architectures has ignored the issue of external workload. Given that the majority of platforms will not be dedicated to just one task at a time, the impact of other jobs needs to be addressed. As mapping is highly dependent on the underlying machine, a technique that is easily portable across platforms is also desirable. In this paper we develop an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload. We achieve 93.7% of the maximum speedup available which gives an average speedup of 1.66 on 4 cores, a factor 1.24 times better than the OpenMP compiler's default policy. We also develop an alternative cooperative model that minimizes the impact on external workload while still giving an improved average speedup. Finally, we evaluate our approach on a separate 8-core machine giving an average 1.33 times speedup over the default policy, showing the portability of our approach.",
author = "Dominik Grewe and Zheng Wang and Michael O'Boyle",
year = "2011",
doi = "10.1145/1944862.1944881",
language = "English",
isbn = "9781450302418",
pages = "117--126",
booktitle = "HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers",
publisher = "ACM",
}

RIS

TY - GEN

T1 - A workload-aware mapping approach for data-parallel programs

AU - Grewe, Dominik

AU - Wang, Zheng

AU - O'Boyle, Michael

PY - 2011

Y1 - 2011

N2 - Much compiler-orientated work in the area of mapping parallel programs to parallel architectures has ignored the issue of external workload. Given that the majority of platforms will not be dedicated to just one task at a time, the impact of other jobs needs to be addressed. As mapping is highly dependent on the underlying machine, a technique that is easily portable across platforms is also desirable. In this paper we develop an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload. We achieve 93.7% of the maximum speedup available which gives an average speedup of 1.66 on 4 cores, a factor 1.24 times better than the OpenMP compiler's default policy. We also develop an alternative cooperative model that minimizes the impact on external workload while still giving an improved average speedup. Finally, we evaluate our approach on a separate 8-core machine giving an average 1.33 times speedup over the default policy, showing the portability of our approach.

AB - Much compiler-orientated work in the area of mapping parallel programs to parallel architectures has ignored the issue of external workload. Given that the majority of platforms will not be dedicated to just one task at a time, the impact of other jobs needs to be addressed. As mapping is highly dependent on the underlying machine, a technique that is easily portable across platforms is also desirable. In this paper we develop an approach for predicting the optimal number of threads for a given data-parallel application in the presence of external workload. We achieve 93.7% of the maximum speedup available which gives an average speedup of 1.66 on 4 cores, a factor 1.24 times better than the OpenMP compiler's default policy. We also develop an alternative cooperative model that minimizes the impact on external workload while still giving an improved average speedup. Finally, we evaluate our approach on a separate 8-core machine giving an average 1.33 times speedup over the default policy, showing the portability of our approach.

U2 - 10.1145/1944862.1944881

DO - 10.1145/1944862.1944881

M3 - Conference contribution/Paper

SN - 9781450302418

SP - 117

EP - 126

BT - HiPEAC '11 Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers

PB - ACM

CY - New York

ER -