Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - Smart, adaptive mapping of parallelism in the presence of external workload
AU - Emani, Murali Krishna
AU - Wang, Zheng
AU - O'Boyle, Michael
PY - 2013/2
Y1 - 2013/2
N2 - Given the wide-scale adoption of multi-cores in mainstream computing, parallel programs rarely execute in isolation and have to share the platform with other applications that compete for resources. Ignoring the external workload when mapping a program leads to a significant drop in performance. This paper describes an automatic approach that combines compile-time knowledge of the program with dynamic runtime workload information to determine the best adaptive mapping of programs to available resources. This approach delivers increased performance for the target application without penalizing the existing workload. It is evaluated on NAS and SpecOMP parallel benchmark programs across a wide range of workload scenarios. On average, our approach achieves a performance gain of 1.5× over a state-of-the-art scheme on a 12-core machine.
AB - Given the wide-scale adoption of multi-cores in mainstream computing, parallel programs rarely execute in isolation and have to share the platform with other applications that compete for resources. Ignoring the external workload when mapping a program leads to a significant drop in performance. This paper describes an automatic approach that combines compile-time knowledge of the program with dynamic runtime workload information to determine the best adaptive mapping of programs to available resources. This approach delivers increased performance for the target application without penalizing the existing workload. It is evaluated on NAS and SpecOMP parallel benchmark programs across a wide range of workload scenarios. On average, our approach achieves a performance gain of 1.5× over a state-of-the-art scheme on a 12-core machine.
U2 - 10.1109/CGO.2013.6495010
DO - 10.1109/CGO.2013.6495010
M3 - Conference contribution/Paper
SN - 9781467355247
SP - 1
EP - 10
BT - 2013 International Symposium on Code Generation and Optimization (CGO)
PB - IEEE
ER -