
Exploitation of GPUs for the parallelisation of probably parallel legacy code

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Exploitation of GPUs for the parallelisation of probably parallel legacy code. / Wang, Zheng; Powell, Daniel; Franke, Bjorn et al.
Compiler construction: 23rd International Conference, CC 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings. ed. / Albert Cohen. Berlin: Springer Verlag, 2014. p. 154-173 (Lecture Notes in Computer Science ; Vol. 8409).


Harvard

Wang, Z, Powell, D, Franke, B & O'Boyle, M 2014, Exploitation of GPUs for the parallelisation of probably parallel legacy code. in A Cohen (ed.), Compiler construction: 23rd International Conference, CC 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings. Lecture Notes in Computer Science, vol. 8409, Springer Verlag, Berlin, pp. 154-173. https://doi.org/10.1007/978-3-642-54807-9_9

APA

Wang, Z., Powell, D., Franke, B., & O'Boyle, M. (2014). Exploitation of GPUs for the parallelisation of probably parallel legacy code. In A. Cohen (Ed.), Compiler construction: 23rd International Conference, CC 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings (pp. 154-173). (Lecture Notes in Computer Science ; Vol. 8409). Springer Verlag. https://doi.org/10.1007/978-3-642-54807-9_9

Vancouver

Wang Z, Powell D, Franke B, O'Boyle M. Exploitation of GPUs for the parallelisation of probably parallel legacy code. In Cohen A, editor, Compiler construction: 23rd International Conference, CC 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings. Berlin: Springer Verlag. 2014. p. 154-173. (Lecture Notes in Computer Science). doi: 10.1007/978-3-642-54807-9_9

Author

Wang, Zheng ; Powell, Daniel ; Franke, Bjorn et al. / Exploitation of GPUs for the parallelisation of probably parallel legacy code. Compiler construction: 23rd International Conference, CC 2014, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2014, Grenoble, France, April 5-13, 2014. Proceedings. editor / Albert Cohen. Berlin : Springer Verlag, 2014. pp. 154-173 (Lecture Notes in Computer Science).

Bibtex

@inproceedings{a7d74cc88c0444ae9b6b44d06258c982,
title = "Exploitation of GPUs for the parallelisation of probably parallel legacy code",
abstract = "General purpose GPUs provide massive compute power, but are notoriously difficult to program. In this paper we present a complete compilation strategy to exploit GPUs for the parallelisation of sequential legacy code. Using hybrid data dependence analysis combining static and dynamic information, our compiler automatically detects suitable parallelism and generates parallel OpenCL code from sequential programs. We exploit the fact that dependence profiling provides us with parallel loop candidates that are highly likely to be genuinely parallel, but cannot be statically proven so. For the efficient GPU parallelisation of those probably parallel loop candidates, we propose a novel software speculation scheme, which ensures correctness for the unlikely, yet possible case of dynamically detected dependence violations. Our scheme operates in place and supports speculative read and write operations. We demonstrate the effectiveness of our approach in detecting and exploiting parallelism using sequential codes from the NAS benchmark suite. We achieve an average speedup of 3.2x, and up to 99x, over the sequential baseline. On average, this is 1.42 times faster than state-of-the-art speculation schemes and corresponds to 99% of the performance level of a manual GPU implementation developed by independent expert programmers.",
keywords = "GPU, OpenCL, Parallelization, Thread Level Speculation",
author = "Zheng Wang and Daniel Powell and Bjorn Franke and Michael O'Boyle",
year = "2014",
doi = "10.1007/978-3-642-54807-9_9",
language = "English",
isbn = "9783642548062",
series = "Lecture Notes in Computer Science",
publisher = "Springer Verlag",
pages = "154--173",
editor = "Albert Cohen",
booktitle = "Compiler construction",

}

RIS

TY - GEN

T1 - Exploitation of GPUs for the parallelisation of probably parallel legacy code

AU - Wang, Zheng

AU - Powell, Daniel

AU - Franke, Bjorn

AU - O'Boyle, Michael

PY - 2014

Y1 - 2014

N2 - General purpose GPUs provide massive compute power, but are notoriously difficult to program. In this paper we present a complete compilation strategy to exploit GPUs for the parallelisation of sequential legacy code. Using hybrid data dependence analysis combining static and dynamic information, our compiler automatically detects suitable parallelism and generates parallel OpenCL code from sequential programs. We exploit the fact that dependence profiling provides us with parallel loop candidates that are highly likely to be genuinely parallel, but cannot be statically proven so. For the efficient GPU parallelisation of those probably parallel loop candidates, we propose a novel software speculation scheme, which ensures correctness for the unlikely, yet possible case of dynamically detected dependence violations. Our scheme operates in place and supports speculative read and write operations. We demonstrate the effectiveness of our approach in detecting and exploiting parallelism using sequential codes from the NAS benchmark suite. We achieve an average speedup of 3.2x, and up to 99x, over the sequential baseline. On average, this is 1.42 times faster than state-of-the-art speculation schemes and corresponds to 99% of the performance level of a manual GPU implementation developed by independent expert programmers.

AB - General purpose GPUs provide massive compute power, but are notoriously difficult to program. In this paper we present a complete compilation strategy to exploit GPUs for the parallelisation of sequential legacy code. Using hybrid data dependence analysis combining static and dynamic information, our compiler automatically detects suitable parallelism and generates parallel OpenCL code from sequential programs. We exploit the fact that dependence profiling provides us with parallel loop candidates that are highly likely to be genuinely parallel, but cannot be statically proven so. For the efficient GPU parallelisation of those probably parallel loop candidates, we propose a novel software speculation scheme, which ensures correctness for the unlikely, yet possible case of dynamically detected dependence violations. Our scheme operates in place and supports speculative read and write operations. We demonstrate the effectiveness of our approach in detecting and exploiting parallelism using sequential codes from the NAS benchmark suite. We achieve an average speedup of 3.2x, and up to 99x, over the sequential baseline. On average, this is 1.42 times faster than state-of-the-art speculation schemes and corresponds to 99% of the performance level of a manual GPU implementation developed by independent expert programmers.

KW - GPU

KW - OpenCL

KW - Parallelization

KW - Thread Level Speculation

U2 - 10.1007/978-3-642-54807-9_9

DO - 10.1007/978-3-642-54807-9_9

M3 - Conference contribution/Paper

SN - 9783642548062

T3 - Lecture Notes in Computer Science

SP - 154

EP - 173

BT - Compiler construction

A2 - Cohen, Albert

PB - Springer Verlag

CY - Berlin

ER -