FusionCL - Research Portal | Lancaster University



Text available via DOI:

https://doi.org/10.1007/s00607-021-00958-2
Final published version

Keywords

Scheduling, Kernel fusion, High-performance computing, Machine learning


FusionCL: A Machine-Learning Based Approach for OpenCL Kernel Fusion to Increase System Performance

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Yasir Noman Khalid
Muhammad Aleem
Usman Ahmed
Radu Prodan
Muhammad Arshad Islam
Muhammad Azhar Iqbal


Journal publication date	31/10/2021
Journal	Computing
Issue number	10
Volume	103
Number of pages	32
Pages (from-to)	2171-2202
Publication Status	Published
Early online date	03/06/2021
Original language	English

Abstract

Using general-purpose graphics processing units (GPGPUs) through OpenCL has greatly reduced the execution time of data-parallel applications by exploiting the massive parallelism available. However, when an application with a small data size runs on a GPU, GPU resources are wasted because the application cannot fully utilize the GPU's compute cores. Because GPUs lack operating system support, there is no mechanism for sharing a GPU between two kernels. In this paper, we propose a GPU sharing mechanism between two kernels that increases GPU occupancy and, as a result, reduces the execution time of a job pool. However, if a pair of kernels competes for the same set of resources (i.e., both applications are compute-intensive or both are memory-intensive), kernel fusion may instead significantly increase the execution time of the fused kernels. It is therefore important to select an optimal pair of kernels for fusion, one that yields a significant speedup over their serial execution. This research presents FusionCL, a machine learning-based GPU sharing mechanism between a pair of OpenCL kernels. FusionCL first identifies every pair of kernels in the job pool that is a suitable candidate for fusion, using a machine learning-based fusion suitability classifier. Then, from all the candidates, it uses a fusion speedup predictor to select the pair that will produce the maximum speedup after fusion over serial execution. The experimental evaluation shows that the proposed kernel fusion mechanism reduces execution time by 2.83× compared to a baseline scheduling scheme. Compared to the state of the art, the reduction in execution time is up to 8%.
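
The two-stage selection described in the abstract (filter pairs with a suitability classifier, then pick the pair with the highest predicted speedup) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function `select_fusion_pair`, the kernel names, the `BOUND` table, and both toy models are hypothetical stand-ins for the trained classifier and predictor, and the speedup numbers are made up for the example.

```python
from itertools import combinations
from typing import Callable, Iterable, Optional, Tuple

Kernel = str  # assumption: a kernel is identified by its name only


def select_fusion_pair(
    job_pool: Iterable[Kernel],
    is_suitable: Callable[[Kernel, Kernel], bool],
    predict_speedup: Callable[[Kernel, Kernel], float],
) -> Optional[Tuple[Kernel, Kernel]]:
    """Return the suitable pair with the highest predicted speedup, or None."""
    # Stage 1: keep only the pairs the suitability classifier accepts.
    candidates = [
        (a, b) for a, b in combinations(job_pool, 2) if is_suitable(a, b)
    ]
    if not candidates:
        return None
    # Stage 2: among the candidates, pick the best predicted speedup.
    return max(candidates, key=lambda pair: predict_speedup(*pair))


# Toy stand-ins (assumptions, not the paper's trained models).
# Each hypothetical kernel is labeled by the resource it mostly stresses.
BOUND = {
    "matmul": "compute",
    "nbody": "compute",
    "vecadd": "memory",
    "transpose": "memory",
}


def toy_is_suitable(a: Kernel, b: Kernel) -> bool:
    # Reject pairs that would compete for the same kind of resource.
    return BOUND[a] != BOUND[b]


# Invented speedup values, for illustration only.
TOY_SPEEDUP = {
    frozenset(("matmul", "vecadd")): 1.4,
    frozenset(("matmul", "transpose")): 1.2,
    frozenset(("nbody", "vecadd")): 1.6,
    frozenset(("nbody", "transpose")): 1.3,
}


def toy_predict_speedup(a: Kernel, b: Kernel) -> float:
    return TOY_SPEEDUP.get(frozenset((a, b)), 1.0)


best = select_fusion_pair(list(BOUND), toy_is_suitable, toy_predict_speedup)
print(best)
```

With these toy inputs, the two compute-bound kernels and the two memory-bound kernels are never paired with each other, and the remaining pairs are ranked by the invented speedup table. In a real system both callables would wrap trained models operating on kernel features, which the abstract does not specify.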
