

Automated Code Acceleration with Compilation to OpenCL

Multi-accelerator platforms consist of a diverse set of accelerators and are capable of processing parallel workloads very efficiently. However, this requires applications to be ported to various accelerators using different programming languages, models and tools. Developers also need to understand low-level accelerator details, which increases design effort and cost.

Overview of the architecture and the tool flow.

To tackle this challenge, we propose HTrOP, a compilation approach with a prototype implementation. HTrOP automatically analyzes a sequential CPU application, detects computational hotspots, and generates parallel OpenCL host and kernel code. We demonstrate its potential by offloading hotspots to different OpenCL-enabled resources (currently CPU, GPGPU and the manycore Intel Xeon Phi). Our contributions include:

  1. Automatic transformation of suitable data-parallel loops into independent OpenCL-typical work-items that are executed in parallel.

  2. A two-layered approach of identifying hotspots at compile time and refining offloading decisions at runtime based on parameters like input sizes, availability of accelerators, etc.

  3. Infrastructure for offloading to and migrating between accelerators, while minimizing data transfer overheads by reusing data through application-specific, generated code parts.

  4. A thorough evaluation of performance gains and energy savings with different accelerator targets, taking into account one-time and recurring overheads introduced by our approach. The evaluation includes a comparison to handwritten pragma-based OpenACC code for multicore CPUs and GPUs.
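To make the first contribution more concrete, the following is a minimal sketch of how a data-parallel loop can be recast as independent work-items. It is an illustration only and not code produced by HTrOP: the SAXPY loop, the function names (`saxpy_sequential`, `saxpy_work_item`, `launch`, `saxpy_offloaded`) and the padding factor are all assumptions made for this example. The OpenCL execution model is emulated in plain Python, with `gid` standing in for the value that `get_global_id(0)` would return inside a kernel.

```python
import random

def saxpy_sequential(a, x, y):
    """Original CPU form: a single loop whose iterations are independent."""
    out = list(y)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_work_item(gid, a, x, y, out):
    """Body of one work-item (hypothetical); gid replaces the loop index."""
    if gid < len(x):  # bounds guard, needed when the global size is padded
        out[gid] = a * x[gid] + y[gid]

def launch(kernel, global_size, *args, seed=0):
    """Emulate a 1-D kernel launch: run every work-item once, in shuffled
    order, to show that the result does not depend on execution order."""
    ids = list(range(global_size))
    random.Random(seed).shuffle(ids)
    for gid in ids:
        kernel(gid, *args)

def saxpy_offloaded(a, x, y, padded_to=4):
    """Offloaded form: the loop is replaced by a launch over work-items."""
    out = list(y)
    # Round the global size up to a multiple of padded_to (assumed value).
    global_size = -(-len(x) // padded_to) * padded_to
    launch(saxpy_work_item, global_size, a, x, y, out)
    return out

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [10.0] * 5
    print(saxpy_sequential(2.0, x, y))
    print(saxpy_offloaded(2.0, x, y))
```

Both functions return the same list because no iteration reads a value written by another one. That independence is what makes a loop a suitable candidate for this kind of transformation.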

Source Code

The source code of our prototype implementation is available at github.com/pc2/htrop.

Publications



Transparent Acceleration for Heterogeneous Platforms with Compilation to OpenCL (to appear)

H. Riebler, G.F. Vaz, T. Kenter, C. Plessl, ACM Trans. Archit. Code Optim. (TACO) (2019)


Automated Code Acceleration Targeting Heterogeneous OpenCL Devices

H. Riebler, G.F. Vaz, T. Kenter, C. Plessl, in: Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), ACM, 2018


Keywords

Transparent Acceleration; Runtime System; Runtime Decision; Multi-Accelerator; OpenCL; OpenACC; Offloading; Migration; Performance and Energy; Software and its engineering; Runtime environments; Incremental compilers; Computing methodologies; Parallel programming languages; Computer systems organization; Heterogeneous systems; Heterogeneous (hybrid) systems; Accelerator Programming; Hotspot Detection; Code Generation; Code Generation Decision; Parallel Kernel Code; Data Optimization

Contact

Dr. Heinrich Riebler

Paderborn Center for Parallel Computing (PC2)

Research Associate

Phone:
+49 5251 60-5382
Fax:
+49 5251 60-1714
Office:
O3.152

Dr. Tobias Kenter

Paderborn Center for Parallel Computing (PC2)

Scientific Advisor FPGA Acceleration

Phone:
+49 5251 60-4340
Fax:
+49 5251 60-1714
Office:
O2.161

Prof. Dr. Christian Plessl

Paderborn Center for Parallel Computing (PC2)

Phone:
+49 5251 60-5399
Fax:
+49 5251 60-1714
Office:
O2.167

Office hours:

In the winter term 2019/2020, the consultation hour for students is on Tuesdays from 2:00 to 3:00 pm.
