Accelerator-specific OpenCL backend for LLVM

In a nutshell

Improve our existing code generator (LLVM IR --> OpenCL):

Add accelerator-specific optimizations (OpenCL for GPU, OpenCL for FPGA, OpenCL for Xeon Phi etc.)
Optimize for specific system requirements (performance, energy, etc.) by guiding the backend
Combine with information provided by polyhedral optimization
Compare results to hand-written code

Summary

Today's systems are becoming increasingly heterogeneous (CPU+GPU, CPU+FPGA, etc.). This opens up multiple opportunities for accelerating applications. However, programming different accelerators is cumbersome, time-consuming and error-prone. To automate the process, we use the LLVM compiler infrastructure: we analyse an application for computationally intensive parts (hotspots), optimize these hotspots and finally generate accelerated OpenCL code just in time (JIT) for various heterogeneous resources (GPUs, FPGAs, etc.).

The Open Computing Language (OpenCL) provides a standard interface for parallel computing using task- and data-based parallelism, and OpenCL programs can be executed across heterogeneous devices (CPUs, GPUs, MICs and FPGAs). Because we want to target multiple accelerators and follow a “write program code once, run anywhere” approach, our LLVM tool flow uses a basic OpenCL backend. This backend generates functionally portable OpenCL code that can run on different accelerators. However, because the underlying architectures of these accelerators differ, the performance of this generic code differs from that of accelerator-specific OpenCL code.
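To make the difference between functionally portable and accelerator-specific code concrete, here is a minimal sketch in C. All names (`scale_add_loop`, `kPortableKernel`, `kSingleWorkItemKernel`, `scale_add_ndrange_emulated`) are hypothetical and do not come from the existing backend. The sketch shows a simple hotspot loop, a portable OpenCL C kernel for it (one work-item per iteration), and a single work-item variant of the kind FPGA OpenCL toolchains commonly favour because the loop can be pipelined. The host-side emulation only checks that the work-item mapping computes the same result as the loop; it does not use an OpenCL runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hotspot as it might appear in the input program: a plain counted loop. */
void scale_add_loop(float a, const float *x, float *y, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Functionally portable OpenCL C for the same loop: one work-item per
 * iteration. Kept as a string, as a JIT flow would hand it to the OpenCL
 * runtime for compilation. (Illustrative only, not the backend's output.) */
const char *const kPortableKernel =
    "__kernel void scale_add(float a, __global const float *x,\n"
    "                        __global float *y, uint n) {\n"
    "    size_t i = get_global_id(0);\n"
    "    if (i < n) y[i] = a * x[i] + y[i];\n"
    "}\n";

/* Assumed accelerator-specific variant: a single work-item that keeps the
 * loop, so that an FPGA toolchain can pipeline it. */
const char *const kSingleWorkItemKernel =
    "__kernel void scale_add(float a, __global const float *restrict x,\n"
    "                        __global float *restrict y, uint n) {\n"
    "    for (uint i = 0; i < n; ++i) y[i] = a * x[i] + y[i];\n"
    "}\n";

/* What a single work-item with global id `gid` executes in the portable
 * kernel. The guard handles a global size rounded up beyond n. */
static void scale_add_work_item(size_t gid, float a, const float *x,
                                float *y, size_t n) {
    if (gid < n)
        y[gid] = a * x[gid] + y[gid];
}

/* Host-side emulation of an NDRange launch: the global size is n rounded
 * up to a multiple of the work-group size, and every id runs the body. */
void scale_add_ndrange_emulated(float a, const float *x, float *y,
                                size_t n, size_t local_size) {
    assert(local_size > 0);
    size_t global_size = (n + local_size - 1) / local_size * local_size;
    for (size_t gid = 0; gid < global_size; ++gid)
        scale_add_work_item(gid, a, x, y, n);
}
```

Both kernel strings compute the same result, so either is a valid output for a portable backend. Which one performs better depends on the target device, which is why the choice is a natural candidate for an accelerator-specific decision.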

The first goal of this thesis is to extend the basic OpenCL backend so that it takes accelerator-specific optimizations, as well as the system's performance and energy requirements, into account when generating OpenCL code. Additionally, for the sake of simplicity, the current OpenCL backend makes some assumptions about the loop structure of the input code, which limits the applicability of our approach. Polyhedral techniques have been used to detect hotspot structures and to perform automatic parallelization, data locality optimizations, etc. The second goal of this thesis is to investigate how polyhedral techniques can be used to improve the current OpenCL backend so that it supports a wider range of loop structures and generates better-performing OpenCL code.
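As an illustration of the kind of restructuring polyhedral techniques can derive, the following C sketch applies a classic loop skewing (wavefront) transformation by hand. It is an assumed example, not taken from the existing backend, and the function names are hypothetical. In the original loop nest, each element depends on its upper and left neighbours, so neither loop can be mapped directly to independent OpenCL work-items. After skewing to wavefronts `t = i + j`, all iterations with the same `t` are independent and could form one parallel launch.

```c
#include <assert.h>

/* Original loop nest over an n x m row-major array. Each element depends
 * on the element above (i-1, j) and the element to the left (i, j-1), so
 * neither the i loop nor the j loop is parallel as written. */
void sweep_original(int n, int m, long *a) {
    assert(n >= 0 && m >= 0);
    for (int i = 1; i < n; ++i)
        for (int j = 1; j < m; ++j)
            a[i * m + j] = a[(i - 1) * m + j] + a[i * m + (j - 1)];
}

/* Skewed version: the outer loop walks the anti-diagonals t = i + j.
 * Both dependences point from wavefront t-1 to wavefront t, so the inner
 * loop iterations are independent of each other and could be executed as
 * parallel work-items, with one launch (or barrier) per wavefront. */
void sweep_wavefront(int n, int m, long *a) {
    assert(n >= 0 && m >= 0);
    for (int t = 2; t <= (n - 1) + (m - 1); ++t) {
        /* Bounds follow from 1 <= i <= n-1 and 1 <= j = t - i <= m-1. */
        int i_lo = (t - (m - 1) > 1) ? t - (m - 1) : 1;
        int i_hi = (t - 1 < n - 1) ? t - 1 : n - 1;
        for (int i = i_lo; i <= i_hi; ++i) {
            int j = t - i;
            a[i * m + j] = a[(i - 1) * m + j] + a[i * m + (j - 1)];
        }
    }
}
```

Both functions visit exactly the same iterations and respect the same dependences, so they produce identical arrays. A polyhedral framework would compute such a schedule and the loop bounds automatically from the affine loop bounds and array accesses.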

We provide

Interesting, research-related topics
Heterogeneous server with a range of state-of-the-art software and hardware
Good work atmosphere
Optionally: work area in student lab

You should bring

Good C/C++ knowledge
Motivation and the ability to work autonomously
Ideally: OpenCL experience
Ideally: LLVM experience

Dr. Heinrich Riebler

Paderborn Center for Parallel Computing (PC2)

Scientific Advisor FPGA Acceleration

Phone: +49 5251 60-5382
