RESEARCH PRODUCT

Extending PluTo for Multiple Devices by Integrating OpenACC

Tim Sub; Tunahan Kaya; Dustin Feld

subject

060201 languages & linguistics; Multi-core processor; Exploit; Computer science; Clock rate; 06 humanities and the arts; 02 engineering and technology; Parallel computing; USable; computer.software_genre; Pluto; 0602 languages and literature; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Compiler; computer

description

For many years, processor vendors have increased the performance of their devices by adding more cores and wider vector units to their CPUs rather than by raising the processors' clock frequency. In addition, GPUs have become popular for problems that demand even more parallel compute power. Exploiting the full potential of modern compute devices requires specialized code, which is often written in a hardware-specific manner. Code written for CPUs is usually not usable on GPUs, and vice versa. The OpenACC programming API tries to close this gap by allowing a single code base to be suitable and optimized for many devices. Nevertheless, OpenACC is rarely used by 'standard programmers', and while various code transformers (such as PluTo) allow (semi-)automatic parallelization of code for multi-core CPUs, they generally do not yet support OpenACC. We present first promising results of our PluTo extension, which generates parallelized code using OpenACC. Using our transformer, we create programs that exploit the parallelism of different platforms without any manual modifications. We achieve speedups of up to a factor of 100 over the original unoptimized programs and accelerations of a factor of 2.05 over OpenMP code generated in the same way.
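
To illustrate the difference between the two directive-based models mentioned in the abstract, the following is a minimal hand-written sketch in C. It is not output of PluTo or of the extension described here. The kernel and the function names `saxpy_openmp` and `saxpy_openacc` are hypothetical and chosen only for illustration. The same loop is annotated once with an OpenMP directive, which targets multi-core CPUs, and once with an OpenACC directive, which a supporting compiler can map to a GPU or to a multi-core CPU.

```c
/* Hypothetical example kernel: y[i] = a * x[i] + y[i]. */

/* OpenMP variant: the loop iterations are distributed over CPU threads. */
void saxpy_openmp(int n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenACC variant: the same loop, offloadable to an accelerator.
 * copyin: x is only read on the device.
 * copy:   y is transferred to the device and back to the host. */
void saxpy_openacc(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

A compiler without OpenMP or OpenACC support ignores the unknown pragmas and runs both loops sequentially, so the two functions compute the same result either way. Only the directive differs between the variants, which is the property that lets one code base serve several devices.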

https://doi.org/10.1109/pdp2018.2018.00049