Massively Parallel Huffman Decoding on GPUs

André Weißenberger, Bertil Schmidt

subject

Data compression, Huffman coding, Parallel computing, Massively parallel, CUDA, Distributed computing, Deep learning, Coding and information theory, Computer science, Artificial intelligence

description

Data compression is a fundamental building block in a wide range of applications. Besides its primary purpose of saving storage space on disk, compression can be used to increase the effective bandwidth to attached storage, as realized by state-of-the-art file systems. In the foreseeable future, on-the-fly compression and decompression will gain utmost importance for data-intensive applications such as streamed Deep Learning tasks or Next Generation Sequencing pipelines, which establishes the need for fast parallel implementations. Huffman coding is an integral part of a number of compression methods. However, efficient parallel implementation of Huffman decompression is difficult due to inherent data dependencies: the bit position of a codeword in the compressed stream depends on the lengths of all preceding codewords. In this paper, we present the first massively parallel decoder implementation that is compatible with Huffman's original method, by taking advantage of the self-synchronization property of Huffman codes. Our performance evaluation on three different CUDA-enabled GPUs (TITAN V, TITAN Xp, GTX 1080) demonstrates speedups of over one order of magnitude compared to the state-of-the-art CPU-based Zstandard Huffman decoder. Our implementation is available at https://github.com/weissenberger/gpuhd.
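
To make the decoding dependency and the self-synchronization property concrete, the following is a minimal CPU-side C++ sketch; it is not the authors' CUDA implementation, and the toy code table, the sample message, and the names (CODES, boundaries) are illustrative assumptions. It decodes the same bitstream twice, once from the true start and once from a deliberately misaligned bit offset (as a parallel decoder thread starting mid-stream would), and reports the first bit position at which the two decoders agree on a symbol boundary; from that point on they emit identical symbols.

// Illustrative sketch of Huffman self-synchronization (not from the paper).
#include <cstdio>
#include <string>
#include <vector>

// Toy complete prefix code: a=0, b=10, c=110, d=111 (illustrative).
static const char* CODES[4] = {"0", "10", "110", "111"};

// Decode 'bits' starting at bit index 'start'; return the bit position
// reached after each decoded symbol (the symbol boundaries).
std::vector<size_t> boundaries(const std::string& bits, size_t start) {
    std::vector<size_t> out;
    size_t pos = start;
    while (pos < bits.size()) {
        bool matched = false;
        for (int s = 0; s < 4; ++s) {
            const std::string code = CODES[s];
            if (bits.compare(pos, code.size(), code) == 0) {
                pos += code.size();
                out.push_back(pos);  // boundary after this symbol
                matched = true;
                break;
            }
        }
        if (!matched) break;  // truncated codeword at the end of the stream
    }
    return out;
}

int main() {
    // Encode a sample message with the toy code.
    const std::string msg = "abcadbdcaabacd";
    std::string bits;
    for (char ch : msg) bits += CODES[ch - 'a'];

    std::vector<size_t> ref = boundaries(bits, 0);  // correctly aligned
    std::vector<size_t> mis = boundaries(bits, 1);  // misaligned by one bit

    // The first boundary shared by both decoders is the synchronization
    // point: from there on, both produce the same symbol sequence.
    for (size_t b : mis) {
        for (size_t r : ref) {
            if (b == r) {
                std::printf("decoders synchronize at bit %zu\n", b);
                return 0;
            }
        }
    }
    std::printf("no synchronization within this stream\n");
    return 0;
}

In the massively parallel setting the abstract describes, threads that begin decoding at arbitrary positions in the compressed stream likewise recover the correct symbol boundaries after a short misaligned prefix, which is what makes independent per-chunk GPU decoding feasible.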

https://doi.org/10.1145/3225058.3225076