6533b835fe1ef96bd129fcf0

RESEARCH PRODUCT

Ultra-Fast Detection of Higher-Order Epistatic Interactions on GPUs

Daniel JüngerJorge González-domínguezChristian HundtBertil Schmidt

subject

0301 basic medicineTheoretical computer scienceComputer sciencebusiness.industryContrast (statistics)Genome-wide association study02 engineering and technologyMutual informationMachine learningcomputer.software_genreReduction (complexity)03 medical and health sciences030104 developmental biologyGenetic epidemiology0202 electrical engineering electronic engineering information engineeringEpistasis020201 artificial intelligence & image processingArtificial intelligenceCluster analysisbusinesscomputerGenetic association

description

Detecting higher-order epistatic interactions in Genome-Wide Association Studies (GWAS) remains a challenging task in the fields of genetic epidemiology and computer science. A number of algorithms have recently been proposed for epistasis discovery. However, they suffer from a high computational cost since statistical measures have to be evaluated for each possible combination of markers. Hence, many algorithms use additional filtering stages discarding potentially non-interacting markers in order to reduce the overall number of combinations to be examined. Among others, Mutual Information Clustering (MIC) is a common pre-processing filter for grouping markers into partitions using K-Means clustering. Potentially interacting candidates for high-order epistasis are then examined exhaustively in a subsequent phase. However, analyzing real-world datasets of moderate size can still take several hours when performing analysis on a single CPU. In this work we propose a massively parallel computation scheme for the MIC algorithm targeting CUDA-enabled accelerators. Our implementation is able to perform epistasis discovery using more than 500,000 markers in just a couple of seconds in contrast to several hours when using the sequential MIC implementation. This runtime reduction by two orders-of-magnitude enables fast exploration of higher-order epistatic interactions even in large-scale GWAS datasets.

https://doi.org/10.1007/978-3-319-58943-5_34