Search results for "GPU"

showing 3 items of 43 documents

Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model

2015

[Abstract] Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time-consuming operation. Consequently, GPUs have already been used to accelerate these studies, reducing the runtime for moderately sized datasets to less than 1 hour. However, single-GPU approaches cannot perform large-scale GWAS in reasonable time. In this work we present multiEpistSearch, a tool to detect epistasis that works on GPU clusters. While CUDA is used for parallelization within each GPU, the workload distribution among GPUs is performed with Unified Parallel C++ (UPC++), a novel extension of C++ that follows the Partitioned Global Address Space (PGAS) model…

Keywords: Scale (ratio); Bioinformatics; Computer science; PGAS; GPU; CUDA; Genome-wide association study; Parallel computing; GPU cluster; Software_PROGRAMMINGTECHNIQUES; Theoretical Computer Science; Computational science; Hardware and Architecture; Unified Parallel C; Programming paradigm; Partitioned global address space; computer; UPC++; Software; computer.programming_language
The International Journal of High Performance Computing Applications

Architecture-Driven Level Set Optimization: From Clustering to Sub-pixel Image Segmentation

2016

Thanks to their effectiveness, active contour models (ACMs) are of great interest to computer vision scientists. Level set methods (LSMs) belong to the class of geometric active contours. Compared with other ACMs, LSMs offer subpixel accuracy and an intrinsic ability to handle topological changes automatically. Nevertheless, LSMs are computationally expensive. One solution to their long running time is hardware acceleration on massively parallel devices such as graphics processing units (GPUs). But the question is: what accuracy can we reach while keeping the algorithm well suited to a massively parallel architecture? In this paper, we attempt to…

Keywords: Level set method; Computer science; 0211 other engineering and technologies; Initialization; 02 engineering and technology; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; Level set; graphics processing units; 0202 electrical engineering electronic engineering information engineering; Computer vision; Electrical and Electronic Engineering; Cluster analysis; Massively parallel; Image segmentation; 021101 geological & geomatics engineering; Active contour model; hybrid CPU-GPU architecture; business.industry; Subpixel rendering; Computer Science Applications; Human-Computer Interaction; Control and Systems Engineering; Hardware acceleration; 020201 artificial intelligence & image processing; Artificial intelligence; business; Software; Information Systems

Perfect Hashing Structures for Parallel Similarity Searches

2015

Seed-based heuristics have proved to be efficient for studying similarity between genetic databases with billions of base pairs. This paper focuses on algorithms and data structures for the filtering phase in seed-based heuristics, with an emphasis on efficient parallel GPU/manycore implementation. We propose a 2-stage index structure which is based on neighborhood indexing and perfect hashing techniques. This structure performs a filtering phase over the neighborhood regions around the seeds in constant time and avoids random memory accesses and branch divergences as much as possible. Moreover, it fits particularly well on parallel SIMD processors, because it requ…

Keywords: parallelism; Similarity (geometry); OpenCL; Computer science; seed-based heuristics; Hash function; Search engine indexing; GPU; Parallel computing; Data structure; Perfect hash function; Pattern matching; SIMD; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; [INFO.INFO-DC] Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; read mapper; Heuristics
2015 IEEE International Parallel and Distributed Processing Symposium Workshop