6533b874fe1ef96bd12d6350
RESEARCH PRODUCT
Perfect Hashing Structures for Parallel Similarity Searches
Mathieu GiraudJean-stéphane VarréTuan Tu Transubject
parallelismSimilarity (geometry)OpenCLComputer scienceseed-based heuristicsHash functionSearch engine indexingGPUParallel computingData structureperfect hash functionPattern matchingSIMD[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM][INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]read mapperHeuristicsPerfect hash functiondescription
International audience; Seed-based heuristics have proved to be efficient for studying similarity between genetic databases with billions of base pairs. This paper focuses on algorithms and data structures for the filtering phase in seed-based heuristics, with an emphasis on efficient parallel GPU/manycores implementa- tion. We propose a 2-stage index structure which is based on neighborhood indexing and perfect hashing techniques. This structure performs a filtering phase over the neighborhood regions around the seeds in constant time and avoid as much as possible random memory accesses and branch divergences. Moreover, it fits particularly well on parallel SIMD processors, because it requires intensive but homogeneous computational operations. Using this data structure, we developed a fast and sensitive OpenCL prototype read mapper.
year | journal | country | edition | language |
---|---|---|---|---|
2015-05-01 | 2015 IEEE International Parallel and Distributed Processing Symposium Workshop |