
AUTHOR

Haidong Lan

showing 7 related works from this author

Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters

2016

Computing alignments between two or more sequences is a common operation in computational molecular biology. The continuing growth of biological sequence databases establishes the need for efficient parallel implementations of alignment algorithms on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data par…

0301 basic medicine; Xeon Phi clusters; Computer science; Data parallelism; Parallel algorithm; 02 engineering and technology; Dynamic programming; Biochemistry; Pairwise sequence alignment; Computational science; 03 medical and health sciences; Structural Biology; Computer cluster; 0202 electrical engineering electronic engineering information engineering; Amino Acid Sequence; Databases Protein; Molecular Biology; 020203 distributed computing; Research; Applied Mathematics; Computational Biology; Proteins; Smith-Waterman; Computer Science Applications; 030104 developmental biology; Multiple sequence alignment; Databases Nucleic Acid; Sequence Alignment; Algorithms; Software; Xeon Phi

BMC Bioinformatics
researchProduct
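
The Smith-Waterman database scan named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's Xeon Phi implementation: the scoring values (match, mismatch, linear gap penalty) and the names smith_waterman_score and scan_database are assumptions chosen for illustration, and all vectorization, multi-threading and cluster-level distribution are left out.

```python
from typing import Dict, List, Tuple


def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b (linear gap penalty, assumed scores)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def scan_database(query: str, database: Dict[str, str]) -> List[Tuple[str, int]]:
    """Score the query against every subject sequence, highest score first."""
    scores = [(name, smith_waterman_score(query, seq))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Each subject sequence is scored independently of the others, which is what makes a database scan a natural fit for data parallelism.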

RabbitQC: high-speed scalable quality control for sequencing data

2019

Motivation: Modern sequencing technologies continue to revolutionize many areas of biology and medicine. Since the generated datasets are error-prone, downstream applications usually require quality control methods to pre-process FASTQ files. However, existing tools for this task are currently unable to fully exploit the capabilities of computing platforms, leading to slow runtimes.

Results: We present RabbitQC, an extremely fast integrated quality control tool for FASTQ files, which can take full advantage of modern hardware. It includes a variety of operations and supports different sequencing technologies (Illumina, Oxford Nanopore and PacBio). RabbitQC achieves speedups between …

Quality Control; Statistics and Probability; FASTQ format; Downstream (software development); Exploit; Biochemistry; Nanopores; 03 medical and health sciences; Software; Quality (business); Molecular Biology; 030304 developmental biology; 0303 health sciences; 030302 biochemistry & molecular biology; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; Task (computing); Computational Theory and Mathematics; Computer architecture; Scalability; Nanopore sequencing

Bioinformatics
researchProduct

FeatherCNN: Fast Inference Computation with TensorGEMM on ARM Architectures

2020

Deep learning is ubiquitous in a wide range of applications, from research to industry. Compared with the time-consuming iterative training of convolutional neural networks (CNNs), inference is a relatively lightweight operation, making it amenable to execution on mobile devices. Nevertheless, lower latency and higher computational efficiency are crucial to allow for complex models and prolonged battery life. Addressing these challenges, we propose FeatherCNN – a fast inference library for ARM CPUs – targeting the performance ceiling of mobile devices. FeatherCNN employs three key techniques: 1) A highly efficient TensorGEMM (generalized matrix multiplication) routine is app…

020203 distributed computing; Source code; Iterative method; Computer science; Deep learning; Inference; 02 engineering and technology; Parallel computing; Convolutional neural network; Matrix multiplication; ARM architecture; Computational Theory and Mathematics; Hardware and Architecture; Signal Processing; 0202 electrical engineering electronic engineering information engineering; Artificial intelligence

IEEE Transactions on Parallel and Distributed Systems
researchProduct

BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures

2018

Motivation: Modern bioinformatics tools for analyzing large-scale NGS datasets often need to include fast implementations of core sequence alignment algorithms in order to achieve reasonable execution times. We address this need by presenting the BGSA toolkit for optimized implementations of popular bit-parallel global pairwise alignment algorithms on modern microprocessors.

Results: BGSA outperforms Edlib, SeqAn and BitPAl for pairwise edit distance computations and Parasail, SeqAn and BitPAl when using more general scoring schemes for pairwise alignments of a batch of sequence reads on both standard multi-core CPUs and Xeon Phi many-core CPUs. Furthermore, banded edit distance perf…

Statistics and Probability; 0303 health sciences; Multi-core processor; Xeon; Computer science; 030302 biochemistry & molecular biology; Sequence alignment; Sequence Analysis DNA; Parallel computing; Biochemistry; Computer Science Applications; 03 medical and health sciences; Computational Mathematics; Titan (supercomputer); Software; Computational Theory and Mathematics; Edit distance; Molecular Biology; Algorithms; Xeon Phi; 030304 developmental biology

Bioinformatics
researchProduct
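
As a rough illustration of what "bit-parallel" edit distance means, the sketch below uses the bit-vector recurrence of Myers (1999) with the usual adjustment for global distance (the top row of the matrix increases by one per column). This is a swapped-in textbook formulation, not BGSA's code: a single arbitrary-precision Python integer stands in for the machine words or SIMD registers a real toolkit would use, and the name myers_edit_distance is hypothetical.

```python
def myers_edit_distance(pattern: str, text: str) -> int:
    """Global (Levenshtein) edit distance using bit-vector column updates."""
    m = len(pattern)
    if m == 0:
        return len(text)

    # peq[c] has bit i set when pattern[i] == c.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1   # keep only the m bits that belong to the pattern
    high = 1 << (m - 1)   # bit of the last pattern row
    pv, mv = mask, 0      # vertical deltas of column 0 are all +1
    score = m             # D[m][0] = m

    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask   # rows where the horizontal delta is +1
        mh = pv & xh                    # rows where the horizontal delta is -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        # Shift in +1 at row 0: for global distance D[0][j] = j.
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
    return score
```

One column of the dynamic programming matrix is updated with a constant number of word operations, which is where the speedup over cell-by-cell evaluation comes from.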

Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures

2015

In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neo-heterogeneous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. On the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. On the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We present two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Fur…

Smith–Waterman algorithm; Xeon; Computer science; Data parallelism; Hyper-threading; SIMD; Parallel computing; Central processing unit; Computer Systems Organization: Processor Architectures; Intrinsics; Xeon Phi

2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
researchProduct

PUNAS: A Parallel Ungapped-Alignment-Featured Seed Verification Algorithm for Next-Generation Sequencing Read Alignment

2017

The progress of next-generation sequencing has a major impact on medical and genomic research. This technology can now produce billions of short DNA fragments (reads) in a single run. One of the most demanding computational problems faced by almost every sequencing pipeline is short-read alignment; i.e. determining where in the original genome each fragment originated. Most current solutions are based on a seed-and-extend approach, where promising candidate regions (seeds) are first identified and subsequently extended in order to verify whether a full high-scoring alignment actually exists in the vicinity of each seed. Seed verification is the main bottleneck in many state-of-the-art a…

Speedup; Computer science; Genomics; Parallel computing; Computational problem; Genome; Algorithm; DNA sequencing; DNA

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
researchProduct
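
The seed-and-extend idea described in this abstract can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the PUNAS algorithm: seeds are exact k-mer matches found through a dictionary index, verification is a plain ungapped mismatch count, and the function names, k and max_mismatches are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def build_kmer_index(reference: str, k: int) -> Dict[str, List[int]]:
    """Map every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def ungapped_mismatches(read: str, reference: str, start: int) -> Optional[int]:
    """Mismatch count of the read placed at start, or None if it does not fit."""
    if start < 0 or start + len(read) > len(reference):
        return None
    window = reference[start:start + len(read)]
    return sum(1 for r, g in zip(read, window) if r != g)


def seed_and_verify(read: str, reference: str, k: int = 4,
                    max_mismatches: int = 1) -> List[Tuple[int, int]]:
    """Return (start, mismatches) for every candidate that passes verification."""
    index = build_kmer_index(reference, k)
    # Seeding: each shared k-mer proposes one candidate start for the whole read.
    candidates = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            candidates.add(pos - offset)
    # Verification: keep candidates whose ungapped alignment is good enough.
    hits = []
    for start in sorted(candidates):
        mismatches = ungapped_mismatches(read, reference, start)
        if mismatches is not None and mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

Because each candidate is checked independently with a simple character comparison, the verification step lends itself to vectorized and multi-threaded execution.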

SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search

2017

Computer architectures continue to develop rapidly towards massively parallel and heterogeneous systems. Thus, easily extensible yet highly efficient parallelization approaches for a variety of platforms are urgently needed. In this paper, we present SWhybrid, a hybrid computing framework for large-scale biological sequence database search in heterogeneous computing environments with multi-core or many-core processing units (PUs) based on the Smith-Waterman (SW) algorithm. To incorporate a diverse set of PUs such as combinations of CPUs, GPUs and Xeon Phis, we abstract them as SIMD vector execution units with different numbers of lanes. We propose a machine model, associated with a unified …

0301 basic medicine; Xeon; Sequence database; Computer science; Interface (computing); Symmetric multiprocessor system; Parallel computing; Set (abstract data type); 03 medical and health sciences; 030104 developmental biology; Software; Computer architecture; SIMD; Massively parallel

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
researchProduct