Search results for "Xeon"

showing 10 items of 20 documents

Pairwise DNA Sequence Alignment Optimization

2015

This chapter presents a parallel implementation of the Smith-Waterman algorithm to accelerate the pairwise alignment of DNA sequences. This algorithm is especially computationally demanding for long DNA sequences. Parallelization approaches are examined in order to deeply explore the inherent parallelism within Intel Xeon Phi coprocessors. This chapter looks at exploiting instruction-level parallelism within 512-bit single instruction multiple data instructions (vectorization) as well as thread-level parallelism over the many cores (multithreading using OpenMP). Between coprocessors, device-level parallelism through the compute power of clusters including Intel Xeon Phi coprocessors using M…

CoprocessorComputer scienceMultithreadingVectorization (mathematics)Parallelism (grammar)SIMDParallel computingHardware_ARITHMETICANDLOGICSTRUCTURESComputerSystemsOrganization_PROCESSORARCHITECTURESIntrinsicsInstruction-level parallelismXeon Phi
researchProduct

Bit-Parallel Approximate Pattern Matching on the Xeon Phi Coprocessor

2014

Bit-parallel pattern matching encodes calculated values in bit arrays. This approach gains its efficiency by performing multiple updates within a machine word. An important parameter is therefore the machine word size (e.g. 32 or 64 bits). With the increasing length of vector registers, the efficient mapping of bit-parallel pattern matching algorithms onto modern high performance computing architectures is becoming increasingly important. In this paper, we investigate an efficient implementation of the Wu-Manber approximate pattern matching algorithm on the Intel Xeon Phi coprocessor. This architecture features a 512-bit long vector processing unit (VPU) as well as a large number of process…

Instruction setCoprocessorSpeedupComputer scienceParallel computingPattern matchingIntrinsicsWord (computer architecture)Xeon PhiVector processor2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing
researchProduct

SWAPHI-LS: Smith-Waterman Algorithm on Xeon Phi coprocessors for Long DNA Sequences

2014

As an optimal method for sequence alignment, the Smith-Waterman (SW) algorithm is widely used. Unfortunately, this algorithm is computationally demanding, especially for long sequences. This has motivated the investigation of its acceleration on a variety of high-performance computing platforms. However, most work in the literature is only suitable for short sequences. In this paper, we present SWAPHI-LS, the first parallel SW algorithm exploiting emerging Xeon Phi coprocessors to accelerate the alignment of long DNA sequences. In SWAPHI-LS, we have investigated three parallelization approaches (naive, tiled, and distributed) in order to deeply explore the inherent parallelism within Xeon P…

Instruction setSmith–Waterman algorithmCoprocessorXeonComputer scienceData parallelismTask parallelismParallel computingSIMDIntrinsicsInstruction-level parallelismXeon Phi2014 IEEE International Conference on Cluster Computing (CLUSTER)
researchProduct

XLCS: A New Bit-Parallel Longest Common Subsequence Algorithm on Xeon Phi Clusters

2019

Finding the longest common subsequence (LCS) of two strings is a classical problem in bioinformatics. A basic approach to solve this problem is based on dynamic programming. As the biological sequence databases are growing continuously, bit-parallel sequence comparison algorithms are becoming increasingly important. In this paper, we present XLCS, a new parallel implementation to accelerate the LCS algorithm on Xeon Phi clusters by performing bit-wise operations. We have designed an asynchronous IO framework to improve the data transfer efficiency. To make full use of the computing resources of Xeon Phi clusters, we use three levels of parallelism: node-level, thread-level and vector-level.…

Longest common subsequence problemDynamic programmingSpeedupComputer scienceComputer clusterAsynchronous I/OCacheSupercomputerAlgorithmXeon Phi2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)
researchProduct

SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors

2014

The maximal sensitivity of the Smith-Waterman (SW) algorithm has enabled its wide use in biological sequence database search. Unfortunately, the high sensitivity comes at the expense of quadratic time complexity, which makes the algorithm computationally demanding for big databases. In this paper, we present SWAPHI, the first parallelized algorithm employing Xeon Phi coprocessors to accelerate SW protein database search. SWAPHI is designed based on the scale-and-vectorize approach, i.e. it boosts alignment speed by effectively utilizing both the coarse-grained parallelism from the many co-processing cores (scale) and the fine-grained parallelism from the 512-bit wide single instruction, mul…

Smith–Waterman algorithmFOS: Computer and information sciencesMulti-core processorCoprocessorSpeedupSequence databaseComputer scienceParallel computingIntrinsicsComputer Science - Distributed Parallel and Cluster ComputingScalabilitySIMDDistributed Parallel and Cluster Computing (cs.DC)Xeon Phi
researchProduct

Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures

2015

In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neoheterogenous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. At the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. At the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We have presented two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Fur…

Smith–Waterman algorithmXeonComputer scienceData parallelismHyper-threadingSIMDParallel computingCentral processing unitComputerSystemsOrganization_PROCESSORARCHITECTURESIntrinsicsXeon Phi2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
researchProduct

The Dynamical Kernel Scheduler - Part 1

2015

Emerging processor architectures such as GPUs and Intel MICs provide a huge performance potential for high performance computing. However developing software using these hardware accelerators introduces additional challenges for the developer such as exposing additional parallelism, dealing with different hardware designs and using multiple development frameworks in order to use devices from different vendors. The Dynamic Kernel Scheduler (DKS) is being developed in order to provide a software layer between host application and different hardware accelerators. DKS handles the communication between the host and device, schedules task execution, and provides a library of built-in algorithms. …

Speedup010308 nuclear & particles physicsComputer sciencebusiness.industryFast Fourier transformGeneral Physics and AstronomyFOS: Physical sciencesParallel computingComputational Physics (physics.comp-ph)Supercomputer01 natural sciencesCUDASoftwareKernel (image processing)Hardware and Architecture0103 physical sciencesHardware acceleration010306 general physicsbusinessPhysics - Computational PhysicsXeon Phi
researchProduct

Optimization of Reactive Force Field Simulation: Refactor, Parallelization, and Vectorization for Interactions

2022

Molecular dynamics (MD) simulations are playing an increasingly important role in many areas ranging from chemical materials to biological molecules. With the continuing development of MD models, the potentials are getting larger and more complex. In this article, we focus on the reactive force field (ReaxFF) potential from LAMMPS to optimize the computation of interactions. We present our efforts on refactoring for neighbor list building, bond order computation, as well as valence angles and torsion angles computation. After redesigning these kernels, we develop a vectorized implementation for non-bonded interactions, which is nearly $100 \times$ 100 × faster than the management processing…

SpeedupComputational Theory and MathematicsXeonHardware and ArchitectureComputer scienceComputationSignal ProcessingVectorization (mathematics)Node (circuits)Parallel computingSupercomputerForce field (chemistry)Sunway TaihuLightIEEE Transactions on Parallel and Distributed Systems
researchProduct

BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures

2018

Abstract Motivation Modern bioinformatics tools for analyzing large-scale NGS datasets often need to include fast implementations of core sequence alignment algorithms in order to achieve reasonable execution times. We address this need by presenting the BGSA toolkit for optimized implementations of popular bit-parallel global pairwise alignment algorithms on modern microprocessors. Results BGSA outperforms Edlib, SeqAn and BitPAl for pairwise edit distance computations and Parasail, SeqAn and BitPAl when using more general scoring schemes for pairwise alignments of a batch of sequence reads on both standard multi-core CPUs and Xeon Phi many-core CPUs. Furthermore, banded edit distance perf…

Statistics and Probability0303 health sciencesMulti-core processorXeonComputer sciencebusiness.industry030302 biochemistry & molecular biologySequence alignmentSequence Analysis DNAParallel computingBiochemistryComputer Science Applications03 medical and health sciencesComputational MathematicsTitan (supercomputer)SoftwareComputational Theory and MathematicsEdit distancebusinessSequence AlignmentMolecular BiologyAlgorithmsSoftwareXeon Phi030304 developmental biologyBioinformatics
researchProduct

FPGA-based Acceleration of Detecting Statistical Epistasis in GWAS

2014

Abstract Genotype-by-genotype interactions (epistasis) are believed to be a significant source of unexplained genetic variation causing complex chronic diseases but have been ignored in genome-wide association studies (GWAS) due to the computational burden of analysis. In this work we show how to benefit from FPGA technology for highly parallel creation of contingency tables in a systolic chain with a subsequent statistical test. We present the implementation for the FPGA-based hardware platform RIVYERA S6-LX150 containing 128 Xilinx Spartan6-LX150 FPGAs. For performance evaluation we compare against the method iLOCi[9]. iLOCi claims to outperform other available tools in terms of accuracy.…

epistasis020203 distributed computing0303 health sciencesXeonWorkstationComputer scienceGenome-wide association study02 engineering and technologycomputer.software_genrelaw.inventioncontingency tables03 medical and health sciencesAccelerationFPGA technologylaw0202 electrical engineering electronic engineering information engineeringGeneral Earth and Planetary SciencesEpistasisGWASData miningpairwise gene-gene interactionField-programmable gate arraycomputer030304 developmental biologyGeneral Environmental ScienceProcedia Computer Science
researchProduct