Search results for "graphics processing units"

showing 10 items of 21 documents

Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures

2016

This is a post-peer-review, pre-copyedit version of an article published in IEEE Transactions on Parallel and Distributed Systems. The final authenticated version is available online at: http://dx.doi.org/10.1109/TPDS.2015.2460247. [Abstract] Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics as they can help to explain genetic influences on diseases. As these studies are time consuming operations, some tools exploit the characteristics of different hardware accelerators (such as GPUs and Xeon Phi coprocessors) to reduce the runtime. Nevertheless, all these approaches are not able t…

0301 basic medicineCoprocessorComputer science0206 medical engineeringAccelerationData modelsSymmetric multiprocessor systemComputational modeling02 engineering and technologyParallel computingSupercomputer03 medical and health sciencesTask (computing)030104 developmental biologyCoprocessorsComputational Theory and MathematicsHardware and ArchitectureSignal ProcessingGeneticsPairwise comparisonComputer architectureGraphics processing units020602 bioinformaticsXeon Phi

researchProduct

GSaaS: A Service to Cloudify and Schedule GPUs

2018

Cloud technology is an attractive infrastructure solution that provides customers with an almost unlimited on-demand computational capacity using a pay-per-use approach, and allows data centers to increase their energy and economic savings by adopting a virtualized resource sharing model. However, resources such as graphics processing units (GPUs), have not been fully adapted to this model. Although, general-purpose computing on graphics processing units (GPGPU) is becoming more and more popular, cloud providers lack of flexibility to manage accelerators, because of the extended use of peripheral component interconnect (PCI) passthrough techniques to attach GPUs to virtual machines (VMs). F…

0301 basic medicineScheduleGeneral Computer ScienceComputer scienceDistributed computingnetworkingCloud computing02 engineering and technologycomputer.software_genre03 medical and health sciencesGPU resource management020204 information systems0202 electrical engineering electronic engineering information engineeringCloud computingGeneral Materials ScienceResource managementplatform virtualizationbusiness.industrycloud computingGeneral EngineeringVirtualizationShared resource030104 developmental biologyVirtual machineScalabilityGPU cloudificationlcsh:Electrical engineering. Electronics. Nuclear engineeringGeneral-purpose computing on graphics processing unitsbusinesscomputerlcsh:TK1-9971IEEE Access

researchProduct

GPU-Based Optimisation of 3D Sensor Placement Considering Redundancy, Range and Field of View

2020

This paper presents a novel and efficient solution for the 3D sensor placement problem based on GPU programming and massive parallelisation. Compared to prior art using gradient-search and mixed-integer based approaches, the method presented in this paper returns optimal or good results in a fraction of the time compared to previous approaches. The presented method allows for redundancy, i.e. requiring selected sub-volumes to be covered by at least n sensors. The presented results are for 3D sensors which have a visible volume represented by cones, but the method can easily be extended to work with sensors having other range and field of view shapes, such as 2D cameras and lidars.

0303 health sciences030306 microbiologyComputer scienceVolume (computing)020207 software engineeringField of view02 engineering and technology3d sensor03 medical and health sciencesRange (mathematics)CUDAComputer engineering0202 electrical engineering electronic engineering information engineeringRedundancy (engineering)Fraction (mathematics)General-purpose computing on graphics processing units2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA)

researchProduct

CUSHAW2-GPU: Empowering Faster Gapped Short-Read Alignment Using GPU Computing

2014

We present CUSHAW2-GPU to accelerate the CUSHAW2 algorithm using compute unified device architecture (CUDA)-enabled GPUs. Two critical GPU computing techniques, namely intertask hybrid CPU-GPU parallelism and tile-based Smith-Waterman map backtracking using CUDA, are investigated to facilitate fast alignments. By aligning both simulated and real reads to the human genome, our aligner yields comparable or better performance compared to BWA-SW, Bowtie2, and GEM. Furthermore, CUSHAW2-GPU with a Tesla K20c GPU achieves significant speedups over the multithreaded CUSHAW2, BWA-SW, Bowtie2, and GEM on the 12 cores of a high-end CPU for both single-end and paired-end alignment.

BacktrackingComputer scienceParallel computingSoftware_PROGRAMMINGTECHNIQUESShort readComputational scienceCUDAParallel processing (DSP implementation)Hardware and ArchitectureParallelism (grammar)Electrical and Electronic EngineeringGeneral-purpose computing on graphics processing unitsSoftwareComputingMethodologies_COMPUTERGRAPHICSIEEE Design & Test

researchProduct

Parallelizing Epistasis Detection in GWAS on FPGA and GPU-Accelerated Computing Systems

2015

This is a post-peer-review, pre-copyedit version of an article published in IEEE - ACM Transactions on Computational Biology and Bioinformatics. The final authenticated version is available online at: http://dx.doi.org/10.1109/TCBB.2015.2389958 [Abstract] High-throughput genotyping technologies (such as SNP-arrays) allow the rapid collection of up to a few million genetic markers of an individual. Detecting epistasis (based on 2-SNP interactions) in Genome-Wide Association Studies is an important but time consuming operation since statistical computations have to be performed for each pair of measured markers. Computational methods to detect epistasis therefore suffer from prohibitively lon…

Computer scienceBioinformaticsDNA Mutational AnalysisGenome-wide association studyParallel computingPolymorphism Single NucleotideSensitivity and SpecificityComputational biologyComputer GraphicsGeneticsComputer architectureField-programmable gate arrayRandom access memoryApplied MathematicsChromosome MappingHigh-Throughput Nucleotide SequencingReproducibility of ResultsField programmable gate arraysEpistasis GeneticSignal Processing Computer-AssistedEquipment DesignRandom access memoryComputing systemsReconfigurable computingEquipment Failure AnalysisTask (computing)EpistasisHost (network)Graphics processing unitsGenome-Wide Association StudyBiotechnology

researchProduct

Towards an Efficient Implementation of an Accurate SPH Method

2020

A modified version of the Smoothed Particle Hydrodynamics (SPH) method is considered in order to overcome the loss of accuracy of the standard formulation. The summation of Gaussian kernel functions is employed, using the Improved Fast Gauss Transform (IFGT) to reduce the computational cost, while tuning the desired accuracy in the SPH method. This technique, coupled with an algorithmic design for exploiting the performance of Graphics Processing Units (GPUs), makes the method promising, as shown by numerical experiments.

Computer scienceGauss transformOrder (ring theory)Smoothed Particle Hydrodynamics Improved Fast Gauss Transform Graphics Processing UnitsSmoothed-particle hydrodynamicsSmoothed Particle Hydrodynamicssymbols.namesakeImproved Fast Gauss TransformGaussian functionsymbolsAlgorithm designGraphics Processing UnitsGraphicsAlgorithmComputingMethodologies_COMPUTERGRAPHICS

researchProduct

Three-dimensional Fuzzy Kernel Regression framework for registration of medical volume data

2013

Abstract In this work a general framework for non-rigid 3D medical image registration is presented. It relies on two pattern recognition techniques: kernel regression and fuzzy c-means clustering. The paper provides theoretic explanation, details the framework, and illustrates its application to implement three registration algorithms for CT/MR volumes as well as single 2D slices. The first two algorithms are landmark-based approaches, while the third one is an area-based technique. The last approach is based on iterative hierarchical volume subdivision, and maximization of mutual information. Moreover, a high performance Nvidia CUDA based implementation of the algorithm is presented. The f…

Computer sciencebusiness.industryImage registrationMutual informationMachine learningcomputer.software_genreFuzzy logicCUDANon-rigid registration Fuzzy regression Mutual information Interpolation GPU computingArtificial IntelligenceSignal ProcessingPattern recognition (psychology)Kernel regressionComputer Vision and Pattern RecognitionArtificial intelligenceData miningGeneral-purpose computing on graphics processing unitsCluster analysisbusinesscomputerSoftwareInterpolationPattern Recognition

researchProduct

Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising Model

2010

A Modern Graphics Processing unit (GPU) is able to perform massively parallel scientific computations at low cost. We extend our implementation of the checkerboard algorithm for the two-dimensional Ising model [T. Preis et al., Journal of Chemical Physics 228 (2009) 4468–4477] in order to overcome the memory limitations of a single GPU which enables us to simulate significantly larger systems. Using multi-spin coding techniques, we are able to accelerate simulations on a single GPU by factors up to 35 compared to an optimized single Central Processor Unit (CPU) core implementation which employs multi-spin coding. By combining the Compute Unified Device Architecture (CUDA) with the Message P…

FOS: Computer and information sciencesComputer scienceMonte Carlo methodGraphics processing unitFOS: Physical sciencesGeneral Physics and AstronomyMathematical Physics (math-ph)Parallel computingGPU clusterComputational Physics (physics.comp-ph)Graphics (cs.GR)Computational scienceCUDAComputer Science - GraphicsHardware and ArchitectureIsing modelCentral processing unitGeneral-purpose computing on graphics processing unitsMassively parallelPhysics - Computational PhysicsMathematical Physics

researchProduct

SIMULATING SPIN MODELS ON GPU: A TOUR

2012

The use of graphics processing units (GPUs) in scientific computing has gathered considerable momentum in the past five years. While GPUs in general promise high performance and excellent performance per Watt ratios, not every class of problems is equally well suitable for exploiting the massively parallel architecture they provide. Lattice spin models appear to be prototypic examples of problems suitable for this architecture, at least as long as local update algorithms are employed. In this review, I summarize our recent experience with the simulation of a wide range of spin models on GPU employing an equally wide range of update algorithms, ranging from Metropolis and heat bath updates,…

Heat bathComputer scienceMonte Carlo methodGeneral Physics and AstronomyStatistical and Nonlinear PhysicsMassively parallel architectureRangingParallel computingComputer Science ApplicationsComputational Theory and MathematicsGeneral-purpose computing on graphics processing unitsGraphicsArchitectureMathematical PhysicsPerformance per wattInternational Journal of Modern Physics C

researchProduct

LightSpMV: Faster CSR-based sparse matrix-vector multiplication on CUDA-enabled GPUs

2015

Compressed sparse row (CSR) is a frequently used format for sparse matrix storage. However, the state-of-the-art CSR-based sparse matrix-vector multiplication (SpMV) implementations on CUDA-enabled GPUs do not exhibit very high efficiency. This has motivated the development of some alternative storage formats for GPU computing. Unfortunately, these alternatives are incompatible with most CPU-centric programs and require dynamic conversion from CSR at runtime, thus incurring significant computational and storage overheads. We present LightSpMV, a novel CUDA-compatible SpMV algorithm using the standard CSR format, which achieves high speed by benefiting from the fine-grained dynamic distribut…

Instruction setCUDASpeedupComputer scienceSparse matrix-vector multiplicationDouble-precision floating-point formatParallel computingGeneral-purpose computing on graphics processing unitsRowSparse matrix2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP)

researchProduct