Search results for "GPU"

showing 10 items of 43 documents

Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising Model

2010

A modern graphics processing unit (GPU) is able to perform massively parallel scientific computations at low cost. We extend our implementation of the checkerboard algorithm for the two-dimensional Ising model [T. Preis et al., Journal of Computational Physics 228 (2009) 4468–4477] in order to overcome the memory limitations of a single GPU, which enables us to simulate significantly larger systems. Using multi-spin coding techniques, we are able to accelerate simulations on a single GPU by factors of up to 35 compared to an optimized single central processing unit (CPU) core implementation which employs multi-spin coding. By combining the Compute Unified Device Architecture (CUDA) with the Message P…

FOS: Computer and information sciences; Computer science; Monte Carlo method; Graphics processing unit; FOS: Physical sciences; General Physics and Astronomy; Mathematical Physics (math-ph); Parallel computing; GPU cluster; Computational Physics (physics.comp-ph); Graphics (cs.GR); Computational science; CUDA; Computer Science - Graphics; Hardware and Architecture; Ising model; Central processing unit; General-purpose computing on graphics processing units; Massively parallel; Physics - Computational Physics; Mathematical Physics

researchProduct

Design Exploration of AES Accelerators on FPGAs and GPUs

2017

Embedded systems are increasingly becoming a key technological component of all kinds of complex technical systems, and an exhaustive analysis of the state of the art of current performance with respect to architectures, design methodologies, test and applications could be very interesting. The Advanced Encryption Standard (AES), based on the well-known Rijndael algorithm, is designed to be easily implemented in hardware and software platforms. General-purpose computing on graphics processing units (GPGPU) is an alternative to reconfigurable accelerators based on FPGA devices. This paper presents a direct comparison between FPGA and GPU used as accelerators for the AES cipher. The res…

AES; OpenCL; GPGPU; Accelerator; FPGA prototyping

researchProduct

Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems

2012

SUMMARY The block Wiedemann (BW) algorithm is frequently used to solve sparse linear systems over GF(2). Iterative sparse matrix–vector multiplication is the most time-consuming operation. The necessity to accelerate this step is motivated by the application of BW to very large matrices used in the linear algebra step of the number field sieve (NFS) for integer factorization. In this paper, we derive an efficient CUDA implementation of this operation by using a newly designed hybrid sparse matrix format. This leads to speedups between 4 and 8 on a single graphics processing unit (GPU) for a number of tested NFS matrices compared with an optimized multicore implementation. We further present…

Block Wiedemann algorithm; Computer Networks and Communications; Computer science; Graphics processing unit; Sparse matrix-vector multiplication; GPU cluster; Parallel computing; GF(2); Computer Science Applications; Theoretical Computer Science; General number field sieve; Matrix (mathematics); Computational Theory and Mathematics; Factorization; Linear algebra; Multiplication; Computer Science::Operating Systems; Software; Integer factorization; Sparse matrix; Concurrency and Computation: Practice and Experience

researchProduct

First Experiences on an Accurate SPH Method on GPUs

2017

It is well known that the standard formulation of Smoothed Particle Hydrodynamics usually performs poorly when scattered data distributions are considered or when the approximation occurs near the boundary. Moreover, the method is computationally demanding when a high number of data sites and evaluation points are employed. In this paper, an enhanced version of the method is proposed, improving accuracy and efficiency by using an HPC environment. Our implementation exploits the processing power of GPUs for the basic computational kernel resolution. The performance gain demonstrates the method to be accurate and suitable to deal with large sets of data.

Speedup; Exploit; GPUs; Computer science; Computer Networks and Communications; GPU; Smoothed Particle Hydrodynamics method; 010103 numerical & computational mathematics; 01 natural sciences; Computational science; Smoothed-particle hydrodynamics; Instruction set; Settore MAT/08 - Analisi Numerica; Artificial Intelligence; Accuracy; Approximation; GPUs; Kernel function; Smoothed particle hydrodynamics method; Speed-Up; Artificial Intelligence; Computer Networks and Communications; 1707; Signal Processing; 0101 mathematics; Approximation; Accuracy; 1707; Random access memory; Linear system; Kernel function; Speed-Up; 010101 applied mathematics; Kernel (statistics); Signal Processing

researchProduct

GSaaS: A Service to Cloudify and Schedule GPUs

2018

Cloud technology is an attractive infrastructure solution that provides customers with almost unlimited on-demand computational capacity using a pay-per-use approach, and allows data centers to increase their energy and economic savings by adopting a virtualized resource sharing model. However, resources such as graphics processing units (GPUs) have not been fully adapted to this model. Although general-purpose computing on graphics processing units (GPGPU) is becoming more and more popular, cloud providers lack the flexibility to manage accelerators because of the widespread use of peripheral component interconnect (PCI) passthrough techniques to attach GPUs to virtual machines (VMs). F…

0301 basic medicine; Schedule; General Computer Science; Computer science; Distributed computing; networking; Cloud computing; 02 engineering and technology; computer.software_genre; 03 medical and health sciences; GPU resource management; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Cloud computing; General Materials Science; Resource management; platform virtualization; business.industry; cloud computing; General Engineering; Virtualization; Shared resource; 030104 developmental biology; Virtual machine; Scalability; GPU cloudification; lcsh:Electrical engineering. Electronics. Nuclear engineering; General-purpose computing on graphics processing units; business; computer; lcsh:TK1-9971; IEEE Access

researchProduct

GPU-accelerated exhaustive search for third-order epistatic interactions in case–control studies

2015

This is a post-peer-review, pre-copyedit version of an article published in Journal of Computational Science. The final authenticated version is available online at: https://doi.org/10.1016/j.jocs.2015.04.001 [Abstract] Interest in discovering combinations of genetic markers from case–control studies, such as Genome Wide Association Studies (GWAS), that are strongly associated with diseases has increased in recent years. Detecting epistasis, i.e. interactions among k markers (k ≥ 2), is an important but time-consuming operation, since statistical computations have to be performed for each k-tuple of measured markers. Efficient exhaustive methods have been proposed for k = 2, but exhaustive thi…

Theoretical computer science; Source code; General Computer Science; Computer science; Computation; media_common.quotation_subject; GPU; Brute-force search; CUDA; Mutual information; computer.software_genre; Theoretical Computer Science; Mutual information; CUDA; Modeling and Simulation; Epistasis; GWAS; Node (circuits); Data mining; Tuple; Heuristics; computer; media_common; Journal of Computational Science

researchProduct

An Efficient Implementation of Parallel Parametric HRTF Models for Binaural Sound Synthesis in Mobile Multimedia

2020

The extended use of mobile multimedia devices in applications like gaming, 3D video and audio reproduction, immersive teleconferencing, or virtual and augmented reality, is demanding efficient algorithms and methodologies. All these applications require real-time spatial audio engines with the capability of dealing with intensive signal processing operations while facing a number of constraints related to computational cost, latency and energy consumption. Most mobile multimedia devices include a Graphics Processing Unit (GPU) that is primarily used to accelerate video processing tasks, providing high computational capabilities due to its inherent parallel architecture. This paper describes…

interpolation; General Computer Science; parallel filters; Computer science; GPU; Gpu; Graphics processing unit; Latency (audio); Parametric model; 02 engineering and technology; computer.software_genre; 030507 speech-language pathology & audiology; 03 medical and health sciences; Software portability; HRTF modeling; 0202 electrical engineering electronic engineering information engineering; General Materials Science; Multimedia; parametric model; General Engineering; Teleconference; Binaural synthesis; 020206 networking & telecommunications; Video processing; Energy consumption; interpolation; Interpolation; Hrtf modeling; Scalability; Parallel filters; Electrónica; Augmented reality; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0305 other medical science; lcsh:TK1-9971; Mobile device; computer; IEEE Access

researchProduct

Real-time data processing in the ALICE High Level Trigger at the LHC

2019

At the Large Hadron Collider at CERN in Geneva, Switzerland, atomic nuclei are collided at ultra-relativistic energies. Many final-state particles are produced in each collision and their properties are measured by the ALICE detector. The detector signals induced by the produced particles are digitized, leading to data rates that are in excess of 48 GB/s. The ALICE High Level Trigger (HLT) system pioneered the use of FPGA- and GPU-based algorithms to reconstruct charged-particle trajectories and reduce the data size in real time. The results of the reconstruction of the collision events, available online, are used for high level data quality and detector-performance monitoring and real-tim…

calibration; ALICE; trigger; monitoring; quality; data management; programming; FPGA; multiprocessor: graphics; performance; Physics - Instrumentation and Detectors; High level trigger; Physics::Instrumentation and Detectors; Level data; tutkimuslaitteet; FPGA; GPU; Detector calibration; GPU; FOS: Physical sciences; General Physics and Astronomy; hiukkasfysiikka; Physics and Astronomy(all); 01 natural sciences; programming; 010305 fluids & plasmas; Combinatorics; ALICE; 0103 physical sciences; multiprocessor: graphics; [INFO]Computer Science [cs]; [PHYS.PHYS.PHYS-INS-DET]Physics [physics]/Physics [physics]/Instrumentation and Detectors [physics.ins-det]; Detectors and Experimental Techniques; 010306 general physics; Nuclear Experiment; physics.ins-det; FPGA; computer.programming_language; Physics; Large Hadron Collider; FPGA; GPU; TRACK; signaalinkäsittely; Instrumentation and Detectors (physics.ins-det); trigger; calibration; monitoring; data; ilmaisimet; quality; Hardware and Architecture; TRACK; High Energy Physics::Experiment; data management; Alice (programming language); computer; performance

researchProduct

A GPU-accelerated augmented Lagrangian based L1-mean curvature image denoising algorithm implementation

2015

This paper presents a graphics processing unit (GPU) implementation of a recently published augmented Lagrangian based L1-mean curvature image denoising algorithm. The algorithm uses a particular alternating direction method of multipliers to reduce the related saddle-point problem to an iterative sequence of four simpler minimization problems. Two of these subproblems do not contain the derivatives of the unknown variables and can therefore be solved point-wise without inter-process communication. In particular, this facilitates the efficient solution of the subproblem that deals with the non-convex term in the original objective function by modern GPUs. The two remaining subproblems are so…

GPU výpočty; OpenCL; image denoising; odstranění šumu z obrazu; mean curvature; kuvankäsittely; střední zakřivení; augmented Lagrangian method; GPU computing; zpracování obrazu; rozšířená Lagrangianova metoda; image processing

researchProduct

WiseEye: A Platform to Manage and Experiment on Smart Camera Networks

2016

Embedded vision is probably on the verge of a phenomenal expansion. Smart cameras embed processing units that are increasingly powerful. Over the last decade, high-speed image processing could be implemented on specifically designed architectures [1]; nevertheless, the design time of such systems was quite high, and so was the time to market. Since then, powerful chips (i.e., Systems on Chip) and rapid prototyping methodologies have been constantly emerging [2],[3],[4] and enable more complex algorithms to be implemented faster. Moreover, smart cameras that embed flexible and powerful multi-core processors or Graphics Processing Units (GPUs) are now available and …

Real-time Image processing; fall detection; Smart Camera; Multi-core processor; GPU; smart building; [INFO.INFO-ES]Computer Science [cs]/Embedded Systems; [ INFO.INFO-ES ] Computer Science [cs]/Embedded Systems; control access; photopletysmography; [INFO.INFO-ES] Computer Science [cs]/Embedded Systems

researchProduct