Search results for "GPU"

showing 10 items of 43 documents

Future Processor Hardware Architectures for the Benefit of Precise Particle Accelerator Modeling

2017

New processor architectures, such as graphics processing units (GPUs) and Intel Many Integrated Core (MIC) processors, offer enormous performance potential for high-performance computing applications. However, developing software that can exploit these new technologies brings various additional difficulties. Programs must be able to use the additional parallelism these devices offer, adapt to different processor architectures, and use different development platforms so that the application can run on devices from different vendors. The Dynamic Kernel Scheduler (DKS) was developed to provide an additional software layer between the application and the co-processors. DKS …

Datorzinātnes; GPU computing; Computer science
researchProduct

Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising Model

2010

A modern graphics processing unit (GPU) can perform massively parallel scientific computations at low cost. We extend our implementation of the checkerboard algorithm for the two-dimensional Ising model [T. Preis et al., Journal of Computational Physics 228 (2009) 4468–4477] to overcome the memory limitations of a single GPU, which enables us to simulate significantly larger systems. Using multi-spin coding techniques, we accelerate simulations on a single GPU by factors of up to 35 compared with an optimized single central processing unit (CPU) core implementation that employs multi-spin coding. By combining the Compute Unified Device Architecture (CUDA) with the Message P…

FOS: Computer and information sciences; Computer science; Monte Carlo method; Graphics processing unit; FOS: Physical sciences; General Physics and Astronomy; Mathematical Physics (math-ph); Parallel computing; GPU cluster; Computational Physics (physics.comp-ph); Graphics (cs.GR); Computational science; CUDA; Computer Science - Graphics; Hardware and Architecture; Ising model; Central processing unit; General-purpose computing on graphics processing units; Massively parallel; Physics - Computational Physics; Mathematical Physics
researchProduct

Optical sectioning microscopy through single-shot Lightfield protocol

2020

Optical sectioning microscopy is usually performed by means of a scanning, multi-shot procedure in combination with non-uniform illumination. In this paper, we change the paradigm and report a method that is based on the light field concept and that provides optical sectioning for 3D microscopy images after a single-shot capture. To do this, we first capture multiple orthographic perspectives of the sample by means of Fourier-domain integral microscopy (FiMic). The second stage of our protocol is the application of a novel refocusing algorithm that produces optical sectioning in real time, with no loss of resolution, in the case of sparse fluorescent samples. We provide the…

FiMic; General Computer Science; Optical sectioning; Computer science; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 02 engineering and technology; 3d microscopy; 01 natural sciences; 010309 optics; Optics; 0103 physical sciences; Microscopy; General Materials Science; Protocol (object-oriented programming); Fourier integral microscope; business.industry; Resolution (electron density); Orthographic projection; General Engineering; Single shot; fourier lightfield microscope; GPU computing; Òptica; 021001 nanoscience & nanotechnology; Sample (graphics); Microscòpia; lightfield microscope; optical sectioning; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0210 nano-technology; business; lcsh:TK1-9971
researchProduct

A GPU-accelerated augmented Lagrangian based L1-mean curvature Image denoising algorithm implementation

2015

This paper presents a graphics processing unit (GPU) implementation of a recently published augmented Lagrangian based L1-mean curvature image denoising algorithm. The algorithm uses a particular alternating direction method of multipliers to reduce the related saddle-point problem to an iterative sequence of four simpler minimization problems. Two of these subproblems do not contain the derivatives of the unknown variables and can therefore be solved point-wise without inter-process communication. In particular, this facilitates the efficient solution, on modern GPUs, of the subproblem that deals with the non-convex term in the original objective function. The two remaining subproblems are so…

GPU výpočty; OpenCL; image denoising; odstranění šumu z obrazu; mean curvature; kuvankäsittely; střední zakřivení; augmented Lagrangian method; GPU computing; zpracování obrazu; rozšířená Lagrangianova metoda; image processing
researchProduct

Advanced numerical treatment of an accurate SPH method

2019

The summation of Gaussian kernel functions is an expensive operation frequently encountered in scientific simulation algorithms, and several methods have already been proposed to reduce its computational cost. In this work, the Improved Fast Gauss Transform (IFGT) [1] is applied to the Smoothed Particle Hydrodynamics (SPH) method [2] to improve its efficiency. A modified version of the SPH method is considered in order to overcome the loss of accuracy of the standard formulation [3]. A suitable use of the IFGT allows us to reduce the computational effort while tuning the desired accuracy within the SPH framework. This technique, coupled with an algorithmic design for exploit…

GPUs; SPH; IFGT
researchProduct

A prospect for computing in porous materials research: Very large fluid flow simulations

2016

Properties of porous materials, abundant both in nature and in industry, have broad influences on societies through, for example, oil recovery, erosion, and the propagation of pollutants. The internal structure of many porous materials involves multiple scales, which hinders research on the relation between structure and transport properties: typically, laboratory experiments cannot distinguish contributions from individual scales, while computer simulations cannot capture multiple scales due to limited capabilities. Thus the question arises of how large a domain can in fact be simulated with modern computers. This question is addressed here using a realistic test case; it is demonstrated that current …

General Computer Science; Computer science; Lattice Boltzmann method; 0208 environmental biotechnology; GPU; Lattice Boltzmann methods; 02 engineering and technology; Parallel computing; 01 natural sciences; Permeability; 010305 fluids & plasmas; Theoretical Computer Science; Computational science; Porous material; Petascale computing; 0103 physical sciences; Fluid dynamics; Fluid flow simulation; Porosity; ta113; ta114; Supercomputer; 020801 environmental engineering; Addressing mode; Permeability (earth sciences); Petascale computing; Modeling and Simulation; Porous medium; Journal of Computational Science
researchProduct

GPU-laskennan optimointi

2013

Graphics cards, or graphics processing units, provide a parallel computing platform on which program code can be executed by hundreds of cores. This platform makes it possible to solve mathematically laborious problems efficiently. However, the parallel execution environment of a graphics processor differs greatly from the sequential execution environment of a computer's CPU. To solve problems efficiently in a parallel environment, one must follow programming methods that are specifically suited to it. This work examines the fundamentals of parallel computing, how different programming methods affect a program's performance on a graphics processor, and how …

Graphics processing unit; näytönohjaimet; optimointi; näytönohjain; parallel computing; GPU; rinnakkainen laskenta; Grafiikkasuoritin; CUDA; ohjelmointi; optimization
researchProduct

Architecture-Driven Level Set Optimization: From Clustering to Sub-pixel Image Segmentation

2016

Thanks to their effectiveness, active contour models (ACMs) are of great interest to computer vision scientists. Level set methods (LSMs) belong to the class of geometric active contours. Compared with other ACMs, in addition to subpixel accuracy, they have the intrinsic ability to handle topological changes automatically. Nevertheless, LSMs are computationally expensive. One solution to their time consumption problem is hardware acceleration using massively parallel devices such as graphics processing units (GPUs). But the question is: what accuracy can we reach while still keeping the algorithm suited to a massively parallel architecture? In this paper, we attempt to…

Level set method; Computer science; 0211 other engineering and technologies; Initialization; 02 engineering and technology; [ SPI.SIGNAL ] Engineering Sciences [physics]/Signal and Image processing; Level set; graphics processing units; 0202 electrical engineering electronic engineering information engineering; Level set method; Computer vision; Electrical and Electronic Engineering; Cluster analysis; Massively parallel; image segmentation; 021101 geological & geomatics engineering; Active contour model; hybrid CPU-GPU architecture; business.industry; Image segmentation; Subpixel rendering; Computer Science Applications; Human-Computer Interaction; Control and Systems Engineering; Hardware acceleration; 020201 artificial intelligence & image processing; Artificial intelligence; business; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Software; Information Systems
researchProduct

CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions

2013

Background: The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. Results: We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU …

Methodology Article; GPU; CUDA; Software_PROGRAMMINGTECHNIQUES; Biochemistry; Computer Science Applications; Smith-Waterman; Concurrent execution; Sequence Analysis Protein; PTX SIMD instructions; Databases Protein; Molecular Biology; Sequence Alignment; Algorithms; Software; BMC Bioinformatics
researchProduct

Real-time Sound Source Localization on Graphics Processing Units

2013

Sound source localization is an important topic in microphone array signal processing applications, such as camera steering systems, human-machine interaction, or surveillance systems. The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is one of the best-known approaches to sound source localization because of its good performance in noisy and reverberant environments. The algorithm analyzes the sound power captured by a microphone array on a grid of spatial points in a given room. While localization accuracy can be improved by using a high-resolution spatial grid and a large number of microphones, performing the localization task in these circumstances requires …

Microphone array; Coprocessor; Computer science; business.industry; Audio Processing; GPU; Microphone Arrays; Acoustic source localization; Sound power; Grid; computer.software_genre; Sound Source Localization; Computational science; General Earth and Planetary Sciences; Graphics; business; Audio signal processing; computer; Computer hardware; General Environmental Science; Procedia Computer Science
researchProduct