Search results for "GPU"

showing 10 items of 43 documents

Cryptocurrency mining, its solutions and implementation at home

2019

The aim of this bachelor's thesis is to deepen knowledge of cryptocurrency mining: its types, associated costs, the participants involved, its environmental impact, and opinions about it. To determine the most efficient mining options, the author set up cryptocurrency mining on a laptop and analyzed the resulting data. The thesis examines cryptocurrency mining with graphics cards, CPUs, ASICs, and FPGA devices. Their advantages and drawbacks are assessed, along with which of them are still used in practice and which no longer are, as well as a classification by the number of participants involved. The main expenses to be reckoned with when considering taking up mining are listed. As a result of the work, …

Computer science; ASIC; cryptocurrency; GPU; cryptocurrency mining; CPU

researchProduct

3D Sensor-Based Obstacle Detection Comparing Octrees and Point clouds Using CUDA

2012

This paper presents adaptable methods for achieving fast collision detection using the GPU and Nvidia CUDA together with Octrees. Earlier related work has focused on serial methods, while this paper presents a parallel solution which shows a substantial reduction in run time when the number of operations is large. Two different models of the environment and the industrial robot are presented: the first is Octrees at different resolutions, the second is a point cloud representation. The relative merits of the two different world model representations are shown. In particular, the experimental results show the potential of adapting the resolution of the robot and environment models to the t…

Collision Detection; GPU; Industrial Robot; lcsh:Electronic computers. Computer science; lcsh:QA75.5-76.95; Modeling, Identification and Control

researchProduct

GPU accelerated Monte Carlo simulations of lattice spin models

2011

We consider Monte Carlo simulations of classical spin models of statistical mechanics using the massively parallel architecture provided by graphics processing units (GPUs). We discuss simulations of models with discrete and continuous variables, using an array of algorithms ranging from single-spin-flip Metropolis updates through cluster algorithms to multicanonical and Wang-Landau techniques, to judge the scope and limitations of GPU accelerated computation in this field. For most simulations discussed, we find significant speed-ups by two to three orders of magnitude as compared to single-threaded CPU implementations.

cluster algorithms; Statistical Mechanics (cond-mat.stat-mech); Computer science; Computation; Numerical analysis; spin models; Monte Carlo method; High Energy Physics - Lattice (hep-lat); FOS: Physical sciences; Statistical mechanics; GPU computing; Physics and Astronomy (all); Computational Physics (physics.comp-ph); generalized-ensemble simulations; Monte Carlo simulations; Computational science; CUDA; High Energy Physics - Lattice; Spin model; General-purpose computing on graphics processing units; Graphics; Physics - Computational Physics; Condensed Matter - Statistical Mechanics

researchProduct

Future Processor Hardware Architectures for the Benefit of Precise Particle Accelerator Modeling

2017

New processor architectures, such as graphics processing units (GPUs) and Intel Many Integrated Core (MIC) processors, offer enormous performance potential for high-performance computing applications. However, developing software that can exploit these new technologies brings various additional difficulties. Programs must be able to use the additional parallelism these devices offer, adapt to different processor architectures, and use different development platforms so that the application can run on devices from different vendors. The Dynamic Kernel Scheduler (DKS) was developed to provide an additional software layer between the program and the co-processors. DKS …

Computer science; GPU computing

researchProduct

Lattice Boltzmann Simulations at Petascale on Multi-GPU Systems with Asynchronous Data Transfer and Strictly Enforced Memory Read Alignment

2015

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units have become accessible as high-performance computing resources at large scale. We report on implementing a lattice Boltzmann solver for multi-GPU systems that achieves 0.69 PFLOPS performance on 16384 GPUs. In addition to optimizing the data layout on the GPUs and eliminating the halo sites, we make use of the possibility to overlap data transfer between the host CPU and the device GPU with computing on the GPU. We simulate flow in porous media and measure both strong and weak scaling performance with the emphasis being on a large scale…

ta113; ta114; Computer science; Lattice Boltzmann methods; GPU; Parallel computing; Solver; Lattice Boltzmann; memory alignment; Computational science; Petascale computing; Asynchronous communication; Data structure alignment; Graphics; asynchronous communication; Titan; Host (network); ComputingMethodologies_COMPUTERGRAPHICS; Data transmission; Euromicro international conference on parallel, distributed and network-based processing

researchProduct

Fourth Workshop on using Emerging Parallel Architectures

2012

The Fourth Workshop on Using Emerging Parallel Architectures (WEPA), held in conjunction with ICCS 2012, provides a forum for exploring the capabilities of emerging parallel architectures such as GPUs, FPGAs, Cell B.E., Intel M.I.C. and multicores to accelerate computational science applications.

OpenCL; GPGPU; Heterogeneous Multi-cores; Reconfigurable Computing; High Performance Computing; General Earth and Planetary Sciences; CUDA; Computational Science; Parallel Computer Architectures; General Environmental Science; Procedia Computer Science

researchProduct

Accelerating metagenomic read classification on CUDA-enabled GPUs.

2016

Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…

0301 basic medicine; Theoretical computer science; Workstation; GPUs; Computer science; Context (language use); CUDA; Parallel computing; Biochemistry; Genome; law.invention; 03 medical and health sciences; CUDA; User-Computer Interface; 0302 clinical medicine; Structural Biology; law; Taxonomic assignment; Humans; Microbiome; Molecular Biology; Internet; Xeon; Applied Mathematics; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Exact k-mer matching; Computer Science Applications; 030104 developmental biology; Titan (supercomputer); Metagenomics; 030220 oncology & carcinogenesis; Metagenomics; DNA microarray; Software; BMC bioinformatics

researchProduct

A prospect for computing in porous materials research: Very large fluid flow simulations

2016

Properties of porous materials, abundant both in nature and industry, have broad influences on societies via, e.g., oil recovery, erosion, and propagation of pollutants. The internal structure of many porous materials involves multiple scales, which hinders research on the relation between structure and transport properties: typically, laboratory experiments cannot distinguish contributions from individual scales, while computer simulations cannot capture multiple scales due to limited capabilities. Thus the question arises how large domain sizes can in fact be simulated with modern computers. This question is here addressed using a realistic test case; it is demonstrated that current …

General Computer Science; Computer science; Lattice Boltzmann method; 0208 environmental biotechnology; GPU; Lattice Boltzmann methods; 02 engineering and technology; Parallel computing; 01 natural sciences; Permeability; 010305 fluids & plasmas; Theoretical Computer Science; Computational science; Porous material; Petascale computing; 0103 physical sciences; Fluid dynamics; Fluid flow simulation; Porosity; ta113; ta114; Supercomputer; 020801 environmental engineering; Addressing mode; Permeability (earth sciences); Petascale computing; Modeling and Simulation; Porous medium; Journal of Computational Science

researchProduct

General-purpose computing on graphics processors

2012

The structure and operating principles of modern graphics processors are presented, and OpenCL is examined as a means of using their computing power for more general computation. Part of the JPEG image compression algorithm is implemented on a graphics processor using OpenCL.

OpenCL; JPEG; GPGPU; graphics processor

researchProduct

Image processing applications in object detection and graph matching : from Matlab development to GPU framework

2020

Automatically finding correspondences between object features in images is of major interest for several applications, such as object detection and tracking, flow velocity estimation, identification, registration, and many derived tasks. In this thesis, we address feature correspondence within the general framework of graph matching optimization, with the principal aim of contributing, as a final step, to the design of new parallel algorithms and their implementation on GPU (Graphics Processing Unit) systems. Graph matching problems can take many forms, depending on the assumptions of the application at hand. We observed a gap between applications based on local cost objective functio…

Optimization; La détection d’objet; [INFO.INFO-CV] Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; Image processing; Distributed local search; Gpu; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; L’appariement de graphe; Optimisation; Graph matching; Object tracking; Traitement d'image; Recherche locale distribuée

researchProduct