Search results for "GPU"

Showing 3 of 43 documents

GPGPU-säteenseuranta (GPGPU ray tracing)

2012

Ray tracing is a parallelizable and computationally intensive way of producing three-dimensional computer graphics. General-purpose graphics processors (GPGPU) are efficient parallel processors that can be used to accelerate ray tracing. This thesis discusses implementing ray tracing on general-purpose graphics processors and presents a structure for a simple GPGPU ray tracing program. It also covers implementing methods that accelerate ray tracing, such as acceleration structures, with GPGPU computing.

ray tracing; GPGPU; GPU; säteenseuranta; raytracing
researchProduct

Lattice Boltzmann Simulations at Petascale on Multi-GPU Systems with Asynchronous Data Transfer and Strictly Enforced Memory Read Alignment

2015

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units have become accessible as high-performance computing resources at large scale. We report on implementing a lattice Boltzmann solver for multi-GPU systems that achieves 0.69 PFLOPS performance on 16384 GPUs. In addition to optimizing the data layout on the GPUs and eliminating the halo sites, we overlap data transfer between the host CPU and the device GPU with computation on the GPU. We simulate flow in porous media and measure both strong and weak scaling performance with the emphasis being on a large scale…

ta113; ta114; Computer science; Lattice Boltzmann methods; GPU; Parallel computing; Solver; Lattice Boltzmann; memory alignment; Computational science; Petascale computing; Asynchronous communication; Data structure alignment; Graphics; asynchronous communication; Titan; Host (network); ComputingMethodologies_COMPUTERGRAPHICS; Data transmission; Euromicro international conference on parallel, distributed and network-based processing
researchProduct

Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units

2021

We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors and for graphics accelerators (GPUs) are built upon common principles but differ in the implementation details, which are adapted to avoid thread divergence in the GPU case or to maximize compression element-wise (i.e., for each matrix entry) for multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs as well as In…

workload balancing; Multi-core processor; Computer Networks and Communications; Computer science; sparse matrix-vector product; Parallel computing; Load balancing (computing); coordinate sparse matrix format; Sparse matrix vector; compression; Exascale computing; Computer Science Applications; Theoretical Computer Science; Computational Theory and Mathematics; Compression (functional analysis); Product (mathematics); Graphics; graphics processing units (GPUs); multicore processors (CPUs); Software
researchProduct