
AUTHOR

Jan Westerholm

showing 6 related works from this author

Lattice Boltzmann Simulations at Petascale on Multi-GPU Systems with Asynchronous Data Transfer and Strictly Enforced Memory Read Alignment

2015

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units (GPUs) have become accessible as high-performance computing resources at large scale. We report on implementing a lattice Boltzmann solver for multi-GPU systems that achieves 0.69 PFLOPS performance on 16384 GPUs. In addition to optimizing the data layout on the GPUs and eliminating the halo sites, we overlap data transfer between the host CPU and the device GPU with computation on the GPU. We simulate flow in porous media and measure both strong and weak scaling performance, with the emphasis being on a large scale…

ta113; ta114; Computer science; Lattice Boltzmann methods; GPU; Parallel computing; Solver; Lattice Boltzmann; memory alignment; Computational science; Petascale computing; Asynchronous communication; Data structure alignment; Graphics; asynchronous communication; Titan; Host (network); ComputingMethodologies_COMPUTERGRAPHICS; Data transmission; Euromicro international conference on parallel, distributed and network-based processing
researchProduct

A prospect for computing in porous materials research: Very large fluid flow simulations

2016

Properties of porous materials, abundant both in nature and industry, have broad influences on societies via, e.g., oil recovery, erosion, and propagation of pollutants. The internal structure of many porous materials involves multiple scales, which hinders research on the relation between structure and transport properties: typically, laboratory experiments cannot distinguish contributions from individual scales, while computer simulations cannot capture multiple scales due to limited capabilities. Thus the question arises of how large a domain can in fact be simulated with modern computers. This question is addressed here using a realistic test case; it is demonstrated that current …

General Computer Science; Computer science; Lattice Boltzmann method; 0208 environmental biotechnology; GPU; Lattice Boltzmann methods; 02 engineering and technology; Parallel computing; 01 natural sciences; Permeability; 010305 fluids & plasmas; Theoretical Computer Science; Computational science; Porous material; Petascale computing; 0103 physical sciences; Fluid dynamics; Fluid flow simulation; Porosity; ta113; ta114; Supercomputer; 020801 environmental engineering; Addressing mode; Permeability (earth sciences); Petascale computing; Modeling and Simulation; Porous medium; Journal of Computational Science

An efficient swap algorithm for the lattice Boltzmann method

2007

During the last decade, the lattice-Boltzmann method (LBM) has been increasingly acknowledged as a valuable tool in computational fluid dynamics. The widespread application of LBM is partly due to the simplicity of its coding. The best-known algorithms for implementing the standard lattice-Boltzmann equation (LBE) are the two-lattice and two-step algorithms. However, implementations of the two-lattice algorithm suffer from high memory consumption, and implementations of the two-step algorithm suffer from poor computational performance. Ultimately, the computing resources available decide which of the two disadvantages is more critical. Here we introduce a new algorithm, called the swap algorithm, for t…

Computer simulation; Computer science; business.industry; Lattice Boltzmann methods; General Physics and Astronomy; Computational fluid dynamics; Program optimization; Nonlinear Sciences::Cellular Automata and Lattice Gases; High memory; Hardware and Architecture; business; Algorithm; Implementation; Swap (computer programming); Coding (social sciences); Computer Physics Communications

Designing a graphics processing unit accelerated petaflop capable lattice Boltzmann solver: Read aligned data layouts and asynchronous communication

2016

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units (GPUs) have become available as high-performance computing resources at large scale. We report on designing and implementing a lattice Boltzmann solver for multi-GPU systems that achieves 1.79 PFLOPS performance on 16,384 GPUs. To achieve this performance, we introduce a GPU compatible version of the so-called bundle data layout and eliminate the halo sites in order to improve data access alignment. Furthermore, we make use of the possibility to overlap data transfer between the host central processing unit and the device GPU with com…

virtauslaskenta (flow computation); large-scale I/O; Computer science; Graphics processing unit; Lattice Boltzmann methods; computational fluid dynamics; Parallel computing; graphics processing unit; 01 natural sciences; memory alignment; processors; 010305 fluids & plasmas; Theoretical Computer Science; 0103 physical sciences; Data structure alignment; 0101 mathematics; Graphics; ComputingMethodologies_COMPUTERGRAPHICS; ta113; data layout; ta114; prosessorit (processors); Solver; Lattice Boltzmann; 010101 applied mathematics; Data access; Hardware and Architecture; Asynchronous communication; Central processing unit; asynchronous communication; Titan; Software; The International Journal of High Performance Computing Applications

Designing a graphics processing unit accelerated petaflop capable lattice Boltzmann solver: Read aligned data layouts and asynchronous communication

2017

The lattice Boltzmann method is a well-established numerical approach for complex fluid flow simulations. Recently, general-purpose graphics processing units (GPUs) have become available as high-performance computing resources at large scale. We report on designing and implementing a lattice Boltzmann solver for multi-GPU systems that achieves 1.79 PFLOPS performance on 16,384 GPUs. To achieve this performance, we introduce a GPU compatible version of the so-called bundle data layout and eliminate the halo sites in order to improve data access alignment. Furthermore, we make use of the possibility to overlap data transfer between the host central processing unit and the device GPU with comp…

load balance; data layout; large-scale I/O; virtauslaskenta (flow computation); prosessorit (processors); asynchronous communication; graphics processing unit; Titan; Lattice Boltzmann; memory alignment; ComputingMethodologies_COMPUTERGRAPHICS

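Yläkouluikäisten nuorten fyysisen aktiivisuuden yhteydet tupakointiin ja alkoholinkäyttöön [Associations of physical activity with smoking and alcohol use among lower secondary school students]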

2015

Lonka, Aleksi & Westerholm, Jan. 2015. Yläkouluikäisten nuorten fyysisen aktiivisuuden yhteydet tupakointiin ja alkoholinkäyttöön [Associations of physical activity with smoking and alcohol use among lower secondary school students]. Liikuntakasvatuksen laitos (Department of Physical Education), University of Jyväskylä. Master's thesis in sport pedagogy. 86 pp., 1 appendix. The purpose of our study was to examine how physical activity among lower secondary school students (grades 8-9) is associated with smoking and alcohol use. We also examined whether smoking and alcohol use among young people who regularly take part in sports club activities differ from the substance use habits of their peers. In addition, we examined the association of screen time with smoking and alcohol use. In the results, differences were compared between genders and grade levels…

young people; smoking; health behaviour; screen time; physical exercise; alcohol use; physical activity