RESEARCH PRODUCT
On solving separable block tridiagonal linear systems using a GPU implementation of radix-4 PSCR method
Tuomo Rossi, Mirko Myllykoski, Jari Toivanen

subject

Tridiagonal linear systems; Computer Networks and Communications; Computer science; Partial solution technique; reduction; 010103 numerical & computational mathematics; Parallel computing; 01 natural sciences; Theoretical Computer Science; Separable space; information technology; Artificial Intelligence; Separable block tridiagonal linear system; Block (telecommunications); Fast direct solver; Radix; 0101 mathematics; ta113; Computer Sciences; ta111; Linear system; Software Engineering; GPU computing; Solver; Computer Science::Numerical Analysis; 010101 applied mathematics; PSCR method; Hardware and Architecture; Computer Science::Mathematical Software; linear models; Software; Roofline model; Cyclic reduction

description
The partial solution variant of the cyclic reduction (PSCR) method is a direct solver that can be applied to certain types of separable block tridiagonal linear systems. Such linear systems arise, e.g., from the Poisson and Helmholtz equations discretized with bilinear finite elements. Separability of the linear system requires that the discretization domain be rectangular and the discretization mesh orthogonal. A generalized graphics processing unit (GPU) implementation of the PSCR method is presented. The numerical results indicate up to 24-fold speedups compared to an equivalent CPU implementation that utilizes a single CPU core. The attained floating-point performance is analyzed using the roofline performance analysis model. The resulting models show that performance is mainly limited by the off-chip memory bandwidth and by the effectiveness of the tridiagonal solver used to solve the arising tridiagonal subproblems. The performance is accelerated using off-line autotuning techniques.

Peer reviewed.
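The roofline analysis mentioned in the description can be illustrated with a minimal sketch. It assumes the basic roofline model, in which attainable performance is the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. The function names and all hardware figures below are hypothetical and chosen for illustration only; they are not taken from the article.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Attainable GFLOP/s under the basic roofline model.

    peak_gflops    -- peak floating-point rate of the device (GFLOP/s)
    bandwidth_gbs  -- off-chip memory bandwidth (GB/s)
    flops_per_byte -- arithmetic intensity of the kernel (flop/byte)
    """
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def is_memory_bound(peak_gflops, bandwidth_gbs, flops_per_byte):
    """True when the bandwidth ceiling lies below the compute ceiling."""
    return bandwidth_gbs * flops_per_byte < peak_gflops


# Hypothetical device: 1000 GFLOP/s peak, 200 GB/s bandwidth (illustrative only).
PEAK, BANDWIDTH = 1000.0, 200.0

# A low-intensity kernel (0.5 flop/byte) is capped by memory bandwidth.
low = roofline_gflops(PEAK, BANDWIDTH, 0.5)

# A high-intensity kernel (10 flop/byte) is capped by the compute peak.
high = roofline_gflops(PEAK, BANDWIDTH, 10.0)
```

Under these assumed figures, the low-intensity kernel is limited to a fraction of the peak rate, which is the kind of bandwidth-bound behaviour the description attributes to the solver.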
year | journal | country | edition | language |
---|---|---|---|---|
2018-01-01 | Journal of Parallel and Distributed Computing | | | |