Search results for "GPUs"
showing 4 items of 4 documents
Accelerating metagenomic read classification on CUDA-enabled GPUs.
2016
Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…
First Experiences on an Accurate SPH Method on GPUs
2017
It is well known that the standard formulation of Smoothed Particle Hydrodynamics is usually poor when scattered data distributions are considered or when the approximation is made near the boundary. Moreover, the method is computationally demanding when a large number of data sites and evaluation points are employed. In this paper an enhanced version of the method is proposed, improving accuracy and efficiency by using an HPC environment. Our implementation exploits the processing power of GPUs for the basic computational kernel resolution. The performance gain demonstrates the method to be accurate and suitable for dealing with large sets of data.
Advanced numerical treatment of an accurate SPH method
2019
The summation of Gaussian kernel functions is an expensive operation frequently encountered in scientific simulation algorithms, and several methods have already been proposed to reduce its computational cost. In this work, the Improved Fast Gauss Transform (IFGT) [1] is applied to the Smoothed Particle Hydrodynamics (SPH) method [2] in order to improve its efficiency. A modified version of the SPH method is considered in order to overcome the loss of accuracy of the standard formulation [3]. A suitable use of the IFGT allows us to reduce the computational effort while tuning the desired accuracy into the SPH framework. This technique, coupled with an algorithmic design for exploit…
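For orientation, the expensive operation this abstract refers to can be sketched as a direct Gaussian kernel summation. This is only the brute-force baseline, not the IFGT and not the authors' code. The function name `gauss_sum`, the one-dimensional points, and the bandwidth parameter `h` are assumptions made for illustration.

```python
import math

def gauss_sum(sources, weights, targets, h):
    """Direct evaluation of G(y_j) = sum_i q_i * exp(-(y_j - x_i)^2 / h^2).

    Hypothetical helper: 1D points and a scalar bandwidth h are assumed.
    """
    result = []
    for y in targets:
        total = 0.0
        # Every target visits every source: N * M kernel evaluations.
        for x, q in zip(sources, weights):
            total += q * math.exp(-((y - x) ** 2) / h ** 2)
        result.append(total)
    return result

# One source of weight 2 at the origin, evaluated at two targets.
values = gauss_sum([0.0], [2.0], [0.0, 1.0], h=1.0)
```

The nested loop costs one kernel evaluation per source-target pair, which is the cost that fast Gauss transform methods aim to reduce.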
Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units
2021
We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors as well as graphics accelerators (GPUs) are built upon common principles, but differ in the implementation details, which are adapted to avoid thread divergence in the GPU case or maximize compression element-wise (i.e., for each matrix entry) for multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs as well as In…
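As background for this entry, a minimal sketch of the sparse matrix-vector product over the plain coordinate (COO) format follows. It shows only the baseline format that the abstract's variant builds on. The compression and load-balancing scheme is not reproduced, and the name `coo_spmv` and the example matrix are assumptions for illustration.

```python
def coo_spmv(rows, cols, vals, x, n_rows):
    """Compute y = A @ x for a matrix A stored in plain COO format.

    Hypothetical helper: rows[k], cols[k], vals[k] describe the k-th nonzero.
    """
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        # Each nonzero contributes independently to one output entry.
        y[r] += v * x[c]
    return y

# Example matrix (assumed for illustration):
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
rows = [0, 0, 1, 2, 2]
cols = [0, 2, 1, 0, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
y = coo_spmv(rows, cols, vals, [1.0, 2.0, 3.0], 3)
```

Because every nonzero is handled independently, the list of nonzeros can be split into equally sized chunks regardless of how many fall in each row, which is one reason COO is a natural starting point for balancing the workload.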