Search results for "OpenCL"
showing 8 items of 8 documents
Fourth Workshop on using Emerging Parallel Architectures
2012
AbstractThe Fourth Workshop on Using Emerging Parallel Architectures (WEPA), held in conjunction with ICCS 2012, provides a forum for exploring the capabilities of emerging parallel architectures such as GPUs, FPGAs, Cell B.E., Intel M.I.C. and multicores to accelerate computational science applications.
Accelerating Clifford Algebra Operations using GPUs and an OpenCL Code Generator
2015
Clifford Algebra (CA) is a powerful mathematical language that allows for a simple and intuitive representation of geometric objects and their transformations. It has important applications in many research fields, such as computer graphics, robotics, and machine vision. Direct hardware support of Clifford data types and operators is needed to accelerate applications based on Clifford Algebra. This paper proposes a mixed software-hardware system that exploits the computational power of Graphics Processing Units (GPUs) to accelerate Clifford operations. A code generator, namely OpenCLifford, is presented that automatically generates Java and C libraries for the direct support of Clifford ele…
Yleinen laskenta grafiikkasuorittimilla
2012
Esitellään nykyaikaisten grafiikkasuorittimien rakennetta, toimintaperiaatteita ja tutkitaan OpenCL:ää keinona käyttää niiden laskentakykyä yleisempään laskentaan. Toteutetaan osa JPEG-kuvanpakkausalgoritmia grafiikkasuorittimella OpenCL:n avulla.
Design exploration of aes accelerators on FPGAS and GPUs
2017
The embedded systems are increasingly becoming a key technological component of all kinds of complex tech-nical systems and an exhaustive analysis of the state of the art of all current performance with respect to architectures, design methodologies, test and applications could be very in-teresting. The Advanced Encryption Standard (AES), based on the well-known algorithm Rijndael, is designed to be easily implemented in hardware and software platforms. General purpose computing on graphics processing unit (GPGPU) is an alternative to recongurable accelerators based on FPGA devices. This paper presents a direct comparison between FPGA and GPU used as accelerators for the AES cipher. The res…
A GPU-accelerated augmented Lagrangian based L1-mean curvature Image denoising algorithm implementation
2015
This paper presents a graphics processing unit (GPU) implementation of a recently published augmented Lagrangian based L1-mean curvature image denoising algorithm. The algorithm uses a particular alternating direction method of multipliers to reduce the related saddle-point problem to an iterative sequence of four simpler minimization problems. Two of these subproblems do not contain the derivatives of the unknown variables and can therefore be solved point-wise without inter-process communication. Inparticular, this facilitates the efficient solution of the subproblem that deals with the non-convex term in the original objective function by modern GPUs. The two remaining subproblems are so…
On GPU-accelerated fast direct solvers and their applications in image denoising
2015
Fast Poisson solvers for graphics processing units
2013
Two block cyclic reduction linear system solvers are considered and implemented using the OpenCL framework. The topics of interest include a simplified scalar cyclic reduction tridiagonal system solver and the impact of increasing the radix-number of the algorithm. Both implementations are tested for the Poisson problem in two and three dimensions, using a Nvidia GTX 580 series GPU and double precision floating-point arithmetic. The numerical results indicate up to 6-fold speed increase in the case of the two-dimensional problems and up to 3- fold speed increase in the case of the three-dimensional problems when compared to equivalent CPU implementations run on a Intel Core i7 quad-core CPU…
Perfect Hashing Structures for Parallel Similarity Searches
2015
International audience; Seed-based heuristics have proved to be efficient for studying similarity between genetic databases with billions of base pairs. This paper focuses on algorithms and data structures for the filtering phase in seed-based heuristics, with an emphasis on efficient parallel GPU/manycores implementa- tion. We propose a 2-stage index structure which is based on neighborhood indexing and perfect hashing techniques. This structure performs a filtering phase over the neighborhood regions around the seeds in constant time and avoid as much as possible random memory accesses and branch divergences. Moreover, it fits particularly well on parallel SIMD processors, because it requ…