Search results for "Parallel computing"

showing 10 items of 189 documents

An Embedded, FPGA-based Computer Graphics Coprocessor with Native Geometric Algebra Support

2009

The representation of geometric objects and their transformation are the two key aspects of computer graphics applications. Traditionally, compute-intensive matrix calculations are involved in modeling and rendering three-dimensional (3D) scenery. Geometric algebra (also known as Clifford algebra) is attracting attention as a natural way to model geometric facts and as a powerful analytical tool for symbolic calculations. In this paper, the architecture of the Clifford coprocessor (CliffoSor) is introduced. CliffoSor is an embedded parallel coprocessing core that offers direct hardware support for Clifford algebra operators. A prototype implementation on a field-programmable gate array (FPGA) board is detailed…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Speedup; Coprocessor; Computer science; Clifford algebra; Parallel computing; Rendering (computer graphics); Computer graphics; Geometric algebra; Hardware and Architecture; ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION; Electrical and Electronic Engineering; Clifford algebra Computational geometry Embedded coprocessors Application-specific processor FPGA-based prototyping; Field-programmable gate array; Software; Euclidean vector

researchProduct
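
As a rough illustration of the kind of Clifford algebra operator the entry above refers to, the following is a minimal software sketch of the geometric product. It is not the CliffoSor design: it assumes a Euclidean metric (every basis vector squares to +1) and a bitmask encoding of basis blades (bit k set means e(k+1) is present), and the names `reorder_sign` and `geometric_product` are hypothetical.

```python
def reorder_sign(a: int, b: int) -> int:
    """Sign picked up when the basis vectors of blade a and blade b
    are merged into canonical (ascending) order."""
    a >>= 1
    swaps = 0
    while a:
        # Each set bit of b below a remaining bit of a costs one swap.
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps & 1 else 1


def geometric_product(x: dict, y: dict) -> dict:
    """Geometric product of two multivectors stored as {blade_bitmask: coefficient}.

    Assumption: Euclidean signature, so repeated basis vectors cancel to +1
    and the resulting blade is simply the XOR of the two bitmasks.
    """
    result = {}
    for blade_a, coef_a in x.items():
        for blade_b, coef_b in y.items():
            blade = blade_a ^ blade_b
            sign = reorder_sign(blade_a, blade_b)
            result[blade] = result.get(blade, 0.0) + sign * coef_a * coef_b
    # Drop terms that cancelled exactly.
    return {blade: coef for blade, coef in result.items() if coef != 0}


e1 = {0b01: 1.0}
e2 = {0b10: 1.0}
print(geometric_product(e1, e2))  # bivector e12
print(geometric_product(e2, e1))  # same blade, opposite sign
```

Every pair of input blades contributes independently to the result, which is one reason the operation lends itself to parallel hardware.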

A Dual-Core Coprocessor with Native 4D Clifford Algebra Support

2012

Geometric or Clifford Algebra (CA) is a powerful mathematical tool that is attracting growing attention in many research fields such as computer graphics, computer vision, robotics and medical imaging for its natural and intuitive way to represent geometric objects and their transformations. This paper introduces the architecture of CliffordCoreDuo, an embedded dual-core coprocessor that offers direct hardware support for four-dimensional (4D) Clifford algebra operations. A prototype implementation on an FPGA board is detailed. Experimental results show a 1.6× average speedup of CliffordCoreDuo in comparison with the baseline mono-core architecture. A potential cycle speedup of about 40× o…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Speedup; Coprocessor; Computer science; business.industry; Clifford algebra; Parallel computing; Computer graphics; Geometric algebra; Software; Clifford algebra embedded coprocessors multi-core architectures FPGA prototyping medical imaging; Field-programmable gate array; business; FPGA prototype; 2012 15th Euromicro Conference on Digital System Design

researchProduct

Accelerating Clifford Algebra Operations using GPUs and an OpenCL Code Generator

2015

Clifford Algebra (CA) is a powerful mathematical language that allows for a simple and intuitive representation of geometric objects and their transformations. It has important applications in many research fields, such as computer graphics, robotics, and machine vision. Direct hardware support of Clifford data types and operators is needed to accelerate applications based on Clifford Algebra. This paper proposes a mixed software-hardware system that exploits the computational power of Graphics Processing Units (GPUs) to accelerate Clifford operations. A code generator, namely OpenCLifford, is presented that automatically generates Java and C libraries for the direct support of Clifford ele…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Speedup; Hardware-software co-design; OpenCL; Computer science; Clifford algebra; Geometric Algebra; Parallel computing; Data type; Metaprogramming; Computer graphics; Clifford Algebra; Geometric algebra; ComputingMethodologies_SYMBOLICANDALGEBRAICMANIPULATION; Code generation; Central processing unit; Graphics; Graphics Processing Unit

researchProduct

An Evolution of the Non-Parameter Harris Affine Corner Detector: A Distributed Approach

2009

A parallel version of a new automatic Harris-based corner detector is presented. A scheduler that dynamically and homogeneously distributes a high computational workload on heterogeneous parallel architectures such as Grid systems has been implemented to speed up the whole procedure. Experimental results show the robustness of the underlying scheduler, which can be easily exploited in various automatic image analysis systems.

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Speedup; Settore INF/01 - Informatica; Computer science; Detector; Feature extraction; Yarn; Parallel computing; Edge detection; Grid Algorithm; Corner Detector; Scheduling (computing); Robustness (computer science); Adaptive Scheduling; visual_art; visual_art.visual_art_medium; Affine transformation; Client-server Paradigm; Computer Science::Operating Systems; 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies

researchProduct
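
The idea of dynamically distributing work across heterogeneous resources, as described in the entry above, can be sketched with a shared task queue from which each worker pulls its next task whenever it becomes idle. This is only a sketch under stated assumptions, not the paper's scheduler: threads stand in for Grid nodes, a per-worker sleep simulates differing node speeds, and `run_dynamic` and its parameters are hypothetical names.

```python
import queue
import threading
import time


def run_dynamic(tasks, worker_delays, work):
    """Process tasks with one thread per entry in worker_delays.

    Each worker takes the next pending task as soon as it is free, so
    faster workers naturally end up handling more tasks.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    counts = [0] * len(worker_delays)
    lock = threading.Lock()

    def worker(index, delay):
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # no work left
            time.sleep(delay)  # simulated per-task cost on this worker
            value = work(task)
            with lock:
                results[task] = value
                counts[index] += 1

    threads = [
        threading.Thread(target=worker, args=(i, d))
        for i, d in enumerate(worker_delays)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, counts


results, counts = run_dynamic(range(20), [0.001, 0.01], lambda t: t * t)
print(counts)  # tasks handled per worker; the split varies between runs
```

Pull-based distribution needs no advance knowledge of worker speed, at the price of one queue access per task.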

Embedded Coprocessors for Native Execution of Geometric Algebra Operations

2016

Clifford algebra or geometric algebra (GA) is a simple and intuitive way to model geometric objects and their transformations. Because GA operates in high-dimensional vector spaces with significant computational costs, its practical use requires dedicated software and/or hardware architectures to directly support Clifford data types and operators. In this paper, a family of embedded coprocessors for the native execution of GA operations is presented. The paper shows the evolution of the coprocessor family, focusing on the latest two architectures, which offer direct hardware support for up to five-dimensional Clifford operations. The proposed coprocessors exploit hardware-oriented representations o…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Theoretical computer science; Coprocessor; Inverse kinematics; business.industry; Applied Mathematics; Clifford algebra; Geometric algebra Embedded coprocessors Application-specific processors FPGA-based prototyping.; 02 engineering and technology; Parallel computing; Data type; 020202 computer hardware & architecture; Geometric algebra; Software; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Field-programmable gate array; business; Vector space; Mathematics; Advances in Applied Clifford Algebras

researchProduct

Faster GPU-Accelerated Smith-Waterman Algorithm with Alignment Backtracking for Short DNA Sequences

2014

In this paper, we present a GPU-accelerated Smith-Waterman (SW) algorithm with Alignment Backtracking, called GSWAB, for short DNA sequences. This algorithm performs all-to-all pairwise alignments and retrieves optimal local alignments on CUDA-enabled GPUs. To facilitate fast alignment backtracking, we have investigated a tile-based SW implementation using the CUDA programming model. This tiled computing pattern enables us to more deeply explore the powerful compute capability of GPUs. We have evaluated the performance of GSWAB on a Kepler-based GeForce GTX Titan graphics card. The results show that GSWAB can achieve a performance of up to 56.8 GCUPS on large-scale datasets. Furthermore, ou…

Smith–Waterman algorithm; CUDA; Titan (supercomputer); Speedup; Computer science; Backtracking; Parallel computing; Software_PROGRAMMINGTECHNIQUES; Graphics; DNA sequencing; ComputingMethodologies_COMPUTERGRAPHICS

researchProduct
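
For readers unfamiliar with the algorithm named in the entry above, here is a minimal sequential sketch of Smith-Waterman local alignment with backtracking. It is not GSWAB's tiled CUDA implementation: it assumes a linear gap penalty and illustrative scoring values, and the function name `smith_waterman` and its parameters are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned a, aligned b) for the best local alignment.

    Assumption: linear gap penalty; scoring values are illustrative only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; every cell is floored at 0 (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Backtrack from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + score:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("GGACGT", "ACGTCC"))
```

Backtracking needs the filled matrix (or enough of it to retrace the path), which is why retrieving alignments costs more memory than computing the score alone.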

SWAPHI: Smith-Waterman Protein Database Search on Xeon Phi Coprocessors

2014

The maximal sensitivity of the Smith-Waterman (SW) algorithm has enabled its wide use in biological sequence database search. Unfortunately, the high sensitivity comes at the expense of quadratic time complexity, which makes the algorithm computationally demanding for big databases. In this paper, we present SWAPHI, the first parallelized algorithm employing Xeon Phi coprocessors to accelerate SW protein database search. SWAPHI is designed based on the scale-and-vectorize approach, i.e. it boosts alignment speed by effectively utilizing both the coarse-grained parallelism from the many co-processing cores (scale) and the fine-grained parallelism from the 512-bit wide single instruction, mul…

Smith–Waterman algorithm; FOS: Computer and information sciences; Multi-core processor; Coprocessor; Speedup; Sequence database; Computer science; Parallel computing; Intrinsics; Computer Science - Distributed Parallel and Cluster Computing; Scalability; SIMD; Distributed Parallel and Cluster Computing (cs.DC); Xeon Phi

researchProduct

GSWABE: faster GPU-accelerated sequence alignment with optimal alignment retrieval for short DNA sequences

2014

In this paper, we present GSWABE, a graphics processing unit (GPU)-accelerated pairwise sequence alignment algorithm for a collection of short DNA sequences. This algorithm supports all-to-all pairwise global, semi-global and local alignment, and retrieves optimal alignments on Compute Unified Device Architecture (CUDA)-enabled GPUs. All three alignment types are based on dynamic programming and share almost the same computational pattern. Thus, we have investigated a general tile-based approach to facilitating fast alignment by deeply exploring the powerful compute capability of CUDA-enabled GPUs. The performance of GSWABE has been evaluated on a Kepler-based Tesla K40 GPU using a varie…

Smith–Waterman algorithm; Speedup; Computer Networks and Communications; Computer science; Sequence alignment; Needleman–Wunsch algorithm; Parallel computing; DNA sequencing; Computer Science Applications; Theoretical Computer Science; Dynamic programming; CUDA; Computational Theory and Mathematics; Software; Concurrency and Computation: Practice and Experience

researchProduct
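
The observation in the entry above, that global, semi-global and local alignment share almost the same dynamic-programming pattern, can be made concrete with a single scoring routine in which only the boundary initialization, the zero floor and the cell from which the final score is read differ. This is a sketch under stated assumptions, not GSWABE's method: it uses a linear gap penalty, treats semi-global as "end gaps are free in both sequences" (one common definition), and `align_score` is a hypothetical name.

```python
def align_score(a, b, mode="local", match=2, mismatch=-1, gap=-2):
    """Optimal alignment score for mode 'global', 'semi-global' or 'local'.

    Assumptions: linear gap penalty; semi-global means leading and trailing
    gaps are not penalized.
    """
    if mode not in ("global", "semi-global", "local"):
        raise ValueError(f"unknown mode: {mode}")
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]

    # Difference 1: only global alignment penalizes leading gaps.
    if mode == "global":
        for i in range(1, rows):
            H[i][0] = i * gap
        for j in range(1, cols):
            H[0][j] = j * gap

    # Shared recurrence over the whole matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            # Difference 2: only local alignment floors scores at zero.
            H[i][j] = max(cell, 0) if mode == "local" else cell

    # Difference 3: where the final score is read from.
    if mode == "global":
        return H[-1][-1]
    if mode == "semi-global":
        return max(max(H[-1]), max(row[-1] for row in H))
    return max(max(row) for row in H)


for mode in ("global", "semi-global", "local"):
    print(mode, align_score("TTACGT", "ACGTCC", mode))
```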

Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures

2015

In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neo-heterogeneous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. On the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. On the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We present two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Fur…

Smith–Waterman algorithm; Xeon; Computer science; Data parallelism; Hyper-threading; SIMD; Parallel computing; Central processing unit; ComputerSystemsOrganization_PROCESSORARCHITECTURES; Intrinsics; Xeon Phi; 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

researchProduct

Splitting the data cache: a survey

2000

Recent cache-memory research has focused on approaches that split the first-level data cache into two independent subcaches. The authors introduce a methodology for helping cache designers devise splitting schemes and survey a representative set of the published cache schemes.

Snoopy cache; Hardware_MEMORYSTRUCTURES; Database; Cache coloring; Computer science; General Engineering; Parallel computing; Cache pollution; computer.software_genre; Smart Cache; Cache invalidation; Page cache; Cache; computer; Cache algorithms; IEEE Concurrency

researchProduct