Search results for "GPU"

showing 10 items of 43 documents

GSaaS: A Service to Cloudify and Schedule GPUs

2018

Cloud technology is an attractive infrastructure solution that provides customers with almost unlimited on-demand computational capacity using a pay-per-use approach, and allows data centers to increase their energy and economic savings by adopting a virtualized resource sharing model. However, resources such as graphics processing units (GPUs) have not been fully adapted to this model. Although general-purpose computing on graphics processing units (GPGPU) is becoming increasingly popular, cloud providers lack the flexibility to manage accelerators because of the widespread use of peripheral component interconnect (PCI) passthrough techniques to attach GPUs to virtual machines (VMs). F…

0301 basic medicine; Schedule; General Computer Science; Computer science; Distributed computing; networking; Cloud computing; 02 engineering and technology; computer.software_genre; 03 medical and health sciences; GPU resource management; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Cloud computing; General Materials Science; Resource management; platform virtualization; business.industry; cloud computing; General Engineering; Virtualization; Shared resource; 030104 developmental biology; Virtual machine; Scalability; GPU cloudification; lcsh:Electrical engineering. Electronics. Nuclear engineering; General-purpose computing on graphics processing units; business; computer; lcsh:TK1-9971; IEEE Access

researchProduct

Accelerating metagenomic read classification on CUDA-enabled GPUs.

2016

Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…

0301 basic medicine; Theoretical computer science; Workstation; GPUs; Computer science; Context (language use); CUDA; Parallel computing; Biochemistry; Genome; law.invention; 03 medical and health sciences; CUDA; User-Computer Interface; 0302 clinical medicine; Structural Biology; law; Taxonomic assignment; Humans; Microbiome; Molecular Biology; Internet; Xeon; Applied Mathematics; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Exact k-mer matching; Computer Science Applications; 030104 developmental biology; Titan (supercomputer); Metagenomics; 030220 oncology & carcinogenesis; Metagenomics; DNA microarray; Software; BMC bioinformatics

researchProduct

Design exploration of AES accelerators on FPGAs and GPUs

2017

Embedded systems are increasingly becoming a key technological component of all kinds of complex technical systems, and an exhaustive analysis of the state of the art of current performance with respect to architectures, design methodologies, test, and applications could be very interesting. The Advanced Encryption Standard (AES), based on the well-known Rijndael algorithm, is designed to be easily implemented in hardware and software platforms. General-purpose computing on graphics processing units (GPGPU) is an alternative to reconfigurable accelerators based on FPGA devices. This paper presents a direct comparison between FPGAs and GPUs used as accelerators for the AES cipher. The res…

AES; OpenCL; GPGPU; Accelerator; FPGA prototyping

researchProduct

Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems

2012

The block Wiedemann (BW) algorithm is frequently used to solve sparse linear systems over GF(2). Iterative sparse matrix–vector multiplication is the most time-consuming operation. The necessity to accelerate this step is motivated by the application of BW to very large matrices used in the linear algebra step of the number field sieve (NFS) for integer factorization. In this paper, we derive an efficient CUDA implementation of this operation by using a newly designed hybrid sparse matrix format. This leads to speedups between 4 and 8 on a single graphics processing unit (GPU) for a number of tested NFS matrices compared with an optimized multicore implementation. We further present…

Block Wiedemann algorithm; Computer Networks and Communications; Computer science; Graphics processing unit; Sparse matrix-vector multiplication; GPU cluster; Parallel computing; GF(2); Computer Science Applications; Theoretical Computer Science; General number field sieve; Matrix (mathematics); Computational Theory and Mathematics; Factorization; Linear algebra; Multiplication; Computer Science::Operating Systems; Software; Integer factorization; Sparse matrix; Concurrency and Computation: Practice and Experience

researchProduct

3D Sensor-Based Obstacle Detection Comparing Octrees and Point clouds Using CUDA

2012

This paper presents adaptable methods for achieving fast collision detection using the GPU and Nvidia CUDA together with octrees. Earlier related work has focused on serial methods, while this paper presents a parallel solution which shows that there is a great increase in time if the number of operations is large. Two different models of the environment and the industrial robot are presented: the first uses octrees at different resolutions, and the second is a point cloud representation. The relative merits of the two different world model representations are shown. In particular, the experimental results show the potential of adapting the resolution of the robot and environment models to the t…

Collision Detection; GPU; Industrial Robot; lcsh:Electronic computers. Computer science; lcsh:QA75.5-76.95; Modeling, Identification and Control

researchProduct

Three-dimensional Fuzzy Kernel Regression framework for registration of medical volume data

2013

In this work a general framework for non-rigid 3D medical image registration is presented. It relies on two pattern recognition techniques: kernel regression and fuzzy c-means clustering. The paper provides a theoretical explanation, details the framework, and illustrates its application to implement three registration algorithms for CT/MR volumes as well as single 2D slices. The first two algorithms are landmark-based approaches, while the third one is an area-based technique. The last approach is based on iterative hierarchical volume subdivision and maximization of mutual information. Moreover, a high-performance Nvidia CUDA-based implementation of the algorithm is presented. The f…

Computer science; business.industry; Image registration; Mutual information; Machine learning; computer.software_genre; Fuzzy logic; CUDA; Non-rigid registration Fuzzy regression Mutual information Interpolation GPU computing; Artificial Intelligence; Signal Processing; Pattern recognition (psychology); Kernel regression; Computer Vision and Pattern Recognition; Artificial intelligence; Data mining; General-purpose computing on graphics processing units; Cluster analysis; business; computer; Software; Interpolation; Pattern Recognition

researchProduct

A Parallel Approach to HRTF Approximation and Interpolation Based on a Parametric Filter Model

2017

Spatial audio-rendering techniques using head-related transfer functions (HRTFs) are currently used in many different contexts such as immersive teleconferencing systems, gaming, or 3-D audio reproduction. Since all these applications usually involve real-time constraints, efficient processing structures for HRTF modeling and interpolation are necessary for providing real-time binaural audio solutions. This letter presents a parametric parallel model that allows us to perform HRTF filtering and interpolation efficiently from an input HRTF dataset. The resulting model, which is an adaptation from a recently proposed modeling technique, not only reduces the size of HRTF datasets signific…

Computer science; parallel filters; 02 engineering and technology; Solid modeling; binaural synthesis; Transfer function; TECNOLOGIA ELECTRONICA; 030507 speech-language pathology & audiology; 03 medical and health sciences; graphic processing unit (GPU); 0202 electrical engineering electronic engineering information engineering; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; head-related transfer function (HRTF) modeling; Computer vision; Electrical and Electronic Engineering; Adaptation (computer science); Parametric statistics; business.industry; Applied Mathematics; Teleconference; Binaural synthesis; 020206 networking & telecommunications; Filter (signal processing); interpolation; Interpolation; Graphic processing unit (GPU); Signal Processing; Head-related transfer function (HRTF) modeling; Parallel filters; Artificial intelligence; 0305 other medical science; business; Algorithm; Interpolation

researchProduct

Cryptocurrency mining, its solutions, and implementation at home

2019

The aim of this bachelor's thesis is to deepen knowledge of cryptocurrency mining: its types, associated costs, the participants involved, its environmental impact, and opinions about it. To identify the most efficient mining options, the author set up cryptocurrency mining on a laptop and studied the data obtained. The thesis examines cryptocurrency mining with graphics cards, central processors, ASICs, and FPGA devices. It assesses their advantages and disadvantages, which of them are used in practice and which no longer are, and how mining is classified by the number of participants involved. It lists the main expenses to be reckoned with when considering taking up mining. As a result of the work, an evaluation was made of…

Computer science; ASIC; cryptocurrency; GPU; cryptocurrency mining; CPU

researchProduct

Resource sharing methods in infrastructure clouds

2017

Although infrastructure clouds are largely based on virtualizing resources, sharing them, and allocating them to the customer, there are many resources whose use in clouds is not yet fully popularized, clear, or developed. These include graphics processors and other PCI Express devices, as well as USB devices. The aim of this bachelor's thesis is to identify methods for sharing these resources; to assess their usability, availability, and complexity; to compare them with one another where possible; and to describe those resource sharing methods that are still only in development. As a result of the work, a deeper understanding of infrastructure clouds and of the sharing of various computer resources has been gained, and the work has studied and …

Computer science; GPU; resource sharing; infrastructure cloud; virtualization; PCI Express devices

researchProduct

Real-time 3D graphics rendering with the Vulkan API

2017

The aim of this master's thesis is to study the various options for rendering 3D graphics on modern computers, focusing on the Vulkan application programming interface (hereafter API) and examining its strengths and weaknesses as well as its differences from the alternatives. Until now, the market has been dominated by two APIs for using a graphics card's graphics processor (hereafter GPU) to render graphics: OpenGL and DirectX. Because the detail and complexity of 3D graphics keep growing, new solutions are needed that match modern computer architecture and the diversity of devices more closely and that give developers greater control over all the computations taking place. Two 3D graphics engines were developed as part of the thesis. One of them …

Computer science; Vulkan API; OpenGL; 3D graphics; GPU computing; Rendering

researchProduct