Search results for "cud"

showing 10 items of 74 documents

Massively Parallel ANS Decoding on GPUs

2019

In recent years, graphics processors have enabled significant advances in the fields of big data and streamed deep learning. In order to keep control of rapidly growing amounts of data and to achieve sufficient throughput rates, compression features are a key part of many applications including popular deep learning pipelines. However, as most of the respective APIs rely on CPU-based preprocessing for decoding, data decompression frequently becomes a bottleneck in accelerated compute systems. This establishes the need for efficient GPU-based solutions for decompression. Asymmetric numeral systems (ANS) represent a modern approach to entropy coding, combining superior compression results wit…

020203 distributed computing; Computer science; 020206 networking & telecommunications; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; CUDA; Scalability; 0202 electrical engineering electronic engineering information engineering; Codec; SIMD; Entropy encoding; Massively parallel; Decoding methods; Data compression

Proceedings of the 48th International Conference on Parallel Processing

researchProduct

Nvidia CUDA parallel processing of large FDTD meshes in a desktop computer

2020

The finite-difference time-domain (FDTD) numerical method is a well-known and mature technique in computational electrodynamics. FDTD is usually used to analyse electromagnetic structures and antennas. However, its computational burden remains high, which limits its use in combination with optimization algorithms. FDTD can be parallelized on a GPU using Matlab and CUDA tools. For instance, simulating a planar array with a three-dimensional FDTD mesh of 790x276x588 for 6200 time steps takes one day of elapsed time on the CPU of a personal computer with an Intel Core i3 at 2.4 GHz and 8 GB of RAM. This time is reduced 120 times when the calcula…
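The figures quoted in this abstract can be turned into a quick back-of-the-envelope calculation. The sketch below is only arithmetic on the stated numbers; it assumes "one day" means exactly 24 hours and that the 120-fold reduction applies to the whole run, neither of which the truncated abstract confirms. All variable names are illustrative.

```python
# Problem size as stated in the abstract.
NX, NY, NZ = 790, 276, 588
TIME_STEPS = 6200

# Assumption: "one day" of elapsed CPU time is taken as exactly 24 hours.
CPU_SECONDS = 24 * 60 * 60
# Reduction factor as stated in the abstract.
SPEEDUP = 120

cells = NX * NY * NZ                 # number of mesh cells
cell_updates = cells * TIME_STEPS    # cell visits over the whole run
gpu_seconds = CPU_SECONDS / SPEEDUP  # implied accelerated run time
gpu_minutes = gpu_seconds / 60

print(f"mesh cells:        {cells:,}")
print(f"cell updates:      {cell_updates:,}")
print(f"implied GPU time:  {gpu_minutes:.0f} minutes")
```

Under these assumptions the mesh holds about 128 million cells, and a one-day run shrinks to roughly 12 minutes.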

020203 distributed computing; Computer science; Finite-difference time-domain method; Graphics processing unit; 02 engineering and technology; Computational science; CUDA; Personal computer; 0202 electrical engineering electronic engineering information engineering; Computational electromagnetics; 020201 artificial intelligence & image processing; Central processing unit; Time domain; MATLAB; computer; computer.programming_language

Proceedings of the 10th Euro-American Conference on Telematics and Information Systems

researchProduct

Massively Parallel Huffman Decoding on GPUs

2018

Data compression is a fundamental building block in a wide range of applications. Besides its intended purpose of saving valuable storage on hard disks, compression can be used to increase the effective bandwidth to attached storage, as realized by state-of-the-art file systems. In the foreseeable future, on-the-fly compression and decompression will gain utmost importance for the processing of data-intensive applications such as streamed Deep Learning tasks or Next Generation Sequencing pipelines, which establishes the need for fast parallel implementations. Huffman coding is an integral part of a number of compression methods. However, efficient parallel implementation of Huffman decompre…

020203 distributed computing; Computer science; business.industry; Deep learning; 020206 networking & telecommunications; Data_CODINGANDINFORMATIONTHEORY; 02 engineering and technology; Parallel computing; Huffman coding; symbols.namesake; CUDA; Titan (supercomputer); 0202 electrical engineering electronic engineering information engineering; symbols; Artificial intelligence; business; Massively parallel; Data compression

Proceedings of the 47th International Conference on Parallel Processing

researchProduct

Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi

2016

Advanced SIMD features on GPUs and Xeon Phis promote efficient long pattern search. A tiled approach to accelerating the Wu-Manber algorithm on GPUs has been proposed. Both the GPU and Xeon Phi yield two orders-of-magnitude speedup over one CPU core. The GPU-based version with tiling runs up to 2.9× faster than the Xeon Phi version. Approximate pattern matching (APM) aims to find the occurrences of a pattern inside a subject text while allowing a limited number of errors. It has been widely used in many application areas such as bioinformatics and information retrieval. Bit-parallel APM takes advantage of the intrinsic parallelism of bitwise operations inside a machine word. This approach typica…
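To make the bit-parallel idea concrete, here is a minimal sequential sketch of the textbook Wu-Manber (Shift-And with k errors) recurrence. It is not the tiled GPU or Xeon Phi implementation from the paper, and the function name `wu_manber_search` is hypothetical. Python integers are arbitrary-precision, so the sketch hides the machine-word limit that matters on real hardware.

```python
def wu_manber_search(pattern, text, k):
    """Return end indices in `text` where `pattern` matches with <= k edits."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1      # keep only the m low bits
    accept = 1 << (m - 1)    # bit that signals a full-pattern match

    # masks[c] has bit i set when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # state[d]: bit i set <=> pattern[:i+1] matches a suffix of the text
    # read so far with at most d edits. Initially d pattern characters
    # can be "deleted" for free at error level d.
    state = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for pos, ch in enumerate(text):
        b = masks.get(ch, 0)
        prev_old = state[0]                      # level d-1 before this char
        state[0] = ((state[0] << 1) | 1) & b     # exact Shift-And step
        for d in range(1, k + 1):
            old = state[d]
            state[d] = (
                (((old << 1) | 1) & b)   # characters match
                | prev_old               # insertion: extra text character
                | (prev_old << 1)        # substitution
                | (state[d - 1] << 1)    # deletion: pattern character skipped
                | 1                      # a length-1 prefix always fits d >= 1
            ) & full
            prev_old = old
        if state[k] & accept:
            hits.append(pos)
    return hits


print(wu_manber_search("abc", "xxabcxx", 0))  # [4]
print(wu_manber_search("abc", "xxabdxx", 1))  # [3, 4]
```

Each text character costs a constant number of bitwise operations per error level, which is the intrinsic parallelism the abstract refers to: all pattern positions are updated at once inside one word.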

020203 distributed computing; Speedup; Coprocessor; Xeon; Computer Networks and Communications; Computer science; 02 engineering and technology; Parallel computing; Supercomputer; Computer Graphics and Computer-Aided Design; Theoretical Computer Science; CUDA; Artificial Intelligence; Hardware and Architecture; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; SIMD; Bitwise operation; Software; Word (computer architecture); Xeon Phi

Parallel Computing

researchProduct

High quality conservative surface mesh generation for swept volumes

2012

We present a novel, efficient and flexible scheme to generate a high quality mesh that approximates the outer boundary of a swept volume. Our approach comes with two guarantees. First, the approximation is conservative, i.e., the swept volume is enclosed by the generated mesh. Second, the one-sided Hausdorff distance of the generated mesh to the swept volume is upper bounded by a user defined tolerance. Exploiting this tolerance the algorithm generates a mesh that is adapted to the local complexity of the swept volume boundary, keeping the overall output complexity remarkably low. The algorithm is two-phased: the actual sweep and the mesh generation. In the sweeping phase we introduce a gen…
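The second guarantee can be illustrated on sampled points. The sketch below computes a discrete one-sided Hausdorff distance from a set of mesh sample points to a set of swept-volume sample points and compares it with a tolerance. Finite point samples are only a stand-in for the continuous surfaces in the paper, the conservativeness guarantee is not checked here, and the names `one_sided_hausdorff` and `within_tolerance` are hypothetical.

```python
import math


def one_sided_hausdorff(mesh_points, volume_points):
    """Largest distance from any mesh point to its nearest volume point."""
    return max(
        min(math.dist(m, v) for v in volume_points)
        for m in mesh_points
    )


def within_tolerance(mesh_points, volume_points, tolerance):
    """True when every mesh sample lies within `tolerance` of the volume samples."""
    return one_sided_hausdorff(mesh_points, volume_points) <= tolerance


# Toy 2D data (illustrative only).
mesh = [(0.0, 0.0), (3.0, 0.0)]
volume = [(0.0, 0.0), (1.0, 0.0)]

print(one_sided_hausdorff(mesh, volume))   # 2.0
print(one_sided_hausdorff(volume, mesh))   # 1.0 (the measure is not symmetric)
print(within_tolerance(mesh, volume, 2.5)) # True
```

The asymmetry in the toy data shows why the direction matters: the bound in the abstract is stated from the generated mesh to the swept volume.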

0209 industrial biotechnology; Computer science; Parallel algorithm; Boundary (topology); 020207 software engineering; 02 engineering and technology; Parallel computing; Computational science; CUDA; 020901 industrial engineering & automation; Mesh generation; 0202 electrical engineering electronic engineering information engineering; Ruppert's algorithm; ComputingMethodologies_COMPUTERGRAPHICS

2012 IEEE International Conference on Robotics and Automation

researchProduct

CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm

2015

Clustering of molecular systems according to their three-dimensional structure is an important step in many bioinformatics workflows. In applications such as docking or structure prediction, many algorithms initially generate large numbers of candidate poses (or decoys), which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates can easily range from thousands to millions, performing the clustering on standard central processing units (CPUs) is highly time consuming. In this paper, we analyse and evaluate different approaches to parallelize the nearest neighbour chain algorithm to perform hierarc…
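For orientation, here is a minimal sequential sketch of the nearest neighbour chain algorithm with Ward's criterion. It is not the CUDA implementation evaluated in the paper. It assumes Euclidean points with cluster centroids, whereas protein structures would normally be compared with a structural distance such as RMSD, and the function name `ward_nn_chain` is hypothetical.

```python
def ward_nn_chain(points):
    """Return merges as (cluster_a, cluster_b, ward_cost) tuples.

    Input points get ids 0..n-1; each merge creates the next free id.
    Merges are emitted in chain order, not sorted by cost.
    """
    # Active clusters: id -> (size, centroid).
    clusters = {i: (1, tuple(p)) for i, p in enumerate(points)}
    next_id = len(points)
    merges = []
    chain = []

    def ward(a, b):
        # Increase in within-cluster variance if a and b were merged.
        (na, ca), (nb, cb) = clusters[a], clusters[b]
        sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * sq

    while len(clusters) > 1:
        if not chain:
            chain.append(next(iter(clusters)))  # start from any active cluster
        top = chain[-1]
        prev = chain[-2] if len(chain) > 1 else None

        # Nearest neighbour of the chain tip; ties favour the previous
        # chain element so the chain cannot cycle.
        best, best_d = None, float("inf")
        for c in clusters:
            if c == top:
                continue
            d = ward(top, c)
            if d < best_d or (d == best_d and c == prev):
                best, best_d = c, d

        if best == prev:
            # Reciprocal nearest neighbours: merge them.
            chain.pop()
            chain.pop()
            na, ca = clusters.pop(top)
            nb, cb = clusters.pop(prev)
            n = na + nb
            centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
            clusters[next_id] = (n, centroid)
            merges.append((prev, top, best_d))
            next_id += 1
        else:
            chain.append(best)
    return merges


print(ward_nn_chain([(0.0,), (1.0,), (10.0,), (11.0,)]))
# [(0, 1, 0.5), (2, 3, 0.5), (4, 5, 100.0)]
```

Because Ward's criterion is reducible, the rest of the chain stays valid after a merge, so the search resumes from the remaining chain tip instead of restarting. The nearest-neighbour scan inside the loop is the natural candidate for parallel execution.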

0301 basic medicine; Speedup; Computer science; Correlation clustering; Parallel computing; Theoretical Computer Science; 03 medical and health sciences; CUDA; 030104 developmental biology; Hardware and Architecture; Cluster analysis; Algorithm; Software; Ward's method

The International Journal of High Performance Computing Applications

researchProduct

Accelerating metagenomic read classification on CUDA-enabled GPUs.

2016

Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…

0301 basic medicine; Theoretical computer science; Workstation; GPUs; Computer science; Context (language use); CUDA; Parallel computing; Biochemistry; Genome; law.invention; 03 medical and health sciences; User-Computer Interface; 0302 clinical medicine; Structural Biology; law; Taxonomic assignment; Humans; Microbiome; Molecular Biology; Internet; Xeon; Applied Mathematics; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Exact k-mer matching; Computer Science Applications; 030104 developmental biology; Titan (supercomputer); Metagenomics; 030220 oncology & carcinogenesis; DNA microarray; Software

BMC bioinformatics

researchProduct

GPU-Based Optimisation of 3D Sensor Placement Considering Redundancy, Range and Field of View

2020

This paper presents a novel and efficient solution for the 3D sensor placement problem based on GPU programming and massive parallelisation. Compared to prior art using gradient-search and mixed-integer-based approaches, the presented method returns optimal or good results in a fraction of the time. The method allows for redundancy, i.e. requiring selected sub-volumes to be covered by at least n sensors. The presented results are for 3D sensors whose visible volume is represented by cones, but the method can easily be extended to sensors with other range and field-of-view shapes, such as 2D cameras and lidars.

0303 health sciences; 030306 microbiology; Computer science; Volume (computing); 020207 software engineering; Field of view; 02 engineering and technology; 3d sensor; 03 medical and health sciences; Range (mathematics); CUDA; Computer engineering; 0202 electrical engineering electronic engineering information engineering; Redundancy (engineering); Fraction (mathematics); General-purpose computing on graphics processing units

2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA)

researchProduct

Contribution to the knowledge of the mosquitoes in the Devesa of Racó de l'Olla, Albufera Natural Park of Valencia (Spain)

2017

We present a compilation of the results of several projects, carried out between 2004 and 2015, on the presence of mosquitoes of the family Culicidae in the surroundings of the Devesa and the Racó de l'Olla in the Albufera Natural Park of Valencia (Spain). A total of 10 species belonging to five genera (Aedes, Coquillettidia, Culex, Culiseta and Ochlerotatus) were recorded, some of which are highly characteristic of the environments where they were collected. Various aspects of culicid diversity are discussed, as well as the ecological and public-health relevance of their presence. We highlight the existence of the species Aedes albopictus (Skuse, 1894) in a…

Aedes; Devesa; Aedes albopictus; biology; Culex; España; Racó de l'Olla; biology.organism_classification; Coquillettidia; Mosquitoes; Industrial and Manufacturing Engineering; Zancudos; Albufera; Geography; ECOLOGIA; Spain; Natural park; Mosquitos; Valencia; Ochlerotatus; Culiseta; Humanities; TECNOLOGIA DEL MEDIO AMBIENTE; Family Culicidae

researchProduct

El proyecto mapa escolar de Valencia: Análisis de la zonificación educativa de la ciudad de Valencia

2018

The research project Mapa Escolar de Valencia (School Map of Valencia) was born out of an agreement between the City Council and the University of Valencia to investigate the compulsory education system of the city and to propose, if necessary, modifications to the current school zoning. The project is structured in several research areas. An analysis of the specialized scientific literature and public policies concerning education and schooling has been carried out, and the project is currently analysing the evolution of quantitative and qualitative data on compulsory schooling in Valencia, its school zoning, the representations of education and the school climate in the city schoo…

zonificación escolar; zoning; school zoning; Mapa escolar; segregación escolar; school social segregation; escolarización; Schooling; equidad educativa; education equity; distrito único; Valencia; UNESCO::SOCIOLOGÍA

de Madaria Escudero; Muñoz Rodríguez; Vila Lladosa; Gabaldón Estevan; Requena i Mora; Rodríguez Victoriano; García De Fez; Sandra; José Manuel; Daniel; Luis; Marina; David; Borja

Arxius de sociologia, 1137-7038, 2018, 39, 129-142

researchProduct