Search results for "Parallel"

showing 10 items of 667 documents

Big Data in metagenomics: Apache Spark vs MPI.

2020

The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data technologies to process this large amount of information in distributed memory clusters of commodity hardware. Several approaches based on solutions such as Apache Hadoop or Apache Spark have been proposed. These solutions allow developers to focus on the problem and ignore low-level details such as data distribution schemes or communication patterns among processing nodes. However, performance and scalability are also of high importance when…

Big Data; Computer and Information Sciences; Science; Big data; Message Passing Interface; Parallel computing; Research and Analysis Methods; Computing Methodologies; Computing Methodologies; Computer Architecture; Computer Software; Database and Informatics Methods; Software; Spark (mathematics); Genetics; Mammalian Genomics; Multidisciplinary; business.industry; Applied Mathematics; Simulation and Modeling; Q; R; Biology and Life Sciences; Computational Biology; Software Engineering; Genomics; DNA; Genomic Databases; Genome Analysis; Computer Hardware; Supercomputer; Biological Databases; Animal Genomics; Physical Sciences; Scalability; Engineering and Technology; Metagenome; Medicine; Distributed memory; Metagenomics; business; Mathematics; Algorithms; Genome Bacterial; Software; Research Article; PLoS ONE

researchProduct

Bayesian hierarchical models for analysing the spatial distribution of bioclimatic indices

2017

A methodological approach for modelling the spatial distribution of bioclimatic indices is proposed in this paper. The value of the bioclimatic index is modelled with a hierarchical Bayesian model that incorporates both structured and unstructured random effects. Selection of prior distributions is also discussed in order to better incorporate any possible prior knowledge about the parameters that could refer to the particular characteristics of bioclimatic indices. MCMC methods and distributed programming are used to obtain an approximation of the posterior distribution of the parameters and also the posterior predictive distribution of the indices. One main outcome of the proposal is the …

Bioclimatology; 62 Statistics::62M Inference from stochastic processes [AMS Classification]; Bioclimatology; Bioclimatology geostatistics parallel computation spatial prediction; 62 Statistics::62P Applications [AMS Classification]; 62F15 62M30 62P10 62P12 86A32; Bayesian statistics; Mathematics and statistics::Mathematical statistics [UPC subject areas]; spatial prediction; 62 Statistics::62F Parametric inference [AMS Classification]; geostatistics; 86 Geophysics [AMS Classification]; parallel computation

researchProduct

Iterative sparse matrix-vector multiplication for accelerating the block Wiedemann algorithm over GF(2) on multi-graphics processing unit systems

2012

SUMMARY The block Wiedemann (BW) algorithm is frequently used to solve sparse linear systems over GF(2). Iterative sparse matrix–vector multiplication is the most time-consuming operation. The necessity to accelerate this step is motivated by the application of BW to very large matrices used in the linear algebra step of the number field sieve (NFS) for integer factorization. In this paper, we derive an efficient CUDA implementation of this operation by using a newly designed hybrid sparse matrix format. This leads to speedups between 4 and 8 on a single graphics processing unit (GPU) for a number of tested NFS matrices compared with an optimized multicore implementation. We further present…

Block Wiedemann algorithm; Computer Networks and Communications; Computer science; Graphics processing unit; Sparse matrix-vector multiplication; GPU cluster; Parallel computing; GF(2); Computer Science Applications; Theoretical Computer Science; General number field sieve; Matrix (mathematics); Computational Theory and Mathematics; Factorization; Linear algebra; Multiplication; Computer Science::Operating Systems; Software; Integer factorization; Sparse matrix; Concurrency and Computation: Practice and Experience

researchProduct

Sequential Intensification of Metformin Treatment in Type 2 Diabetes With Liraglutide Followed by Randomized Addition of Basal Insulin Prompted by A1…

2012

OBJECTIVE We evaluated the addition of liraglutide to metformin in type 2 diabetes followed by intensification with basal insulin (detemir) if glycated hemoglobin (A1C) ≥7%. RESEARCH DESIGN AND METHODS In 988 participants from North America and Europe uncontrolled on metformin ± sulfonylurea, sulfonylurea was discontinued and liraglutide 1.8 mg/day added for 12 weeks (run-in). Subsequently, those with A1C ≥7% were randomized 1:1 to 26 weeks’ open-label addition of insulin detemir to metformin + liraglutide (n = 162) or continuation without insulin detemir (n = 161). Patients achieving A1C <7% continued unchanged treatment (observational arm). The primary end point was A1C change bet…

Blood Glucose; Male; EXENATIDE; endocrine system diseases; diabetes liraglutide metformin hypoglycemia; Endocrinology Diabetes and Metabolism; medicine.medical_treatment; Type 2 diabetes; THERAPY; Gastroenterology; MELLITUS; Insulin Detemir; Glucagon-Like Peptide 1; GLYCEMIC CONTROL; Original Research; Insulin detemir; Aged 80 and over; Clinical Care/Education/Nutrition/Psychosocial Research; TREATED PATIENTS; Middle Aged; Metformin; Metformin; NPH INSULIN; Insulin Long-Acting; Female; Life Sciences & Biomedicine; hormones hormone substitutes and hormone antagonists; medicine.drug; Adult; medicine.medical_specialty; PARALLEL-GROUP; Adolescent; medicine.drug_class; Hypoglycemia; Endocrinology & Metabolism; Diabetes mellitus; Internal medicine; Internal Medicine; medicine; Humans; Hypoglycemic Agents; COMBINATION; Aged; Glycated Hemoglobin; Advanced and Specialized Nursing; Science & Technology; Liraglutide; business.industry; Insulin; 26-WEEK; nutritional and metabolic diseases; Liraglutide; EFFICACY; medicine.disease; Sulfonylurea; Endocrinology; Diabetes Mellitus Type 2; business

researchProduct

A Fast GPU-Based Motion Estimation Algorithm for H.264/AVC

2012

H.264/AVC is the most recent predictive video compression standard to outperform other existing video coding standards by means of higher computational complexity. In recent years, heterogeneous computing has emerged as a cost-efficient solution for high-performance computing. In the literature, several algorithms have been proposed to accelerate video compression, but so far there have not been many solutions that deal with video codecs using heterogeneous systems. This paper proposes an algorithm to perform H.264/AVC inter prediction. The proposed algorithm performs the motion estimation, both with full-pixel and sub-pixel accuracy, using CUDA to assist the CPU, obtaining remarkable time …

CUDA; Computational complexity theory; Computer science; Motion estimation; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Codec; Symmetric multiprocessor system; Image processing; Data_CODINGANDINFORMATIONTHEORY; Central processing unit; Parallel computing; Data compression

researchProduct

CRiSPy-CUDA: Computing Species Richness in 16S rRNA Pyrosequencing Datasets with CUDA

2011

Pyrosequencing technologies are frequently used for sequencing the 16S rRNA marker gene for metagenomic studies of microbial communities. Computing a pairwise genetic distance matrix from the produced reads is an important but highly time consuming task. In this paper, we present a parallelized tool (called CRiSPy) for scalable pairwise genetic distance matrix computation and clustering that is based on the processing pipeline of the popular ESPRIT software package. To achieve high computational efficiency, we have designed massively parallel CUDA algorithms for pairwise k-mer distance and pairwise genetic distance computation. We have also implemented a memory-efficient sparse matrix clust…

CUDA; Distance matrix; Computer science; Metagenomics; Pipeline (computing); Pairwise comparison; Parallel computing; Cluster analysis; Quantitative Biology::Genomics; Massively parallel; Sparse matrix

researchProduct

COMPARISON OF CPML IMPLEMENTATIONS FOR THE GPU-ACCELERATED FDTD SOLVER

2011

Three distinctively different implementations of convolutional perfectly matched layer for the FDTD method on CUDA enabled graphics processing units are presented. All implementations store additional variables only inside the convolutional perfectly matched layers, and the computational speeds scale according to the thickness of these layers. The merits of the different approaches are discussed, and a comparison of computational performance is made using complex real-life benchmarks.

CUDA; Perfectly matched layer; Scale (ratio); Computer science; Finite-difference time-domain method; Parallel computing; Graphics; Solver; Condensed Matter Physics; Implementation; Electronic Optical and Magnetic Materials; Computational science; Progress In Electromagnetics Research M

researchProduct

CUSHAW Suite: Parallel and Efficient Algorithms for NGS Read Alignment

2017

Next generation sequencing (NGS) technologies have enabled cheap, large-scale, and high-throughput production of short DNA sequence reads and thereby have promoted the explosive growth of data volume. Unfortunately, the produced reads are short and prone to contain errors that are incurred during sequencing cycles. Both large data volume and sequencing errors have complicated the mapping of NGS reads onto the reference genome and have motivated the development of various aligners for very short reads, typically less than 100 base pairs (bps) in length. As read length continues to increase, propelled by advances in NGS technologies, these longer reads tend to have higher sequencing error rat…

CUDA; Software suite; Computer science; Suite; Volume (computing); Human genome; Parallel computing; Bioinformatics; Genome; DNA sequencing; Reference genome

researchProduct

Parallelized Clustering of Protein Structures on CUDA-Enabled GPUs

2014

Estimation of the pose in which two given molecules might bind together to form a potential complex is a crucial task in structural biology. To solve this so-called "docking problem", most algorithms initially generate large numbers of candidate poses (or decoys) which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates ranges from thousands to millions, performing the clustering on standard CPUs is highly time consuming. In this paper we analyze and evaluate different approaches to parallelize the nearest neighbor chain algorithm to perform hierarchical Ward clustering of protein structures usin…

CUDA; Speedup; Computer science; Nearest-neighbor chain algorithm; Parallel computing; Cluster analysis; Root-mean-square deviation; Pose; Ward's method; Hierarchical clustering; 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing

researchProduct

Computational Methods for Gene Expression Profiling Using Next-Generation Sequencing (RNA-Seq)

2014

Cancer genome sequencing; Massive parallel sequencing; Single cell sequencing; Computational biology; Biology; Bioinformatics; Deep sequencing; Exome sequencing; DNA sequencing; Illumina dye sequencing; Massively parallel signature sequencing

researchProduct