Search results for "Parallel"

showing 10 items of 667 documents

Global emergence of the widespread Pseudomonas aeruginosa ST235 clone

2018

Objectives: Despite the non-clonal epidemic population structure of Pseudomonas aeruginosa, several multi-locus sequence types are distributed worldwide and are frequently associated with epidemics where multidrug resistance confounds treatment. ST235 is the most prevalent of these widespread clones. In this study we aimed to understand the origin of ST235 and the molecular basis for its success. Methods: The genomes of 79 P. aeruginosa ST235 isolates collected worldwide over a 27-year period were examined. A phylogenetic network was built, using a Bayesian approach to find the Most Recent Common Ancestor, and we identified antibiotic resistance determinants and ST235-specific genes…

0301 basic medicine; Most recent common ancestor; Clone (cell biology); [ SDV.MP.BAC ] Life Sciences [q-bio]/Microbiology and Parasitology/Bacteriology; medicine.disease_cause; Global Health; Genome; [ SDV.MP ] Life Sciences [q-bio]/Microbiology and Parasitology; Prevalence; Cluster Analysis; [ SDV.BIBS ] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; High-risk clones; Phylogeny; ComputingMilieux_MISCELLANEOUS; Molecular Epidemiology; General Medicine; 3. Good health; Infectious Diseases; [SDV.MP]Life Sciences [q-bio]/Microbiology and Parasitology; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; [ SDV.BBM.GTP ] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Pseudomonas aeruginosa; Efflux; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Fluoroquinolones; Microbiology (medical); Genotype; 030106 microbiology; Epidemic; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; Bacterial resistance; Microbiology; [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; Evolution Molecular; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Antibiotic resistance; Drug Resistance Bacterial; medicine; Pseudomonas Infections; Gene; Pseudomonas aeruginosa; Pathogen; International clones; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Multiple drug resistance; Genes Bacterial; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Multilocus Sequence Typing

A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.

2018

In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clust…

0301 basic medicine; Nematoda; 01 natural sciences; Gaussian Mixture Model; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]; [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; ComputingMilieux_MISCELLANEOUS; computer.programming_language; [STAT.AP]Statistics [stat]/Applications [stat.AP]; Phylogenetic tree; DNA Clustering; Genomics; Helminth Proteins; Computer Science Applications; [STAT]Statistics [stat]; 010201 computation theory & mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; Data analysis; Embedding; A priori and a posteriori; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Health Informatics; 0102 computer and information sciences; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Laplacian Eigenmaps; Animals; Cluster analysis; [SDV.GEN]Life Sciences [q-bio]/Genetics; Models Genetic; business.industry; Pattern recognition; NADH Dehydrogenase; Sequence Analysis DNA; Python (programming language); Mixture model; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Visualization; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Platyhelminths; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Programming Languages; Artificial intelligence; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; business; computer; Computers in biology and medicine

2016

The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown to obtain high quality mappings at the expense of long execution times. Finding fast yet accurate software solutions is of high importance to research, since availability and size of NGS datasets continue to increase. In this work we present an efficient parallelization approach for NGS short-read alignment on multi-core clusters. Our approach takes advantage of a distributed shared memory programming model based on the new UPC++ language. Experimental results using the CUSHAW3 alig…

0301 basic medicine; Physics; 020203 distributed computing; Multi-core processor; Distributed shared memory; Multidisciplinary; Source code; media_common.quotation_subject; Node (networking); 02 engineering and technology; Dynamic priority scheduling; Parallel computing; Bioinformatics; 03 medical and health sciences; 030104 developmental biology; Scalability; 0202 electrical engineering electronic engineering information engineering; Programming paradigm; Partitioned global address space; media_common; PLOS ONE

parSRA: A framework for the parallel execution of short read aligners on compute clusters

2018

The growth of next generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework to accelerate the execution of existing short read aligners on distributed-memory systems. parSRA can be used to parallelize a variety of short read alignment tools installed in the system without any modification to their source code. We show that our framework provides good scalability on a compute cluster for accelerating the popular BWA-MEM and Bowtie2 aligners. On average, it is able to accelerate sequence alignments on 16 64-core nodes (in total, 1024 cores) with speedup of 10.48 …

0301 basic medicine; Source code; Speedup; General Computer Science; Computer science; media_common.quotation_subject; Parallel computing; Supercomputer; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; 030220 oncology & carcinogenesis; Modeling and Simulation; Computer cluster; Scalability; Fuse (electrical); Node (circuits); Partitioned global address space; media_common; Journal of Computational Science

CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm

2015

Clustering of molecular systems according to their three-dimensional structure is an important step in many bioinformatics workflows. In applications such as docking or structure prediction, many algorithms initially generate large numbers of candidate poses (or decoys), which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates can easily range from thousands to millions, performing the clustering on standard central processing units (CPUs) is highly time consuming. In this paper, we analyse and evaluate different approaches to parallelize the nearest neighbour chain algorithm to perform hierarc…

0301 basic medicine; Speedup; Computer science; Correlation clustering; Parallel computing; Theoretical Computer Science; 03 medical and health sciences; CUDA; 030104 developmental biology; Hardware and Architecture; Cluster analysis; Algorithm; Software; Ward's method; The International Journal of High Performance Computing Applications

ParDRe: faster parallel duplicated reads removal tool for sequencing studies

2016

This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record is available online at: https://doi.org/10.1093/bioinformatics/btw038. Summary: Current next generation sequencing technologies often generate duplicated or near-duplicated reads that (depending on the application scenario) do not provide any interesting biological information but can increase memory requirements and computational time of downstream analysis. In this work we present ParDRe, a de novo parallel tool to remove duplicated and near-duplicated reads through the clustering of S…

0301 basic medicine; Statistics and Probability; FASTQ format; DNA strings; Source code; Downstream (software development); Computer science; media_common.quotation_subject; Parallel computing; computer.software_genre; Biochemistry; DNA sequencing; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; Hybrid MPI/multithreading; Cluster Analysis; ParDRe; Molecular Biology; Gene; media_common; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Parallel tool; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; chemistry; Data mining; computer; Algorithms; 030217 neurology & neurosurgery; DNA; Bioinformatics

panISa: ab initio detection of insertion sequences in bacterial genomes from short read sequence data.

2018

Motivation: The advent of next-generation sequencing has boosted the analysis of bacterial genome evolution. Insertion sequence (IS) elements play a key role in prokaryotic genome organization and evolution, but their repetitions in genomes complicate their detection from short-read data. Results: PanISa is a software pipeline that identifies IS insertions ab initio in bacterial genomes from short-read data. It is a highly sensitive and precise tool based on the detection of read-mapping patterns at the insertion site. PanISa performs better than existing IS detection systems as it is based on a database-free approach. We applied it to a high-risk clone lineage of the pathogenic spec…

0301 basic medicine; Statistics and Probability; Lineage (genetic); Computer science; Ab initio; Computational biology; Bacterial genome size; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biochemistry; Genome; [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Insertion sequence; Molecular Biology; Genomic organization; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [SDV.MP.BAC]Life Sciences [q-bio]/Microbiology and Parasitology/Bacteriology; Pipeline (software); [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; DNA Transposable Elements; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Genome Bacterial; Software; Bioinformatics (Oxford, England)

MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

2016

This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record (Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, https://doi.org/10.1093/bioinformatics/btw558) is available online at: https://doi.org/10.1093/bioinformatics/btw558. Abstract: MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-sca…

0301 basic medicine; Statistics and Probability; Source code; Computer science; media_common.quotation_subject; 02 engineering and technology; Parallel computing; computer.software_genre; Biochemistry; Execution time; 03 medical and health sciences; 0202 electrical engineering electronic engineering information engineering; Cluster (physics); Point (geometry); Amino Acid Sequence; Molecular Biology; media_common; Sequence; Multiple sequence alignment; Protein multiple sequence; Computational Biology; Proteins; Markov Chains; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Distributed memory systems; MSAProbs; 020201 artificial intelligence & image processing; MPI; Data mining; Sequence Alignment; computer; Algorithms; Software

Simulation-based estimation of branching models for LTR retrotransposons

2017

Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model allows us to take into account both the positions and the degradation level of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented in order to simulate their spread and visualization tools are proposed. Based on these simulation tools, we have developed a first method to evaluate the parameters of this propagation …

0301 basic medicine; Statistics and Probability; Source code; Theoretical computer science; Retroelements; media_common.quotation_subject; Retrotransposon; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; Biochemistry; Genome; Chromosomes; Branching (linguistics); [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Software; Animals; Computer Simulation; Molecular Biology; ComputingMilieux_MISCELLANEOUS; media_common; computer.programming_language; Genetics; Genome; Models Genetic; business.industry; [SDV.BID.EVO]Life Sciences [q-bio]/Biodiversity/Populations and Evolution [q-bio.PE]; Python (programming language); [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Computer Science Applications; Visualization; Computational Mathematics; 030104 developmental biology; Drosophila melanogaster; Computational Theory and Mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; Programming Languages; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Mobile genetic elements; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; business; computer; Software

Parallel and Space-Efficient Construction of Burrows-Wheeler Transform and Suffix Array for Big Genome Data

2016

Next-generation sequencing technologies have led to the sequencing of more and more genomes, propelling related research into the era of big data. In this paper, we present ParaBWT, a parallelized Burrows-Wheeler transform (BWT) and suffix array construction algorithm for big genome data. In ParaBWT, we have investigated a progressive construction approach to constructing the BWT of single genome sequences in linear space complexity, but with a small constant factor. This approach has been further parallelized using multi-threading based on a master-slave coprocessing model. After obtaining the BWT, the suffix array is constructed in a memory-efficient manner. The performance of ParaBWT has b…

0301 basic medicine; Theoretical computer science; Burrows–Wheeler transform; Computer science; Genomics; Data_CODINGANDINFORMATIONTHEORY; Parallel computing; Genome; law.invention; 03 medical and health sciences; law; Genetics; Humans; Ensembl; Multi-core processor; Applied Mathematics; Linear space; Suffix array; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; 030104 developmental biology; Algorithms; Biotechnology; Reference genome; IEEE/ACM Transactions on Computational Biology and Bioinformatics