Search results for "Supercomputer"

showing 10 items of 45 documents

parSRA: A framework for the parallel execution of short read aligners on compute clusters

2018

The growth of next-generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework to accelerate the execution of existing short read aligners on distributed-memory systems. parSRA can be used to parallelize a variety of short read alignment tools installed in the system without any modification to their source code. We show that our framework provides good scalability on a compute cluster for accelerating the popular BWA-MEM and Bowtie2 aligners. On average, it is able to accelerate sequence alignments on 16 64-core nodes (in total, 1024 cores) with a speedup of 10.48 …

0301 basic medicine; Source code; Speedup; General Computer Science; Computer science; media_common.quotation_subject; Parallel computing; Supercomputer; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; 030220 oncology & carcinogenesis; Modeling and Simulation; Computer cluster; Scalability; Fuse (electrical); Node (circuits); Partitioned global address space; media_common

Journal of Computational Science

researchProduct

Accelerating metagenomic read classification on CUDA-enabled GPUs.

2016

Metagenomic sequencing studies are becoming increasingly popular, with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification, i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…

0301 basic medicine; Theoretical computer science; Workstation; GPUs; Computer science; Context (language use); CUDA; Parallel computing; Biochemistry; Genome; law.invention; 03 medical and health sciences; CUDA; User-Computer Interface; 0302 clinical medicine; Structural Biology; law; Taxonomic assignment; Humans; Microbiome; Molecular Biology; Internet; Xeon; Applied Mathematics; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Exact k-mer matching; Computer Science Applications; 030104 developmental biology; Titan (supercomputer); Metagenomics; 030220 oncology & carcinogenesis; Metagenomics; DNA microarray; Software

BMC bioinformatics

researchProduct

mD3DOCKxb: An Ultra-Scalable CPU-MIC Coordinated Virtual Screening Framework

2017

Molecular docking is an important method in computational drug discovery. In large-scale virtual screening, millions of small drug-like molecules (chemical compounds) are compared against a designated target protein (receptor). Depending on the docking algorithm used for screening, this can take several weeks on conventional HPC systems. However, for certain applications, including large-scale screening tasks for newly emerging infectious diseases, such high runtimes can be prohibitive. In this paper, we investigate how the massively parallel neo-heterogeneous architecture of the Tianhe-2 supercomputer, consisting of thousands of nodes comprising CPUs and MIC coprocessors that can effic…

0301 basic medicine; Virtual screening; Multi-core processor; Coprocessor; Computer science; business.industry; Parallel computing; Supercomputer; 03 medical and health sciences; 030104 developmental biology; Embedded system; Scalability; Tianhe-2; Algorithm design; business; Massively parallel

2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)

researchProduct

FMapper: Scalable read mapper based on succinct hash index on SunWay TaihuLight

2022

One of the most important applications in bioinformatics is read mapping. With the rapidly increasing number of reads produced by next-generation sequencing (NGS) technology, there is a need for fast and efficient high-throughput read mappers. In this paper, we present FMapper, a highly scalable read mapper on the TaihuLight supercomputer optimized for its fourth-generation ShenWei many-core architecture (SW26010). In order to fully exploit the computational power of the SW26010, we employ dynamic scheduling of tasks, asynchronous I/O and data transfers, and implement a vectorized version of the banded Myers algorithm tailored to the 256-bit vector registers of the SW26010. Our perf…

256-bit; Speedup; Xeon; Computer Networks and Communications; Computer science; Hash function; Parallel computing; SW26010; Supercomputer; Theoretical Computer Science; Artificial Intelligence; Hardware and Architecture; Scalability; Software; Sunway TaihuLight

Journal of Parallel and Distributed Computing

researchProduct

Many-body perturbation theory calculations using the yambo code

2019

yambo is an open-source project aimed at studying excited-state properties of condensed matter systems from first principles using many-body methods. As input, yambo requires ground-state electronic structure data as computed by density functional theory codes such as Quantum ESPRESSO and Abinit. yambo’s capabilities include the calculation of linear response quantities (both independent-particle and including electron–hole interactions), quasi-particle corrections based on the GW formalism, optical absorption, and other spectroscopic quantities. Here we describe recent developments ranging from the inclusion of important but oft-neglected physical effects such as electron–phonon i…

BETHE-SALPETER EQUATION; 02 engineering and technology; 01 natural sciences; Software; real-time dynamics; General Materials Science; quasi-particle; Condensed Matter - Materials Science; parallelism; electron-phonon; real-time dynamic; Computational Physics (physics.comp-ph); 021001 nanoscience & nanotechnology; Supercomputer; MANY-BODY PERTURBATION THEORY; Condensed Matter Physics; bethe-salpeter-equation; optical-properties; optical propertie; temperature-dependence; [PHYS.COND.CM-MS]Physics [physics]/Condensed Matter [cond-mat]/Materials Science [cond-mat.mtrl-sci]; User interface; 0210 nano-technology; Ground state; Physics - Computational Physics; optical properties; monte-carlo; Materials science; Exploit; FOS: Physical sciences; abinit; Settore FIS/03 - Fisica della Materia; Computational science; Kerr effect; 0103 physical sciences; kerr effect; 010306 general physics; electronic excitations; THEORETICAL SPECTROSCOPY; polarization; spin and spinors; business.industry; software; Materials Science (cond-mat.mtrl-sci); Ranging; electronic structure; ABINIT; Interfacing; electron-phonon; electronic structure; Kerr effect; optical properties; parallelism; real-time dynamics; spin and spinors; business; absorption

researchProduct

Big Data in metagenomics: Apache Spark vs MPI.

2020

The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data technologies to process this large amount of information in distributed-memory clusters of commodity hardware. Several approaches based on solutions such as Apache Hadoop or Apache Spark have been proposed. These solutions allow developers to focus on the problem while ignoring low-level details such as data distribution schemes or communication patterns among processing nodes. However, performance and scalability are also of high importance when…

Big Data; Computer and Information Sciences; Science; Big data; Message Passing Interface; Parallel computing; Research and Analysis Methods; Computing Methodologies; Computing Methodologies; Computer Architecture; Computer Software; Database and Informatics Methods; Software; Spark (mathematics); Genetics; Mammalian Genomics; Multidisciplinary; business.industry; Applied Mathematics; Simulation and Modeling; QR; Biology and Life Sciences; Computational Biology; Software Engineering; Genomics; DNA; Genomic Databases; Genome Analysis; Computer Hardware; Supercomputer; Biological Databases; Animal Genomics; Physical Sciences; Scalability; Engineering and Technology; Metagenome; Medicine; Distributed memory; Metagenomics; business; Mathematics; Algorithms; Genome Bacterial; Software; Research Article

PLoS ONE

researchProduct

Mapreduce in computational biology - A synopsis

2017

In the past 20 years, the Life Sciences have witnessed a paradigm shift in the way research is performed. Indeed, the computational part of biological and clinical studies has become central or is becoming so. Correspondingly, the amount of data that one needs to process, compare and analyze has grown exponentially. As a consequence, High Performance Computing (HPC, for short) is being used intensively, in particular in terms of multi-core architectures. However, recently and thanks to the advances in the processing of other scientific and commercial data, Distributed Computing is also being considered for Bioinformatics applications. In particular, the MapReduce paradigm, to…

Bioinformatic; Spark; 0301 basic medicine; Settore INF/01 - Informatica; Bioinformatics; Process (engineering); Computer science; Computer Science (all); Computational biology; bioinformatics; distributed computing; hadoop; MapReduce; spark; computer science (all); Supercomputer; computer.software_genre; Distributed computing; 03 medical and health sciences; 030104 developmental biology; Exponential growth; Hadoop; Paradigm shift; Middleware (distributed applications); Spark (mathematics); MapReduce; computer

researchProduct

Comparison of implementations of the lattice-Boltzmann method

2008

Simplicity of coding is usually an appealing feature of the lattice-Boltzmann method (LBM). Conventional implementations of LBM are often based on the two-lattice or the two-step algorithm, which, however, suffer from high memory consumption and poor computational performance, respectively. The aim of this work was to identify implementations of LBM that would achieve high computational performance with low memory consumption. Effects of memory addressing schemes were investigated in particular. Data layouts for velocity distribution values were also considered, and they were found to be related to computational performance. A novel bundle data layout was therefore introduced. Address…

Computational fluid mechanics; Memory addressing schemes; Computer science; Lattice Boltzmann methods; Parallel computing; Supercomputer; Addressing mode; High memory; Memory address; Computational Mathematics; Computational Theory and Mathematics; Modeling and Simulation; Bundle; Modelling and Simulation; Lattice-Boltzmann method; Implementation; High-performance computing; Coding (social sciences)

Computers & Mathematics with Applications

researchProduct

GROMEX: A Scalable and Versatile Fast Multipole Method for Biomolecular Simulation

2020

Atomistic simulations of large biomolecular systems with chemical variability such as constant pH dynamic protonation offer multiple challenges in high performance computing. One of them is the correct treatment of the involved electrostatics in an efficient and highly scalable way. Here we review and assess two of the main building blocks that will permit such simulations: (1) An electrostatics library based on the Fast Multipole Method (FMM) that treats local alternative charge distributions with minimal overhead, and (2) A $λ$-dynamics module working in tandem with the FMM that enables various types of chemical transitions during the simulation. Our $λ$-dynamics and FMM implementations d…

Computer science; Fast multipole method; 05 social sciences; Fast Fourier transform; 050301 education; Supercomputer; Electrostatics; biomolekyylit; Computational science; Molecular dynamics; CUDA; sähköstatiikka; Particle Mesh; Scalability; Overhead (computing); simulointi; 0501 psychology and cognitive sciences; SIMD; 0503 education; 050104 developmental & child psychology

researchProduct

Towards human cell simulation

2019

The faithful reproduction and accurate prediction of the phenotypes and emergent behaviors of complex cellular systems are among the most challenging goals in Systems Biology. Although mathematical models that describe the interactions among all biochemical processes in a cell are theoretically feasible, their simulation is generally hard for a variety of reasons. For instance, many quantitative data (e.g., kinetic rates) are usually not available, a problem that hinders the execution of simulation algorithms as long as some parameter estimation methods are used. Yet even with a candidate parameterization, the simulation of mechanistic models could be challenging due to the extr…

Constraint-based modeling; Agent-based simulation; Big data; Biochemical simulation; Computational intelligence; Constraint-based modeling; Fuzzy logic; High-performance computing; Model reduction; Multi-scale modeling; Parameter estimation; Reaction-based modeling; Systems biology; Theoretical Computer Science; Computer Science (all); Computer science; Biochemical simulation; Distributed computing; Systems biology; Big data; Computational intelligence; Context (language use); ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI; Theoretical Computer Science; Reduction (complexity); Big data; Parameter estimation; High-performance computing; Computational intelligence; Agent-based simulation; Mathematical model; business.industry; Model reduction; Computer Science (all); Multi-scale modeling; INF/01 - INFORMATICA; Supercomputer; Variety (cybernetics); Fuzzy logic; Reaction-based modeling; business; Systems biology

researchProduct