Search results for "Parallel computing"

Showing 10 of 189 documents

A Methodology for the Analysis of Memory Response to Radiation through Bitmap Superposition and Slicing

2015

A methodology is proposed for the statistical analysis of memory radiation test data, with the aim of identifying trends in the single-event upset (SEU) distribution. The treated case study is a 65 nm SRAM irradiated with neutrons, protons, and heavy ions.
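The superposition-and-slicing idea can be illustrated with a minimal sketch: per-run upset bitmaps are summed cell by cell, and the resulting count map is sliced into row and column totals, the marginals in which clustering trends (e.g. multiple cell upsets) become visible. The function names and the 0/1 bitmap encoding are assumptions for illustration, not the paper's actual tooling.

```python
def superpose(bitmaps):
    """Superpose per-run upset bitmaps (2D lists of 0/1) into a
    per-cell upset count map."""
    rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for bm in bitmaps:
        for r in range(rows):
            for c in range(cols):
                counts[r][c] += bm[r][c]
    return counts

def slice_totals(counts):
    """Slice the superposed map into per-row and per-column totals."""
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    return row_tot, col_tot
```

A cell whose count grows across runs faster than the row/column background suggests a weak cell rather than a random SEU.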

Keywords: computer science; bitmap slicing; parallel computing; radiation; slicing; upset; superposition principle; static random-access memory (SRAM); memories; static test; dynamic test; nuclear experiment; multiple cell upset (MCU); SER; bitmap; radiation test; event accumulation; single event upset (SEU); algorithm; test data

Numerical experiments with a parallel fast direct elliptic solver on Cray T3E

1997

A parallel fast direct O(N log N) solver for linear systems with separable block tridiagonal matrices is briefly described. Good parallel scalability of the proposed method is demonstrated on a Cray T3E parallel computer using MPI for communication. The sequential performance is also compared with the well-known BLKTRI implementation of the generalized cyclic reduction method using a single processor of the Cray T3E.
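The abstract's generalized cyclic reduction operates on block tridiagonal systems; the scalar-tridiagonal version below is a minimal sketch of the same odd-even idea, assuming n = 2^k − 1 unknowns and zero padding (a[0] = c[-1] = 0). It is not the paper's BLKTRI or block variant, only an illustration of why the method parallelizes: all eliminations within one level are independent.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by odd-even cyclic reduction.

    a: sub-diagonal (a[0] must be 0), b: diagonal,
    c: super-diagonal (c[-1] must be 0), d: right-hand side.
    Requires len(b) == 2**k - 1.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    h = 1
    while 2 * h - 1 < n:                      # forward reduction
        for i in range(2 * h - 1, n, 2 * h):  # independent per i
            al = a[i] / b[i - h]
            ga = c[i] / b[i + h]
            b[i] -= al * c[i - h] + ga * a[i + h]
            d[i] -= al * d[i - h] + ga * d[i + h]
            a[i] = -al * a[i - h]
            c[i] = -ga * c[i + h]
        h *= 2
    x = [0.0] * n
    h = (n + 1) // 2
    while h >= 1:                             # back substitution
        for i in range(h - 1, n, 2 * h):
            xl = x[i - h] if i - h >= 0 else 0.0
            xr = x[i + h] if i + h < n else 0.0
            x[i] = (d[i] - a[i] * xl - c[i] * xr) / b[i]
        h //= 2
    return x
```

Each level halves the number of coupled equations, giving the O(N log N) operation count while exposing a full level of parallel work per step.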

Keywords: tridiagonal matrix; computer science; linear system; parallel algorithm; parallel computing; solver; matrix (mathematics); scalability; Poisson's equation; time complexity; cyclic reduction; block (data storage)

Parallelization strategies for density matrix renormalization group algorithms on shared-memory systems

2003

Shared-memory parallelization (SMP) strategies for density matrix renormalization group (DMRG) algorithms enable the treatment of complex systems in solid state physics. We present two different approaches by which parallelization of the standard DMRG algorithm can be accomplished in an efficient way. The methods are illustrated with DMRG calculations of the two-dimensional Hubbard model and the one-dimensional Holstein-Hubbard model on contemporary SMP architectures. The parallelized code shows good scalability up to at least eight processors and allows us to solve problems which exceed the capability of sequential DMRG calculations.
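A DMRG sweep is typically dominated by large matrix-vector products inside an iterative eigensolver, so a generic shared-memory sketch of that kernel conveys the parallelization pattern: partition rows across a thread pool while all workers share the input vector read-only. The function names are hypothetical, and this is not the paper's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_matvec(matrix, vec, workers=8):
    """Row-partitioned y = A @ x on a thread pool.

    Each worker owns a contiguous row block; the blocks share
    `vec` read-only, so no synchronization is needed on writes
    (each y[i] is owned by exactly one worker).
    """
    n = len(matrix)
    y = [0.0] * n

    def rows(lo, hi):
        for i in range(lo, hi):
            y[i] = sum(a * b for a, b in zip(matrix[i], vec))

    chunk = (n + workers - 1) // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for lo in range(0, n, chunk):
            pool.submit(rows, lo, min(lo + chunk, n))
    return y
```

Note that CPython's GIL serializes pure-Python arithmetic, so real speedup requires kernels that release the GIL (NumPy/BLAS, C extensions); the sketch only shows the partitioning pattern.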

Keywords: density matrix; numerical analysis; strongly correlated electrons (cond-mat.str-el); Hubbard model; applied mathematics; density matrix renormalization group; complex system; parallel computing; renormalization group; computational mathematics; shared memory; modeling and simulation; scalability; quantum gases; algorithm; mathematics
Journal of Computational Physics

Parallelization of a Lattice Boltzmann Suspension Flow Solver

2002

We have applied a parallel lattice Boltzmann method to simulate suspension flow. The complex behaviour of suspension flow cannot be captured by analytical methods, so simulations are the only way to study it. Usually the size of an interesting problem is so large that the computation time on a single processor is prohibitive; this can be addressed with a parallel program. We have written a parallel suspension flow solver and tested it on massively parallel computers. The measured performance of our program shows that the parallelization of the suspension particles was successful. We also show that over one million particles can be simulated.
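The usual way to parallelize a lattice Boltzmann solver is domain decomposition with ghost (halo) layers. The sketch below shows the pattern sequentially, with a periodic 3-point averaging stencil standing in for the streaming/collision step; the stencil and function names are illustrative assumptions, not the paper's method.

```python
def step_global(u):
    """One global update of a periodic 3-point averaging stencil
    (a stand-in for an LBM streaming/collision step)."""
    n = len(u)
    return [(u[(i - 1) % n] + u[i] + u[(i + 1) % n]) / 3.0
            for i in range(n)]

def step_decomposed(u, parts):
    """Same update via domain decomposition: each subdomain gets one
    ghost cell per side copied from its neighbours, updates its
    interior independently, and the results are concatenated.
    The per-part updates are what an MPI solver runs in parallel."""
    n = len(u)
    out = []
    chunk = n // parts
    for p in range(parts):
        lo = p * chunk
        hi = n if p == parts - 1 else lo + chunk
        # local array with halo cells from the periodic neighbours
        local = [u[(lo - 1) % n]] + u[lo:hi] + [u[hi % n]]
        out.extend((local[j - 1] + local[j] + local[j + 1]) / 3.0
                   for j in range(1, len(local) - 1))
    return out
```

Only the one-cell-deep halo has to be communicated per step, which is why LBM scales well on massively parallel machines.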

Keywords: soft condensed matter; computer science; lattice (order); suspension flow; parallel algorithm; lattice Boltzmann methods; collision detection; parallel computing; solver; computational science

Cell-List based Molecular Dynamics on Many-Core Processors: A Case Study on Sunway TaihuLight Supercomputer

2020

Molecular dynamics (MD) simulations are playing an increasingly important role in several research areas. The most frequently used potentials in MD simulations are pair-wise potentials. Due to the memory wall, computing pair-wise potentials on many-core processors is usually memory bound. In this paper, we take the SW26010 processor as an exemplary platform to explore the possibility of breaking the memory bottleneck by improving data reuse via cell-list-based methods. We use cell lists instead of neighbor lists in the potential computation, and apply a number of novel optimization methods. These methods include: an adaptive replica arrangement strategy, a parameter profile data structur…
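The cell-list idea itself is standard and can be sketched briefly: particles are binned into cells of side at least the cutoff radius, so each particle only checks its own and neighbouring cells instead of all N particles, which is the data-reuse pattern the paper optimizes. A minimal pure-Python sketch, assuming a periodic cubic box; this is not the SW26010 implementation.

```python
def cell_list_pairs(coords, box, cutoff):
    """All particle pairs within `cutoff` in a periodic cubic box,
    found via a cell list (minimum-image convention)."""
    ncell = max(1, int(box / cutoff))   # cell side >= cutoff
    side = box / ncell
    cells = {}
    for idx, (x, y, z) in enumerate(coords):
        key = (int(x / side) % ncell, int(y / side) % ncell,
               int(z / side) % ncell)
        cells.setdefault(key, []).append(idx)

    def dist2(p, q):
        # squared minimum-image distance
        return sum(min(abs(u - v), box - abs(u - v)) ** 2
                   for u, v in zip(p, q))

    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    nb = ((cx + dx) % ncell, (cy + dy) % ncell,
                          (cz + dz) % ncell)
                    for i in members:
                        for j in cells.get(nb, ()):
                            if i < j and dist2(coords[i], coords[j]) < cutoff ** 2:
                                pairs.add((i, j))
    return pairs
```

With neighbor lists the pair set must be rebuilt and stored per particle; with cell lists only the (much smaller) binning is stored, trading memory traffic for a few redundant distance checks.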

Keywords: coprocessor; cell lists; computer science; replica; parallel computing; supercomputer; data structure; bottleneck; molecular dynamics; scalability; Sunway TaihuLight
SC20: International Conference for High Performance Computing, Networking, Storage and Analysis

Pairwise DNA Sequence Alignment Optimization

2015

This chapter presents a parallel implementation of the Smith-Waterman algorithm to accelerate the pairwise alignment of DNA sequences. This algorithm is especially computationally demanding for long DNA sequences. Parallelization approaches are examined in order to deeply explore the inherent parallelism within Intel Xeon Phi coprocessors. This chapter looks at exploiting instruction-level parallelism within 512-bit single instruction multiple data instructions (vectorization) as well as thread-level parallelism over the many cores (multithreading using OpenMP). Between coprocessors, device-level parallelism through the compute power of clusters including Intel Xeon Phi coprocessors using M…
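The algorithm being accelerated is the classic Smith-Waterman dynamic program; a scalar reference version makes clear what the vectorized kernels compute. A minimal sketch with a linear gap penalty and assumed scoring parameters (the chapter's actual scores and affine-gap details may differ):

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score, linear gap penalty.

    H[i][j] is the best score of a local alignment ending at
    s1[i-1]/s2[j-1]; clamping at 0 is what makes the alignment
    *local*.  Cells on one anti-diagonal depend only on earlier
    anti-diagonals, which is the independence axis that SIMD
    vectorizations exploit.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if s1[i - 1] == s2[j - 1]
                                      else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

A 512-bit vector register holds 16 such 32-bit cells, so one anti-diagonal update can advance 16 cells per instruction, which is the instruction-level parallelism the chapter targets.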

Keywords: coprocessor; computer science; multithreading; vectorization; parallelism; SIMD; parallel computing; intrinsics; instruction-level parallelism; Xeon Phi

Versatile Direct and Transpose Matrix Multiplication with Chained Operations: An Optimized Architecture Using Circulant Matrices

2016

With growing demands in real-time control, classification, and prediction, algorithms become more complex while low-power, small-size devices are required. Matrix multiplication (direct or transpose) is common to such computation algorithms. Numerous algorithms also require repeated matrix multiplication, where the result of one multiplication is multiplied again. This work describes a versatile computation procedure and architecture: one of the matrices is stored in internal memory in its circulant form; then a sequence of direct or transpose multiplications can be performed without timing penalty. The architecture proposes a RAM-ALU block for each matrix c…
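The property the architecture exploits can be sketched in software: a circulant matrix is fully determined by its first column, and both C·x and Cᵀ·x can be formed from that single stored column, so direct and transpose products chain without reshuffling memory. A minimal sketch, not the paper's RAM-ALU design:

```python
def circ_mul(col, x, transpose=False):
    """Multiply a circulant matrix, stored as its first column, by x.

    C[i][j] = col[(i - j) % n], hence C^T[i][j] = col[(j - i) % n]:
    the transpose product needs no separate storage, only a
    different index direction into the same column.
    """
    n = len(col)
    if transpose:
        return [sum(col[(j - i) % n] * x[j] for j in range(n))
                for i in range(n)]
    return [sum(col[(i - j) % n] * x[j] for j in range(n))
            for i in range(n)]
```

Chained operations such as Cᵀ(C·x) then reuse the same n stored coefficients throughout.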

Keywords: cycles per instruction; block matrix; parallel computing; matrix chain multiplication; matrix multiplication; theoretical computer science; matrix (mathematics); hardware and architecture; transpose; multiplication; arithmetic; circulant matrix; software; mathematics
IEEE Transactions on Computers

Numerical study of nonlinear and dispersive partial differential equations (Etude numérique d'équations aux dérivées partielles non linéaires et dispersives)

2011

Numerical analysis has become a powerful resource in the study of partial differential equations (PDEs), allowing one to illustrate existing theorems and to find conjectures. Sophisticated methods make it possible to address, in an approximate way, questions that previously seemed inaccessible, such as rapid oscillations or blow-up of solutions. Rapid oscillations appear in solutions of dispersive PDEs without dissipation whose non-dispersive counterparts develop shocks. Resolving these oscillations numerically requires efficient methods that avoid artificial numerical dissipation, in particular in the study of PDEs in several dimensions, as done in this work. As stud…
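One of the keyword methods, the integrating-factor (Lawson) approach, is simple to sketch for a scalar model problem u' = λu + N(u): substituting v = e^(−λt)·u removes the stiff linear term exactly, and forward Euler on v gives u_{n+1} = e^(λh)·(u_n + h·N(u_n)), so no artificial dissipation enters through the linear part. The scalar setting and function names are illustrative assumptions; in the thesis the linear operator acts in Fourier space.

```python
import math

def integrating_factor_euler(lam, nonlin, u0, t_end, steps):
    """Lawson (integrating-factor) Euler for u' = lam*u + N(u).

    The linear part is integrated exactly via the factor
    exp(lam*h); only the nonlinearity N is approximated.
    """
    h = t_end / steps
    u = u0
    for _ in range(steps):
        u = math.exp(lam * h) * (u + h * nonlin(u))
    return u
```

For N = 0 the scheme reproduces the exact solution regardless of step size, which is exactly the property that makes it attractive for stiff dispersive problems.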

Keywords: Davey-Stewartson systems; dispersive equations; dispersive shocks; exponential time-differencing; spectral methods; numerical methods; split-step schemes; operator splitting schemes; Kadomtsev-Petviashvili equation; integrating factor method; parallel computing; blow-up; exponential integrators

Random Slicing: Efficient and Scalable Data Placement for Large-Scale Storage Systems

2014

The ever-growing amount of data requires highly scalable storage solutions. The most flexible approach is to use storage pools that can be expanded and scaled down by adding or removing storage devices. To make this approach usable, it is necessary to provide a solution to locate data items in such a dynamic environment. This article presents and evaluates the Random Slicing strategy, which incorporates lessons learned from table-based, rule-based, and pseudo-randomized hashing strategies and is able to provide a simple and efficient strategy that scales up to handle exascale data. Random Slicing keeps a small table with information about previous storage system insert and remove operations…
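The core lookup mechanism can be sketched as a small interval table over [0, 1): each device owns a slice proportional to its capacity, and a key is hashed into [0, 1) and located by binary search. This toy rebuilds the table from scratch, whereas Random Slicing proper carefully reuses slices from previous configurations to minimize data movement; names and the SHA-256 choice are illustrative assumptions.

```python
import bisect
import hashlib

def build_table(devices):
    """Toy interval table: split [0, 1) into one slice per device,
    proportional to capacity.  devices: list of (name, capacity)."""
    total = sum(cap for _, cap in devices)
    bounds, names, acc = [], [], 0.0
    for name, cap in devices:
        acc += cap / total
        bounds.append(acc)
        names.append(name)
    bounds[-1] = 1.0          # guard against float rounding
    return bounds, names

def locate(key, table):
    """Hash the key into [0, 1) and binary-search its interval."""
    bounds, names = table
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    u = h / 2.0 ** 256
    return names[bisect.bisect_right(bounds, u)]
```

Because lookup only needs the small table plus one hash, it stays O(log devices) per request no matter how many data items exist.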

Keywords: design; computer science; distributed computing; performance; storage management; hash function; parallel computing; slicing; randomized data distribution; randomness; experimentation; scalability; pseudorandom number generator (PRNG); reliability; data format; hardware and architecture; computer data storage; table (database); networking and telecommunications

Randomized renaming in shared memory systems.

2021

Renaming is a task in distributed computing where n processes are assigned new names from a name space of size m. The problem is called tight if m = n, and loose if m > n. In recent years renaming came to the fore again and new algorithms were developed. For tight renaming in asynchronous shared memory systems, Alistarh et al. describe a construction based on the AKS network that assigns all names within O(log n) steps per process. They also show that, depending on the size of the name space, loose renaming can be done considerably faster. For m = (1 + ε)·n and constant ε, they achieve a step complexity of O(log log n). In this paper we consider tight as well as loos…
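The basic loose-renaming intuition can be shown with a toy simulation: each process repeatedly draws a random name from a space of size m = (1 + ε)·n and keeps it if no other process holds it. The round-robin retry loop stands in for asynchronous interleaving; this is a sketch of the general idea, not the paper's algorithm or its shared-memory registers.

```python
import random

def loose_renaming(n, eps=1.0, seed=0):
    """Simulate randomized loose renaming for n processes in a
    name space of size m = (1 + eps) * n.

    With constant slack eps, the collision probability per draw is
    bounded away from 1, so expected retries per process stay small.
    """
    rng = random.Random(seed)
    m = int((1 + eps) * n)
    taken = set()
    names = [None] * n
    pending = list(range(n))
    while pending:
        nxt = []
        for p in pending:
            cand = rng.randrange(m)
            if cand in taken:        # collision: retry next round
                nxt.append(p)
            else:
                taken.add(cand)
                names[p] = cand
        pending = nxt
    return names
```

Shrinking eps toward 0 (tight renaming) makes late draws collide almost surely, which is why tight constructions need heavier machinery such as the AKS-network approach cited above.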

Keywords: discrete mathematics; shared memory model; speedup; computer networks and communications; computer science; parallel computing; theoretical computer science; randomized algorithm; task (computing); constant (computer programming); shared memory; artificial intelligence; hardware and architecture; asynchronous communication; distributed algorithm; overhead (computing); software