Search results for "parallel computing"

Showing 10 of 189 documents

A Low Cost Solution for 2D Memory Access

2006

Many of the new coding tools in the H.264/AVC video coding standard are based on 2D processing, which results in row-wise and column-wise memory accesses starting from arbitrary memory locations. This paper proposes a low-cost solution for the efficient realization of these 2D block memory accesses on sub-word parallel processors. It is based on simple register-based data permutation networks placed between the processor and memory. The data rearrangement capabilities of the networks can be further extended with more complex control schemes. With the proposed control schemes, the networks enable row and column accesses from arbitrary memory locations for blocks of data while maintaining f…

Flat memory model; Shared memory; Computer science; Interleaved memory; Registered memory; Uniform memory access; Semiconductor memory; Distributed memory; Parallel computing; Memory map

2006 49th IEEE International Midwest Symposium on Circuits and Systems

researchProduct

A mixed geometric-systolic approach to parallel molecular dynamics simulations

1995

We have developed a flexible and efficient method of performing molecular dynamics simulations on distributed-memory parallel computers. The novel feature is the simultaneous use of spatial partitioning and systolic loop approaches according to a strategy which, for a given simulation, adapts itself to the multiprocessor system, allowing it to approach optimal performance. The method assures high efficiencies even in situations in which, due to an exceedingly large number of processors, the use of a pure spatial decomposition would be impossible. The algorithm includes both the pure spatial partitioning and the pure systolic parallelization schemes as particular cases, so that its adoption assu…

Flexibility (engineering); Loop (graph theory); Hardware and Architecture; Computer science; Feature (computer vision); Numerical analysis; Decomposition (computer science); General Physics and Astronomy; Distributed memory; Multiprocessing; Parallel computing; Space partitioning

Computer Physics Communications

researchProduct

LARGE-SCALE SIMULATIONS IN CONDENSED MATTER PHYSICS —THE NEED FOR A TERAFLOP COMPUTER

1992

The introduction of vector processors {“supercomputers” with a performance in the range of 10^9 floating point operations (1 GFLOP) per second} has had an enormous impact on computational condensed matter physics. The possibility of a substantially enhanced performance by massively parallel processors (“teraflop” machines with 10^12 floating point operations per second) will allow satisfactory treatment of a large range of important scientific problems which have to a great extent thus far escaped numerical resolution. The present paper describes only a few examples (out of a long list of interesting research problems!) for which the availability of “teraflops” will allow spectacular progres…

Floating point; Condensed matter physics; Computer science; Scale (chemistry); Monte Carlo method; General Physics and Astronomy; Statistical and Nonlinear Physics; Parallel computing; Large range; FLOPS; Computer Science Applications; Metallic alloy; Range (mathematics); Computational Theory and Mathematics; Massively parallel; Mathematical Physics

International Journal of Modern Physics C

researchProduct

Accelerating bioinformatics applications via emerging parallel computing systems [Guest editorial]

2015

The papers in this issue focus on advanced parallel computing systems for bioinformatics applications. These papers provide a forum to publish recent advances in the handling of bioinformatics problems on emerging parallel computing systems. These systems can be characterized by the different types of parallelism they exploit, including fine-grained versus coarse-grained parallelism and thread-level versus data-level versus request-level parallelism. Hence, parallel computing systems based on multi- and many-core CPUs, many-core GPUs, vector processors, or FPGAs offer the promise to massively accelerate many bioinformatics algorithms and applications, ranging from computeint…

Focus (computing); Parallelism (rhetoric); Computer science; business.industry; Applied Mathematics; Cloud computing; Parallel computing; Bioinformatics; Computing Methodologies; Genetics; Data-intensive computing; Unconventional computing; business; Field-programmable gate array; Massively parallel; Biotechnology

IEEE/ACM Transactions on Computational Biology and Bioinformatics

researchProduct

Use of parallel computing to improve the accuracy of calculated molecular properties

1998

Calculating the electron correlation energy in molecules is unavoidable in accurate studies of chemical reactivity. However, even in the simplest cases, these calculations involve a computational effort several orders of magnitude larger than the computing power currently available. In this work, the possibility of parallelizing the calculation of the electron correlation energy is studied. The formalism chosen is the dressing of matrices, on both distributed-memory and shared-memory MIMD parallel systems. Algorithms developed on PVM are presented, and the results are evaluated on several platforms. These results show that the parallel techniques are useful in order to decrease very appreciably the ti…

Formalism (philosophy of mathematics); Matrix (mathematics); MIMD; Shared memory; Electronic correlation; Computer science; Parallel computing

researchProduct

A prospect for computing in porous materials research: Very large fluid flow simulations

2016

Properties of porous materials, abundant both in nature and industry, have broad influences on societies via, e.g., oil recovery, erosion, and propagation of pollutants. The internal structure of many porous materials involves multiple scales, which hinders research on the relation between structure and transport properties: typically, laboratory experiments cannot distinguish contributions from individual scales, while computer simulations cannot capture multiple scales due to limited capabilities. Thus the question arises of how large a domain can in fact be simulated with modern computers. This question is addressed here using a realistic test case; it is demonstrated that current …

General Computer Science; Computer science; Lattice Boltzmann method; 0208 environmental biotechnology; GPU; Lattice Boltzmann methods; 02 engineering and technology; Parallel computing; 01 natural sciences; Permeability; 010305 fluids & plasmas; Theoretical Computer Science; Computational science; Porous material; Petascale computing; 0103 physical sciences; Fluid dynamics; Fluid flow simulation; Porosity; ta113; ta114; Supercomputer; 020801 environmental engineering; Addressing mode; Permeability (earth sciences); Modeling and Simulation; Porous medium

Journal of Computational Science

researchProduct

On the Use of GPU for Accelerating Communication-Aware Mapping Techniques

2015

Different communication-aware mapping techniques have been proposed in recent years for improving the performance of distributed systems based on both off-chip and on-chip networks. Some of these proposals were based on heuristic search for finding pseudo-optimal assignments of tasks and processing elements. However, improvements in technology integration have allowed a significant increase in the number of network nodes, requiring the acceleration of the heuristic search. In this paper, we present a comparative study of the local search method used in a communication-aware mapping technique when implemented on different parallel architectures. We compare the performance provided by a version…

General Computer Science; business.industry; Computer science; Graphics processing unit; 02 engineering and technology; Parallel computing; Supercomputer; 020202 computer hardware & architecture; Acceleration; 0202 electrical engineering electronic engineering information engineering; Technology integration; 020201 artificial intelligence & image processing; Local search (optimization); Mapping techniques; Architecture; business

The Computer Journal

researchProduct

GPU-laskennan optimointi (Optimization of GPU computing)

2013

Graphics cards, or graphics processing units, provide a parallel computing platform on which program code can be executed by hundreds of cores. This platform makes it possible to solve mathematically laborious problems efficiently. However, the parallel execution environment of a graphics processing unit differs greatly from the sequential execution environment of a computer's central processor. To solve problems efficiently in a parallel environment, one must follow programming methods that are particularly suited to it. This work examines the fundamentals of parallel computing, how different programming methods affect a program's performance on a graphics processing unit, and how one can …

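Graphics processing unit; näytönohjaimet (graphics cards); optimointi (optimization); näytönohjain (graphics card); parallel computing; GPU; rinnakkainen laskenta (parallel computing); Grafiikkasuoritin (graphics processor); CUDA; ohjelmointi (programming); optimization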

researchProduct

A novel hardware accelerator for the HEVC intra prediction

2015

A novel hardware accelerator for High Efficiency Video Coding (HEVC) intra prediction is presented in this paper in order to reduce the computational complexity of this standard and to accelerate the calculations involved. We propose a new pipelined structure, called a Processing Element (PE), that executes all angular modes, and we replicate it in the five paths of which our architecture is composed. We also present another structure to carry out the Planar mode. This architecture supports all intra prediction modes for all prediction unit sizes. The synthesis results show that our design can run at 213 MHz on a Xilinx Virtex 6 and is capable of processing in real time 120 10…

HEVC; 0209 industrial biotechnology; Adder; Virtex; Computer science; Processing element; 020208 electrical & electronic engineering; 1080p; FPGAs; 02 engineering and technology; Parallel computing; Intra prediction; [SPI] Engineering Sciences [physics]; 020901 industrial engineering & automation; Planar; 0202 electrical engineering electronic engineering information engineering; Hardware acceleration; Field-programmable gate array; Coding (social sciences)

researchProduct

A HARDWARE SOLUTION FOR HEVC INTRA PREDICTION LOSSLESS CODING

2015

The lossless coding mode of the High Efficiency Video Coding (HEVC) main profile, which bypasses transform, quantization, and in-loop filters, is described. Compared to the HEVC non-lossless coding mode, the HEVC lossless coding mode provides perfect fidelity and an average bit-rate reduction of 3.2%–13.2%. It also significantly outperforms existing lossless compression solutions, such as JPEG2000 and JPEG-LS for images and WinRAR for data archiving. A fully parallel solution is presented in this paper in order to reduce the processing time and computational complexity resulting from intra prediction. Two higher-performance structures are designed to perform …

HEVC; [INFO.INFO-TI] Computer Science [cs]/Image Processing [eess.IV]; [INFO.INFO-IM] Computer Science [cs]/Medical Imaging; lossless coding; parallel computing; intra prediction; FPGA

researchProduct