Search results for "Core"
showing 10 items of 1999 documents
Enhancing the Sniper Simulator with Thermal Measurement
2014
This paper presents the enhancement of the Sniper multicore/manycore simulator with thermal measurement capabilities using the HotSpot simulator. We present a plugin that interacts with Sniper to retrieve simulation data (integration areas and power consumption) and calls HotSpot to compute the corresponding thermal results. The plugin also builds a two-dimensional floorplan for the simulated microarchitecture. Furthermore, we plan to integrate the simulation methodology presented here into an automatic design space exploration process using the multi-objective optimization tool FADSE. Keywords: multicore; simulator; power consumption; thermal; HotSpot; Sniper
Multithreaded Translation of Ptolemy II Designs on Multicore Platforms
2008
Ptolemy II is an open source environment for system design and test based on component data flow. This paradigm tries to make parallel systems more deterministic and understandable. In this work we propose a technique to translate designs developed with Ptolemy II into multithreaded Java implementations on multicore platforms. We have chosen Java mainly because Ptolemy II is itself implemented in Java, which allows direct code reuse. The drawback is a certain amount of overhead, which we expect to become less relevant as the Java runtime environment evolves. The main goals are to produce efficient parallel simulators and software devices with a competitive performance level. We show by means of an e…
VLBI-resolution radio-map algorithms: Performance analysis of different levels of data-sharing on multi-socket, multi-core architectures
2012
A broad area in astronomy focuses on simulating extragalactic objects based on Very Long Baseline Interferometry (VLBI) radio-maps. Several algorithms in this scope simulate what the observed radio-maps would be if emitted from a predefined extragalactic object. This work analyzes the performance and scaling of this kind of algorithm on multi-socket, multi-core architectures. In particular, we evaluate a sharing approach, a privatizing approach and a hybrid approach on systems with a complex memory hierarchy that includes a shared Last Level Cache (LLC). In addition, we investigate which manual processes can be systematized and then automated in future work. The experiments sh…
Multicore optical fibres for astrophotonics
2011
We report progress towards multimode (MM) fibre filters for suppressing the OH emission that hinders ground-based observation of the early Universe. Fibre Bragg gratings (FBGs) can filter these narrow spectral lines in single-mode (SM) fibres [1]. Implementing them in MM fibres well-matched to astronomical instruments requires transitions between the MM fibre and several SM fibres [2]. Such hand-crafted “photonic lanterns” require many identical FBGs to be made and spliced in place. Instead we are pursuing the idea in multicore (MC) fibres, Fig. 1(a). The FBG is written at once in all the SM cores. The fibre is jacketed with low-index glass and tapered to form the core and cladding of a MM …
Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of Multicore Processors
2010
Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore Processors (HCoMs). These strategies are called Heterogeneous Process Distribution Strategy (HPS) and Heterogeneous Data Distribution Strategy (HDS). They are not novel and have been researched thoroughly. However, the advent of multicores necessitates enhancements to them. In this paper, we present these enhancements. Our study is based on experiments using six applications to perform Parallel Matrix-matrix Multiplication (PMM) on an HCoM employing the two distribution strategies.
Accelerating collision detection for large-scale crowd simulation on multi-core and many-core architectures
2013
The computing capabilities of current multi-core and many-core architectures have been used in crowd simulations both for enhancing crowd rendering and for simulating continuum crowds. However, improving the scalability of crowd simulation systems by exploiting the inherent parallelism of these architectures is still an open issue. In this paper, we propose different parallelization strategies for the collision check procedure that takes place in agent-based simulations. These strategies are designed to exploit the parallelism of both multi-core architectures and many-core architectures such as graphics processing units (GPUs). As for the many-core implementations, we analyse the bottlenecks of a previous G…
Suffix Array Construction on Multi-GPU Systems
2019
Suffix arrays are prevalent data structures that are fundamental to a wide range of applications, including bioinformatics, data compression, and information retrieval. Therefore, various algorithms for (parallel) suffix array construction on both CPUs and GPUs have been proposed over the years. Although they provide significant speedup over their CPU-based counterparts, existing GPU implementations share a common disadvantage: input text sizes are limited by the scarce memory of a single GPU. In this paper, we overcome the aforementioned memory limitations by exploiting multi-GPU nodes featuring fast NVLink interconnects. In order to achieve high performance for this communication-intensive task, we …
Flexible VLIW processor based on FPGA for real-time image processing
2011
Modern FPGA chips, with their larger memory capacity and reconfigurability potential, are opening new frontiers in rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance Very Long Instruction Word (VLIW) processor core in an FPGA. With a VLIW architecture, processor effectiveness depends on the ability of compilers to extract sufficient Instruction Level Parallelism (ILP) from program code. This paper describes research results on enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors in order to shorten the developm…
Design and Implementation of a Low-cost Embedded Iris Recognition System on a Dual-core Processor Platform
2012
The design of a low-cost embedded iris recognition system is described in this paper. Firstly, we develop a simple and effective iris image acquisition unit, which is cheap and easy to use. This is achieved through both hardware design and the development of an image evaluation algorithm. Secondly, the iris recognition algorithm is introduced, including iris segmentation, image normalization, feature extraction, and code matching. The algorithm implementation architecture is based on an embedded dual-core processor platform, the Texas Instruments TMS320DM6446 evaluation module (Davinci), which contains an ARM core and a DSP core in one chip. Thirdly, the evaluation experiments are performed on the est…
VIBPACK: A package to treat multidimensional electron-vibrational molecular problems with application to magnetic and optical properties
2018
We present a FORTRAN code based on a new, powerful and efficient computational approach to solving multidimensional dynamic Jahn-Teller and pseudo Jahn-Teller problems. This symmetry-assisted approach, which constitutes the theoretical core of the program, is based on the full exploration of the point symmetry of the electronic and vibrational states. We also report selected examples of increasing complexity, intended to illustrate the theoretical background as well as the advantages and capabilities of the program in evaluating the energy pattern and the magnetic and optical properties of large multimode vibronic systems. © 2018 Wiley Periodicals, Inc.