Search results for " computing"
showing 10 items of 2075 documents
Efficient smart-camera accelerator: A configurable motion estimator dedicated to video codec
2013
Smart cameras are used in a large range of applications. Usually the smart cameras transmit the video or/and extracted information from the video scene, frequently on compressed format to fit with the application requirements. An efficient hardware accelerator that can be adapted and provide the required coding performances according to the events detected in the video, the available network bandwidth or user requirements, is therefore a key element for smart camera solutions. We propose in this paper to focus on a key part of the compression system: motion estimation. We have developed a flexible hardware implementation of the motion estimator based on FPGA component, fully compatible with…
Hardware Implementation of a Configurable Motion Estimator for Adjusting the Video Coding Performances
2012
International audience; Despite the diversity of video compression standard, the motion estimation still remains a key process which is used in most of them. Moreover, the required coding performances (bit-rate, PSNR, image spatial resolution, etc.) depend obviously of the application, the environment and the network communication. The motion estimation can therefore be adapted to fit with these performances. Meanwhile, the real time encoding is required in many applications. In order to reach this goal, we propose in this paper a hardware implementation of the motion estimator which enables the integer motion search algorithms to be modified and the fractional search and variable block siz…
Enhancing the Sniper Simulator with Thermal Measurement
2014
This paper presents the enhancement of the Sniper multicore / manycore simulator with thermal measurement possibilities using the HotSpot simulator. We present a plugin that interacts with Sniper to retrieve simulation data (integration areas and power consumptions) and calls HotSpot to compute the corresponding thermal results. The plugin also builds a two dimensional floorplan for the simulated microarchitecture. Furthermore we plan to integrate the simulation methodology presented here into an automatic design space exploration process using the multi-objective optimization tool called FADSE. Keywords—multicore; simulator; power consumption; thermal; HotSpot; Sniper
VLBI-resolution radio-map algorithms: Performance analysis of different levels of data-sharing on multi-socket, multi-core architectures
2012
a b s t r a c t A broad area in astronomy focuses on simulating extragalactic objects based on Very Long Baseline Interferometry (VLBI) radio-maps. Several algorithms in this scope simulate what would be the observed radio-maps if emitted from a predefined extragalactic object. This work analyzes the performance and scaling of this kind of algorithms on multi-socket, multi-core architectures. In particular, we evaluate a sharing approach, a privatizing approach and a hybrid approach on systems with complex memory hierarchy that includes shared Last Level Cache (LLC). In addition, we investigate which manual processes can be systematized and then automated in future works. The experiments sh…
Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of Multicore Processors
2010
Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore Processors (HCoMs). These strategies are called Heterogeneous Process Distribution Strategy (HPS) and Heterogeneous Data Distribution Strategy (HDS). They are not novel and have been researched thoroughly. However, the advent of multicores necessitates enhancements to them. In this paper, we present these enhancements. Our study is based on experiments using six applications to perform Parallel Matrix-matrix Multiplication (PMM) on an HCoM employing the two distribution strategies.
Accelerating collision detection for large-scale crowd simulation on multi-core and many-core architectures
2013
The computing capabilities of current multi-core and many-core architectures have been used in crowd simulations for both enhancing crowd rendering and simulating continuum crowds. However, improving the scalability of crowd simulation systems by exploiting the inherent parallelism of these architectures is still an open issue. In this paper, we propose different parallelization strategies for the collision check procedure that takes place in agent-based simulations. These strategies are designed for exploiting the parallelism in both multi-core and many-core architectures like graphic processing units (GPUs). As for the many-core implementations, we analyse the bottlenecks of a previous G…
Suffix Array Construction on Multi-GPU Systems
2019
Suffix arrays are prevalent data structures being fundamental to a wide range of applications including bioinformatics, data compression, and information retrieval. Therefore, various algorithms for (parallel) suffix array construction both on CPUs and GPUs have been proposed over the years. Although providing significant speedup over their CPU-based counterparts, existing GPU implementations share a common disadvantage: input text sizes are limited by the scarce memory of a single GPU. In this paper, we overcome aforementioned memory limitations by exploiting multi-GPU nodes featuring fast NVLink interconnects. In order to achieve high performance for this communication-intensive task, we …
The Random Neural Network Model for the On-line Multicast Problem
2005
In this paper we propose the adoption of the Random Neural Network Model for the solution of the dynamic version of the Steiner Tree Problem in Networks (SPN). The Random Neural Network (RNN) is adopted as a heuristic capable of improving solutions achieved by previously proposed dynamic algorithms. We adapt the RNN model in order to map the network characteristics during a multicast transmission. The proposed methodology is validated by means of extensive experiments.
An Efficient Distributed Algorithm for Generating Multicast Distribution Trees
2005
Multicast transmission may use network resources more efficiently than multiple point-to-point messages; however, creating optimal multicast trees (Steiner Tree Problem in Networks) is prohibitively expensive. For this reason, heuristic methods are generally employed. Conventional centralized Steiner heuristics provide effective solutions, but they are unpractical for large networks, since they require complete knowledge of the network topology. This paper proposes a distributed algorithm for the heuristic solution of the Steiner Tree Problem. The algorithm allows the construction of effective distribution trees using a coordination protocol among the network nodes. The algorithm has been i…
Enabling Retransmissions for Achieving Reliable Multicast Communications in WSNs
2016
To ensure end-to-end reliable multicast or broadcast transmissions in IEEE 802.15.4 based wireless sensor networks WSNs) is a challenging task since no retransmission and acknowledgment mechanisms are defined in such WSNs. In this paper, we propose three retransmission enabled multicast transmission schemes in order to achieve reliable packet transmissions in such networks. Different from the legacy CSMA/CA principle, these schemes allow a sending or forwarding node to retransmit a packet if necessary and enable implicit or/and explicit acknowledgment for multicast services. Simulations are performed in order to assess the performance of these schemes in terms of number of retransmissions, …