Search results for "Computer hardware"
showing 10 items of 378 documents
Run-time scalable NoC for FPGA based virtualized IPs
2017
The integration of virtualized FPGA-based hardware accelerators into cloud computing is progressing steadily. As the FPGA has limited resources, the dynamic partial reconfiguration capability of the FPGA is considered as a way to share resources among different virtualized IPs during runtime. On the other hand, the NoC is a promising solution for communication among virtualized FPGA-based IPs. However, not all the virtualized regions of the FPGA will be active all the time. When there is no demand for virtualized IPs, the virtualized regions are loaded with blank bitstreams to save power. However, keeping active the idle components of the NoC connecting with the idle virtualized regions is …
Towards LST split-window algorithm FPGA implementation for CubeSats on-board computations purposes
2019
Nano, pico, and the so-called CubeSat satellites are taking place due to the emergent improvements in both high-performance nano and pico electronics and computational technologies. More th...
Efficient Parallel Sort on AVX-512-Based Multi-Core and Many-Core Architectures
2019
Sorting kernels are a fundamental part of numerous applications. The performance of sorting implementations is usually limited by a variety of factors such as computing power, memory bandwidth, and branch mispredictions. In this paper we propose an efficient hybrid sorting method which takes advantage of wide vector registers and the high bandwidth memory of modern AVX-512-based multi-core and many-core processors. Our approach employs a combination of vectorized bitonic sorting and load-balanced multi-threaded merging. Thread-level and data-level parallelism are used to exploit both compute power and memory bandwidth. Our single-threaded implementation is ~30x faster than qsort in the C st…
Online Scheduling of Task Graphs on Hybrid Platforms
2018
Modern computing platforms commonly include accelerators. We target the problem of scheduling applications modeled as task graphs on hybrid platforms made of two types of resources, such as CPUs and GPUs. We consider that task graphs are uncovered dynamically, and that the scheduler has information only on the available tasks, i.e., tasks whose predecessors have all been completed. Each task can be processed by either a CPU or a GPU, and the corresponding processing times are known. Our study extends a previous \(4\sqrt{m/k}\)-competitive online algorithm [2], where m is the number of CPUs and k the number of GPUs (\(m\ge k\)). We prove that no online algorithm can have a competitive ratio …
Multi-application Based Network-on-Chip Design for Mesh-of-Tree Topology Using Global Mapping and Reconfigurable Architecture
2019
This paper outlines a multi-application mapping for Mesh-of-Tree (MoT) topology based Network-on-Chip (NoC) design using a reconfigurable architecture. A two-phase Particle Swarm Optimization (PSO) has been proposed for the reconfigurable architecture to minimize the communication cost. In the first phase, global mapping is done by combining multiple applications, and in the second phase, reconfiguration is achieved by switching the cores to nearby routers using multiplexers. Experiments have been carried out for several application benchmarks and synthetic applications generated using the TGFF tool. The results show significant improvement in terms of communication cost after reconfiguration.
Torus Topology based Fault-Tolerant Network-on-Chip Design with Flexible Spare Core Placement
2018
The increase in the density of the IP cores being fabricated on a chip poses on-chip communication and heat dissipation challenges. To overcome these issues, Network-on-Chip (NoC) based communication architecture is introduced. In the nanoscale era, NoCs are prone to faults, which result in performance degradation and unreliability. Hence, efficient fault-tolerant methods are required to make the system reliable in the face of diverse component failures. This paper presents a flexible spare core placement in torus topology based fault-tolerant NoC design. Communication related to the failed core is taken care of by selecting the best position for a spare core in the torus network. By conside…
Wireless NoC for Inter-FPGA Communication: Theoretical Case for Future Datacenters
2020
Integration of FPGAs in datacenters might have different motivations, from acceleration to energy efficiency, but the goal of better performance tops all. FPGAs are being utilized in a variety of ways today, tightly coupled with heterogeneous computing resources, and as a standalone network of homogeneous resources. Open source software stacks, proprietary toolchains, and programming languages with advanced methodologies are hitting hard on the programmability wall of the FPGAs. The deployment of FPGAs in datacenters will be neither sustainable nor economical without realizing multi-tenancy across multiple FPGAs. Inter-FPGA communication among multiple FPGAs remained relatively less addressed p…
A segmentation algorithm for noisy images
2005
This paper presents a segmentation algorithm for gray-level images and addresses issues related to its performance on noisy images. It formulates an image segmentation problem as a partition of a weighted image neighborhood hypergraph. To overcome the computational difficulty of directly solving this problem, a multilevel hypergraph partitioning has been used. To evaluate the algorithm, we have studied how noise affects the performance of the algorithm. The alpha-stable noise is considered and its effects on the algorithm are studied. Keywords: graph, hypergraph, neighborhood hypergraph, multilevel hypergraph partitioning, image segmentation, noise removal.
On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization
2017
The growing interest in incorporating new features into mobile devices has increased the number of signal processing applications running on processors designed for mobile computing. A challenging signal processing field is acoustic source localization, which is attractive for applications such as automatic camera steering systems, human-machine interfaces, video gaming or audio surveillance. In this context, the emergence of systems-on-chip (SoC) that contain a small graphics accelerator (or GPU) contributes a notable increase in computational capacity while partially retaining the appealing low power consumption of embedded systems. This is the case, for example, of the Sam…
Robotic geometric and volumetric inspection of high value and large scale aircraft wings
2019
Increased demands in performance and production rates require a radical new approach to the design and manufacturing of aircraft wings. Performance of modern robotic manipulators has enabled research and development of fast automated non-destructive testing (NDT) systems for complex geometries. This paper presents recent outcomes of work aimed at removing the bottleneck due to data acquisition rates, to fully exploit the scanning speed of modern 6-DoF manipulators. The geometric assessment of the parts is carried out with a robotised dynamic laser scanner encoded through an absolute laser tracker. This method allows scanning speeds up to 330 mm/s at 1 mm pitch. State of the art ultrasonic ins…