Search results for "Parallel process"
showing 10 items of 34 documents
Parallel macro pipelining on the Intel SCC many-core computer
2013
In this paper we present how Intel's Single-Chip-Cloud processor behaves for parallel macro pipeline applications. Subsets of the SCC's available cores can be arranged as a pipeline where each core processes one stage of the overall workload. Each of the independent cores processes a small part of a larger task and feeds the following core with new data after it finishes its work. Our case study is a parallel rendering system which renders successive images and applies different filters to them. On normal graphics adapters this is usually done in multiple cycles; we do it in a single pipeline pass. We show that we can achieve a significant speedup by using multiple parallel pipelines on t…
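The macro pipeline arrangement described in this abstract, where each core handles one stage and feeds the next, can be sketched as follows. This is only an illustration under stated assumptions: Python threads and queues stand in for SCC cores and their message passing, and the names `stage`, `run_pipeline`, and `SENTINEL` are hypothetical, not taken from the paper.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item and pass it on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(funcs, items):
    """Chain one worker per stage; threads stand in for cores here."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results
```

For example, `run_pipeline([render, blur], frames)` would let a second frame be rendered while the first is being blurred, assuming `render` and `blur` are user-supplied stage functions.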
PGAC: A Parallel Genetic Algorithm for Data Clustering
2005
Cluster analysis is a valuable tool for exploratory pattern analysis, especially when very little a priori knowledge about the data is available. Distributed systems, based on high-speed intranet connections, provide new tools for designing new and faster clustering algorithms. Here, a parallel genetic algorithm for clustering called PGAC is described. The parallelization strategy used is the island model paradigm, where different populations of chromosomes (called demes) evolve locally on each processor and from time to time some individuals are moved from one deme to another. Experiments have been performed for testing the benefits of the parallelization paradigm in terms of comput…
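The island model mentioned in this abstract can be illustrated with a minimal sketch. It is not PGAC itself: the fitness function is a toy bit-counting objective rather than a clustering criterion, the demes are evolved one after another instead of on separate processors, and the ring migration policy and all names (`onemax`, `evolve`, `migrate`, `run_islands`) are assumptions made for illustration.

```python
import random


def onemax(bits):
    """Toy fitness (assumption): number of 1-bits in the chromosome."""
    return sum(bits)


def evolve(deme, fitness, rng):
    """One local generation: keep the best, fill up with mutated winners."""
    best = max(deme, key=fitness)
    nxt = [list(best)]  # elitism keeps the deme's best fitness from dropping
    while len(nxt) < len(deme):
        a, b = rng.sample(deme, 2)  # binary tournament
        child = list(a if fitness(a) >= fitness(b) else b)
        child[rng.randrange(len(child))] ^= 1  # single bit-flip mutation
        nxt.append(child)
    return nxt


def migrate(demes, fitness):
    """Ring migration: each deme's best replaces the worst of the next deme."""
    migrants = [list(max(d, key=fitness)) for d in demes]
    for i, deme in enumerate(demes):
        worst = min(range(len(deme)), key=lambda j: fitness(deme[j]))
        deme[worst] = migrants[i - 1]


def run_islands(n_demes=4, deme_size=8, n_bits=16,
                generations=40, interval=5, seed=0):
    """Evolve demes independently, migrating every `interval` generations."""
    rng = random.Random(seed)
    demes = [[[rng.randint(0, 1) for _ in range(n_bits)]
              for _ in range(deme_size)] for _ in range(n_demes)]
    history = []  # best fitness over all demes after each generation
    for gen in range(1, generations + 1):
        demes = [evolve(d, onemax, rng) for d in demes]
        if gen % interval == 0:
            migrate(demes, onemax)
        history.append(max(onemax(c) for d in demes for c in d))
    return demes, history
```

In a distributed setting, each call to `evolve` would run on its own processor and only the migrants would cross the network, which is what keeps communication infrequent.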
Pure Functions in C: A Small Keyword for Automatic Parallelization
2017
The need for parallel task execution has been steadily growing in recent years since manufacturers mainly improve processor performance by increasing the number of installed cores instead of scaling the processor’s frequency. To make use of this potential, an essential technique to increase the parallelism of a program is to parallelize loops. Several automatic loop nest parallelizers, such as PluTo, have been developed in the past. The main restriction of these tools is that the loops must be statically analyzable which, among other things, disallows function calls within the loops. In this article, we present a seemingly simple extension to the C programming language which marks fun…
Louis Jacques Thenard's Chemistry Courses at the Collège de France, 1804–1835
2010
This article is concerned with the public courses and lecture demonstrations given by Louis Jacques Thenard at the Collège de France during the first decades of the nineteenth century. The expectations and needs of Thenard's auditors will be studied in order to understand the role played by chemistry courses at the Collège in the context of the growing and changing Parisian teaching market during the first third of the nineteenth century. The preparation and performance of lecture demonstrations was the main driving force of several major changes in the premises and the personnel associated with the chair of chemistry. Our analysis of the parallel process of expansion and functional differe…
Parallel Simulated Annealing: Getting Super Linear Speedups
2005
The study described in this paper tries to improve and combine different approaches that are able to speed up applications of the Simulated Annealing model. It investigates separately two main aspects concerning the degree of parallelism an implementation can effectively exploit at the initial and final periods of an execution. As for case studies, it deals with two implementations: the Job shop Scheduling problem and the portfolio selection problem. The paper reports the results of a large number of experiments, carried out by means of a transputer network and a hypercube system. They give useful suggestions about selecting the most suitable values of the intervention parameters to achieve su…
One-dimensional hydrodynamic modeling of coronal plasmas on transputer arrays
1990
We describe a concurrent implementation of the Palermo-Harvard hydrodynamic code on cost-effective and modularly expandable transputer arrays. We have tested the effectiveness of our approach by simulating an already well-studied compact solar-flare model on different transputer configurations and compared their performances with those of other machines. We have found that the speed of the concurrent program on an array of 16 T800 transputers is ~1/9 of that of the equivalent code optimized for a CRAY X-MP/48. This work clearly shows that transputer-based arrays provide locally available high computing-power tools to extend the investigation of compact solar flares and similar astroph…
Multiple modular very long instruction word processors based on field programmable gate arrays
2007
Modern field programmable gate array (FPGA) chips, with their large memory capacity and reconfigurability potential, are opening new frontiers in rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance very long instruction word (VLIW) processor core in an FPGA. This paper describes research results about enabling the DSP TMS320 C6201 model for real-time image processing applications by exploiting FPGA technology. We present a modular DSP C6201 VHDL model with a variable instruction set. We call this new development a minimum mandatory modules (M3) approach. Our goals are to keep the flexibility of DSP in order to shor…
Experimental Study of Six Different Implementations of Parallel Matrix Multiplication on Heterogeneous Computational Clusters of Multicore Processors
2010
Two strategies of distribution of computations can be used to implement parallel solvers for dense linear algebra problems for Heterogeneous Computational Clusters of Multicore Processors (HCoMs). These strategies are called Heterogeneous Process Distribution Strategy (HPS) and Heterogeneous Data Distribution Strategy (HDS). They are not novel and have been researched thoroughly. However, the advent of multicores necessitates enhancements to them. In this paper, we present these enhancements. Our study is based on experiments using six applications to perform Parallel Matrix-matrix Multiplication (PMM) on an HCoM employing the two distribution strategies.
Elementary transformation analysis for Array-OL
2009
Array-OL is a high-level specification language dedicated to the definition of multidimensional intensive signal processing applications. It allows both the task parallelism and the data parallelism of these applications to be specified by focusing on their complex multidimensional data access patterns. Several tools exist for implementing an Array-OL specification as a data parallel program. While Array-OL can be used directly, it is often convenient to be able to deduce part of the specification from a sequential version of the application. This paper proposes such an analysis and examines its feasibility and its limits.
The veridical perception of object temperature with varying skin temperature.
1988
The effect of skin-adaptation temperature on object-temperature perception was investigated, using the method of dichiric matching, in an attempt to determine whether veridical perception of physical object temperature occurs in human subjects. Observers were presented with a test temperature on one hand and required to find a matching temperature, that is, one that produced the same sensation, on the other, differently adapted, hand. Using equality of test and matching temperatures as a criterion of veridical perception, it was found that the latter improves with ΔT, the difference between object temperature and skin-adaptation temperature. It is postulated that when ΔT is close to zero, v…