Search results for "ComputerSystemsOrganization_PROCESSORARCHITECTURES"
showing 10 items of 12 documents
One-dimensional hydrodynamic modeling of coronal plasmas on transputer arrays
1990
Abstract We describe a concurrent implementation of the Palermo-Harvard hydrodynamic code on cost-effective and modularly expandable transputer arrays. We have tested the effectiveness of our approach by simulating an already well-studied compact solar-flare model on different transputer configurations and compared their performance with that of other machines. We have found that the speed of the concurrent program on a 16-T800 transputer array is ~1/9 of that of the equivalent code optimized for a CRAY X-MP/48. This work clearly shows that transputer-based arrays provide locally available high computing-power tools to extend the investigation of compact solar flares and similar astroph…
NoC Reconfiguration for CMP Virtualization
2011
At the NoC level, traffic interference can be drastically reduced by using virtualization mechanisms. An effective strategy to virtualize a NoC consists of dividing the network into different partitions, each one serving different applications and traffic flows. In this paper, we propose a NoC reconfiguration mechanism to support NoC virtualization under real scenarios. Dynamic reassignment of network resources to different partitions is allowed so that the NoC dynamically adapts to application needs. Evaluation results show good behavior of CMP virtualization.
Design methods of multithreaded architectures for multicore microcontrollers
2011
The development of electronic technology has allowed the implementation of complex architectures, which has led to the emergence of multicore processor technology. Multicore architectures are built from superscalar and multithreaded processors. Integrating new technologies in embedded applications requires the development of multicore processors that can be integrated into a smaller area like a classic microcontroller. These processors must manage fewer resources and be able to manage multiple tasks simultaneously. In this paper we present a method of modeling, simulation and evaluation of two multithreaded architectures with limited resources, which could be integrated into embedded sys…
Sparsity-Driven Digital Terrain Model Extraction
2020
We introduce an automatic Digital Terrain Model (DTM) extraction method. The proposed sparsity-driven DTM extractor (SD-DTM) takes a high-resolution Digital Surface Model (DSM) as input and constructs a high-resolution DTM using a variational framework. To obtain an accurate DTM, an iterative approach is proposed for the minimization of the target variational cost function. The accuracy of the SD-DTM is shown on a real-world DSM data set. We show the efficiency and effectiveness of the approach both visually and quantitatively via residual plots in illustrative terrain types.
Pairwise DNA Sequence Alignment Optimization
2015
This chapter presents a parallel implementation of the Smith-Waterman algorithm to accelerate the pairwise alignment of DNA sequences. This algorithm is especially computationally demanding for long DNA sequences. Parallelization approaches are examined in order to deeply explore the inherent parallelism within Intel Xeon Phi coprocessors. This chapter looks at exploiting instruction-level parallelism within 512-bit single instruction multiple data instructions (vectorization) as well as thread-level parallelism over the many cores (multithreading using OpenMP). Between coprocessors, device-level parallelism through the compute power of clusters including Intel Xeon Phi coprocessors using M…
Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures
2015
In this paper we present new parallelization techniques for searching large-scale biological sequence databases with the Smith-Waterman algorithm on Xeon Phi-based neo-heterogeneous architectures. In order to make full use of the compute power of both the multi-core CPU and the many-core Xeon Phi hardware, we use a collaborative computing scheme as well as hybrid parallelism. On the CPU side, we employ SSE intrinsics and multi-threading to implement SIMD parallelism. On the Xeon Phi side, we use Knights Corner vector instructions to gain more data parallelism. We present two dynamic task distribution schemes (thread level and device level) in order to achieve better load balancing. Fur…
Numerical experiments with a parallel fast direct elliptic solver on Cray T3E
1997
A parallel fast direct O(N log N) solver is briefly described for linear systems with separable block tridiagonal matrices. Good parallel scalability of the proposed method is demonstrated on a Cray T3E parallel computer using MPI for communication. Also, the sequential performance is compared with the well-known BLKTRI implementation of the generalized cyclic reduction method using a single processor of the Cray T3E.
"Table 10" of "Properties of hadronic Z decays and test of QCD generators"
1992
Thrust distribution.
"Table 7" of "Studies of QCD in e+ e- --> hadrons at E(cm) = 130-GeV and 136-GeV."
2000
Distribution of Thrust.
"Table 9" of "Properties of hadronic Z decays and test of QCD generators"
1992
Thrust distribution.