Search results for "Shared memory"
Showing 6 of 26 documents
PenRed: An extensible and parallel Monte-Carlo framework for radiation transport based on PENELOPE
2021
Monte Carlo methods provide detailed and accurate results for radiation transport simulations. Unfortunately, the high computational cost of these methods limits their use in real-time applications. Moreover, existing computer codes do not provide a methodology for adapting these kinds of simulations to specific problems without advanced knowledge of the corresponding code system, which restricts their applicability. To help overcome these limitations, we present PenRed, a general-purpose, stand-alone, extensible and modular framework code based on PENELOPE for parallel Monte Carlo simulations of electron-photon transport through matter. It has been implemented in the C++ programming lan…
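Although unrelated to PenRed's actual interfaces, the following minimal C++ sketch only illustrates the general shared-memory pattern the abstract alludes to: independent particle histories run on several std::thread workers, each with a private random generator and a private tally that is merged after joining. The simulateHistory function, the seeds, and the tallied quantity are hypothetical placeholders, not PenRed code.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <thread>
#include <vector>

// Hypothetical single-history kernel; stands in for a real
// electron-photon transport simulation of one particle history.
double simulateHistory(std::mt19937_64& rng) {
    std::exponential_distribution<double> pathLength(1.0);
    return pathLength(rng);  // placeholder for a tracked/deposited quantity
}

int main() {
    const unsigned nThreads  = std::max(1u, std::thread::hardware_concurrency());
    const long     histories = 1'000'000;          // total particle histories
    std::vector<double> tallies(nThreads, 0.0);    // one private tally per thread
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < nThreads; ++t) {
        workers.emplace_back([&, t] {
            std::mt19937_64 rng(12345 + t);        // independent per-thread seed
            const long n = histories / nThreads;
            double local = 0.0;
            for (long i = 0; i < n; ++i)
                local += simulateHistory(rng);
            tallies[t] = local;                    // no shared writes inside the loop
        });
    }
    for (auto& w : workers) w.join();

    const double total = std::accumulate(tallies.begin(), tallies.end(), 0.0);
    std::cout << "mean per history: "
              << total / double((histories / nThreads) * nThreads) << "\n";
}
```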
Axially deformed solution of the Skyrme-Hartree-Fock-Bogolyubov equations using the transformed harmonic oscillator basis (II) HFBTHO v2.00d: a new v…
2012
We describe the new version 2.00d of the code HFBTHO that solves the nuclear Skyrme Hartree-Fock (HF) or Skyrme Hartree-Fock-Bogolyubov (HFB) problem by using the cylindrical transformed deformed harmonic-oscillator basis. In the new version, we have implemented the following features: (i) the modified Broyden method for non-linear problems, (ii) optional breaking of reflection symmetry, (iii) calculation of axial multipole moments, (iv) finite temperature formalism for the HFB method, (v) linear constraint method based on the approximation of the Random Phase Approximation (RPA) matrix for multi-constraint calculations, (vi) blocking of quasi-particles in the Equal Filling Approximation (E…
A Parallel Implementation of the Tree-Structured Self-Organizing Map
2002
This paper presents how Self-Organizing Maps (SOMs) can be trained efficiently using several simultaneously executing threads on a shared-memory Symmetric MultiProcessing (SMP) computer. The training method is a batch version of the Tree-Structured Self-Organizing Map. We note that this SMP type of parallel training is very useful for large data sets obtained from nature, the process industry or large document collections, since it does not suffer from the model-size limitations of hardware SOM implementations.
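As a hedged illustration of the kind of shared-memory batch training the abstract describes (a plain flat batch SOM, not the authors' tree-structured variant), the C++ sketch below lets each thread scan a strided slice of the data, find best-matching units, and accumulate private numerator/denominator sums that are merged into the new codebook after the threads join. The 1-D Gaussian neighborhood and all names are illustrative assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <iostream>
#include <thread>
#include <vector>

using Vec = std::vector<double>;

static double sqDist(const Vec& a, const Vec& b) {
    double d = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) { double t = a[k] - b[k]; d += t * t; }
    return d;
}

// One batch-SOM epoch over 'data', updating 'codebook' in place.
// Threads fill private partial sums; the merge happens after joining.
void batchSomEpoch(std::vector<Vec>& codebook, const std::vector<Vec>& data,
                   double sigma, unsigned nThreads) {
    const std::size_t M = codebook.size(), D = codebook[0].size();
    std::vector<std::vector<Vec>> num(nThreads, std::vector<Vec>(M, Vec(D, 0.0)));
    std::vector<std::vector<double>> den(nThreads, std::vector<double>(M, 0.0));

    auto work = [&](unsigned t) {
        for (std::size_t i = t; i < data.size(); i += nThreads) {   // strided slice
            // find the best-matching unit for data[i]
            std::size_t bmu = 0; double best = sqDist(data[i], codebook[0]);
            for (std::size_t j = 1; j < M; ++j) {
                double d = sqDist(data[i], codebook[j]);
                if (d < best) { best = d; bmu = j; }
            }
            // accumulate with a 1-D Gaussian neighborhood over unit indices
            for (std::size_t j = 0; j < M; ++j) {
                double dj = double(j) - double(bmu);
                double h  = std::exp(-dj * dj / (2.0 * sigma * sigma));
                den[t][j] += h;
                for (std::size_t k = 0; k < D; ++k) num[t][j][k] += h * data[i][k];
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t) pool.emplace_back(work, t);
    for (auto& th : pool) th.join();

    // merge partial sums and write the new codebook
    for (std::size_t j = 0; j < M; ++j) {
        Vec sum(D, 0.0); double w = 0.0;
        for (unsigned t = 0; t < nThreads; ++t) {
            w += den[t][j];
            for (std::size_t k = 0; k < D; ++k) sum[k] += num[t][j][k];
        }
        if (w > 0.0)
            for (std::size_t k = 0; k < D; ++k) codebook[j][k] = sum[k] / w;
    }
}

int main() {
    std::vector<Vec> codebook{{0.0, 0.0}, {1.0, 1.0}, {2.0, 2.0}, {3.0, 3.0}};
    std::vector<Vec> data;
    for (int i = 0; i < 1000; ++i) data.push_back({i % 4 + 0.1, i % 4 - 0.1});
    for (int epoch = 0; epoch < 10; ++epoch)
        batchSomEpoch(codebook, data, /*sigma=*/0.5, /*nThreads=*/4);
    for (const Vec& w : codebook) std::cout << w[0] << ' ' << w[1] << '\n';
}
```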
Implementing Immersive Clustering with VR Juggler
2005
Continuous, rapid improvements in commodity hardware have allowed users of immersive visualization to employ high-quality graphics hardware, high-speed processors, and significant amounts of memory for much lower costs than would be possible with high-end, shared memory computers traditionally used for such purposes. Mimicking the features of a single shared memory computer requires that the commodity computers act in concert—namely, as a tightly synchronized cluster. In this paper, we describe the clustering infrastructure of VR Juggler that enables the use of distributed and clustered computers for the display of immersive virtual environments. We discuss each of the potential ways to syn…
A Lightweight Software Architecture for Robot Navigation and Visual Logging through Environmental Landmarks Recognition
2006
A robot architecture with real-time performance in navigation tasks is presented. The system architecture is multi-threaded, with shared memory and fast message passing through static signalling. In this paper, we focus on the reactive-layer components and their straightforward implementation. The proposed architecture is described with reference to an experimental setup in which the robot's task is the visual logging of environmental landmarks detected on the basis of sensor readings. Our experimental results show how the robot is able to identify, take snapshots of, and log a set of landmarks by matching 2D geometric patterns.
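The paper's implementation is not reproduced here, so the C++ sketch below only illustrates the generic pattern the abstract mentions: threads communicating through shared memory with lightweight signalling. A hypothetical sensor thread posts landmark events to a shared Blackboard structure and notifies a logger thread through a condition variable; all names and the event format are assumptions.

```cpp
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Hypothetical shared "blackboard": a sensing thread posts landmark events,
// a logging thread waits on a condition variable and consumes them.
struct Blackboard {
    std::mutex              m;
    std::condition_variable cv;
    std::queue<std::string> events;
    bool                    done = false;
};

void sensorThread(Blackboard& bb) {
    for (int i = 0; i < 5; ++i) {
        std::this_thread::sleep_for(std::chrono::milliseconds(50)); // fake sensing
        {
            std::lock_guard<std::mutex> lk(bb.m);
            bb.events.push("landmark_" + std::to_string(i));
        }
        bb.cv.notify_one();                       // signal the logger
    }
    { std::lock_guard<std::mutex> lk(bb.m); bb.done = true; }
    bb.cv.notify_one();
}

void loggerThread(Blackboard& bb) {
    std::unique_lock<std::mutex> lk(bb.m);
    while (true) {
        bb.cv.wait(lk, [&] { return !bb.events.empty() || bb.done; });
        while (!bb.events.empty()) {              // drain pending events
            std::cout << "logged " << bb.events.front() << "\n";
            bb.events.pop();
        }
        if (bb.done) break;
    }
}

int main() {
    Blackboard bb;
    std::thread s(sensorThread, std::ref(bb)), l(loggerThread, std::ref(bb));
    s.join(); l.join();
}
```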
Parallel Collision Queries on the GPU
2013
We present parallel algorithms to accelerate collision tests of rigid-body objects for a large number of independent transformations, as they occur in sampling-based motion planning and path validation problems. We compare various GPU approaches with different levels of parallelism against each other and against a parallel CPU implementation. Our algorithms require no sophisticated load-balancing schemes. They make no assumptions about the distribution of the input transformations and require no pre-processing. Yet, we can perform up to 1 million collision tests per second with our best GPU implementation in our benchmarks. This is about 2.5X faster than our reference multi-core CPU implementati…
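As a deliberately simplified point of comparison with the paper's approach (which relies on far more capable collision tests), the C++ sketch below only shows the embarrassingly parallel structure of the problem: each of many independent sample transformations is checked against an obstacle with a trivial sphere-sphere test, partitioned across CPU threads with no load balancing. All geometry and names are illustrative assumptions, not the authors' benchmark.

```cpp
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <thread>
#include <vector>

struct Transform { double tx, ty, tz; };          // translation-only sample pose
struct Sphere    { double x, y, z, r; };          // crude bounding volume

// Sphere-vs-sphere test after applying a sample transform to 'body'.
static bool collides(const Sphere& body, const Transform& T, const Sphere& obstacle) {
    double dx = body.x + T.tx - obstacle.x;
    double dy = body.y + T.ty - obstacle.y;
    double dz = body.z + T.tz - obstacle.z;
    double rr = body.r + obstacle.r;
    return dx * dx + dy * dy + dz * dz <= rr * rr;
}

int main() {
    const Sphere body{0, 0, 0, 0.5}, obstacle{2, 0, 0, 0.5};
    std::vector<Transform> samples(1'000'000);    // independent query poses
    for (std::size_t i = 0; i < samples.size(); ++i)
        samples[i] = {4.0 * double(i) / samples.size(), 0.0, 0.0};

    const unsigned nThreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<char> hit(samples.size(), 0);     // one independent result per query
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nThreads; ++t)
        pool.emplace_back([&, t] {
            for (std::size_t i = t; i < samples.size(); i += nThreads)
                hit[i] = collides(body, samples[i], obstacle);
        });
    for (auto& th : pool) th.join();

    std::size_t n = 0;
    for (char h : hit) n += h;
    std::cout << n << " of " << samples.size() << " poses collide\n";
}
```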