Search results for "Massively parallel"
showing 10 items of 23 documents
Massively parallel computation of atmospheric neutrino oscillations on CUDA-enabled accelerators
2019
The computation of neutrino flavor transition amplitudes through inhomogeneous matter is a time-consuming step and thus could benefit from optimization and parallelization. In addition to reliable parameter estimation of intrinsic physical quantities such as neutrino masses and mixing angles, these transition amplitudes are important in hypothesis testing of potential extensions of the standard model of elementary particle physics, such as additional neutrino flavors. Hence, fast yet precise implementations are of high importance to research. In the recent past, massively parallel accelerators such as CUDA-enabled GPUs featuring thousands of compute units have been widely adopted due to t…
Live demonstration: multiplexing AER asynchronous channels over LVDS Links with Flow-Control and Clock-Correction for Scalable Neuromorphic Systems
2017
Paper presented at the 2017 IEEE International Symposium on Circuits and Systems (ISCAS), held in Baltimore, MD, USA, on 28-31 May 2017.
Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising Model
2010
A modern Graphics Processing Unit (GPU) is able to perform massively parallel scientific computations at low cost. We extend our implementation of the checkerboard algorithm for the two-dimensional Ising model [T. Preis et al., Journal of Computational Physics 228 (2009) 4468–4477] to overcome the memory limitations of a single GPU, which enables us to simulate significantly larger systems. Using multi-spin coding techniques, we are able to accelerate simulations on a single GPU by factors of up to 35 compared to an optimized single Central Processing Unit (CPU) core implementation which employs multi-spin coding. By combining the Compute Unified Device Architecture (CUDA) with the Message P…
LARGE-SCALE SIMULATIONS IN CONDENSED MATTER PHYSICS —THE NEED FOR A TERAFLOP COMPUTER
1992
The introduction of vector processors {“supercomputers” with a performance in the range of 10^9 floating point operations (1 GFLOP) per second} has had an enormous impact on computational condensed matter physics. The possibility of a substantially enhanced performance by massively parallel processors (“teraflop” machines with 10^12 floating point operations per second) will allow satisfactory treatment of a large range of important scientific problems which have to a great extent thus far escaped numerical resolution. The present paper describes only a few examples (out of a long list of interesting research problems!) for which the availability of “teraflops” will allow spectacular progres…
Accelerating bioinformatics applications via emerging parallel computing systems [Guest editorial]
2015
The papers in this issue focus on advanced parallel computing systems for bioinformatics applications. These papers provide a forum to publish recent advances in the improvement of handling bioinformatics problems on emerging parallel computing systems. These systems can be characterized by exploiting different types of parallelism, including fine-grained versus coarse-grained and thread-level parallelism versus data-level parallelism versus request-level parallelism. Hence, parallel computing systems based on multi- and many-core CPUs, many-core GPUs, vector processors, or FPGAs offer the promise to massively accelerate many bioinformatics algorithms and applications, ranging from computeint…
SIMULATING SPIN MODELS ON GPU: A TOUR
2012
The use of graphics processing units (GPUs) in scientific computing has gathered considerable momentum in the past five years. While GPUs in general promise high performance and excellent performance-per-watt ratios, not every class of problems is equally well suited to exploiting the massively parallel architecture they provide. Lattice spin models appear to be prototypical examples of problems suitable for this architecture, at least as long as local update algorithms are employed. In this review, I summarize our recent experience with the simulation of a wide range of spin models on GPU employing an equally wide range of update algorithms, ranging from Metropolis and heat bath updates,…
Architecture-Driven Level Set Optimization: From Clustering to Sub-pixel Image Segmentation
2016
Thanks to their effectiveness, active contour models (ACMs) are of great interest to computer vision scientists. Level set methods (LSMs) form the class of geometric active contours. Compared with other ACMs, they offer subpixel accuracy and the intrinsic ability to handle topological changes automatically. Nevertheless, LSMs are computationally expensive. One solution to their time consumption problem is hardware acceleration using massively parallel devices such as graphics processing units (GPUs). But the question is: what accuracy can we reach while still keeping the algorithm well suited to a massively parallel architecture? In this paper, we attempt to…
Noncovalent force spectroscopy using wide-field optical and diamond-based magnetic imaging
2019
A realization of the force-induced remnant magnetization spectroscopy (FIRMS) technique of specific biomolecular binding is presented where detection is accomplished with wide-field optical and diamond-based magnetometry using an ensemble of nitrogen-vacancy (NV) color centers. The technique may be adapted for massively parallel screening of arrays of nanoscale samples.
Motion analysis using the novelty filter
1991
An original approach to motion analysis, based on the novelty filter, is proposed. The novelty filter stresses the novelties occurring in a pattern representing an image of the scene under consideration with respect to patterns representing previous images of the same scene, so that visual information about the motion of the objects is obtained. The novelty filter may be implemented by a neural network architecture, taking advantage of the capabilities of massive parallelism, adaptive learning and noise robustness. The novelty filter may learn the entire trajectory of an object, through an incremental learning of a sequence of images capturing the scene, thus emphasizing if the…
Investigation of protein folding by coarse-grained molecular dynamics with the UNRES force field.
2010
Coarse-grained molecular dynamics simulations offer a dramatic extension of the time-scale of simulations compared to all-atom approaches. In this article, we describe the use of the physics-based united-residue (UNRES) force field, developed in our laboratory, in protein-structure simulations. We demonstrate that this force field offers about a 4000-times extension of the simulation time scale; this feature arises both from averaging out the fast-moving degrees of freedom and reduction of the cost of energy and force calculations compared to all-atom approaches with explicit solvent. With massively parallel computers, microsecond folding simulation times of proteins containing about 1000 r…