Search results for "Parallel"
showing 10 items of 667 documents
Improving computation efficiency using input and architecture features for a virtual screening application
2023
Virtual screening is an early stage of the drug discovery process that selects the most promising candidates. In the urgent computing scenario it is critical to find a solution in a short time frame. In this paper, we focus on a real-world virtual screening application to evaluate out-of-kernel optimizations, that consider input and architecture features to improve the computation efficiency on GPU. Experiment results on a modern supercomputer node show that we can almost double the performance. Moreover, we implemented the optimization using SYCL and it provides a consistent benefit with the CUDA optimization. A virtual screening campaign can use this gain in performance to increase the nu…
On the systolic calculation of all-pairs interactions using transputer arrays
1991
Comparison of implementations of the lattice-Boltzmann method
2008
AbstractSimplicity of coding is usually an appealing feature of the lattice-Boltzmann method (LBM). Conventional implementations of LBM are often based on the two-lattice or the two-step algorithm, which however suffer from high memory consumption and poor computational performance, respectively. The aim of this work was to identify implementations of LBM that would achieve high computational performance with low memory consumption. Effects of memory addressing schemes were investigated in particular. Data layouts for velocity distribution values were also considered, and they were found to be related to computational performance. A novel bundle data layout was therefore introduced. Address…
Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance
2019
The rapid development of the Internet of Things (IoT) has posed important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been developed taking into account high-throughput computing platforms equipped with expensive multichannel audio interfaces, the IoT paradigm is demanding the use of more flexible and energy-efficient systems. In this context, algorithms for source localization and ranging in wireless acoustic sensor networks can be considered an enabling technology for many IoT-based environments, including security, industrial, and health-care applications. This paper is aimed at evaluating impo…
SAUCE: A web application for interactive teaching and learning of parallel programming
2017
Abstract Prevalent hardware trends towards parallel architectures and algorithms create a growing demand for graduate students familiar with the programming of concurrent software. However, learning parallel programming is challenging due to complex communication and memory access patterns as well as the avoidance of common pitfalls such as dead-locks and race conditions. Hence, the learning process has to be supported by adequate software solutions in order to enable future computer scientists and engineers to write robust and efficient code. This paper discusses a selection of well-known parallel algorithms based on C++11 threads, OpenMP, MPI, and CUDA that can be interactively embedded i…
Domain-Knowledge Optimized Simulated Annealing for Network-on-Chip Application Mapping
2013
Network-on-Chip architectures are scalable on-chip interconnection networks. They replace the inefficient shared buses and are suitable for multicore and manycore systems. This paper presents an Optimized Simulated Annealing (OSA) algorithm for the Network-on-Chip application mapping problem. With OSA, the cores are implicitly and dynamically clustered using knowledge about communication demands. We show that OSA is a more feasible Simulated Annealing approach to NoC application mapping by comparing it with a general Simulated Annealing algorithm and a Branch and Bound algorithm, too. Using real applications we show that OSA is significantly faster than a general Simulated Annealing, withou…
"Table 24" of "Measurement of event shape and inclusive distributions at s**(1/2) = 130-GeV and 136-GeV."
1997
3-jet rate for the Jade Algorithm.
"Table 23" of "Measurement of event shape and inclusive distributions at s**(1/2) = 130-GeV and 136-GeV."
1997
2-jet rate for the Jade Algorithm.
"Table 25" of "Measurement of event shape and inclusive distributions at s**(1/2) = 130-GeV and 136-GeV."
1997
4-jet rate for the Jade Algorithm.
"Table 26" of "Measurement of event shape and inclusive distributions at s**(1/2) = 130-GeV and 136-GeV."
1997
5-jet rate for the Jade Algorithm.