Search results for "Supercomputer"
showing 10 items of 45 documents
Improving Collective I/O Performance Using Non-volatile Memory Devices
2016
Collective I/O is a parallel I/O technique designed to deliver high performance data access to scientific applications running on high-end computing clusters. In collective I/O, write performance is highly dependent upon the storage system response time and limited by the slowest writer. The storage system response time in conjunction with the need for global synchronisation, required during every round of data exchange and write, severely impacts collective I/O performance. Future Exascale systems will have an increasing number of processor cores, while the number of storage servers will remain relatively small. Therefore, the storage system concurrency level will further increase, worseni…
XLCS: A New Bit-Parallel Longest Common Subsequence Algorithm on Xeon Phi Clusters
2019
Finding the longest common subsequence (LCS) of two strings is a classical problem in bioinformatics. A basic approach to solve this problem is based on dynamic programming. As the biological sequence databases are growing continuously, bit-parallel sequence comparison algorithms are becoming increasingly important. In this paper, we present XLCS, a new parallel implementation to accelerate the LCS algorithm on Xeon Phi clusters by performing bit-wise operations. We have designed an asynchronous IO framework to improve the data transfer efficiency. To make full use of the computing resources of Xeon Phi clusters, we use three levels of parallelism: node-level, thread-level and vector-level.…
A recurrence-free variant of strassen’s algorithm on hypercube
1995
In this paper a non-recursive Strassen’s matrix multiplication algorithm is presented. This new algorithm is suitable to run on parallel environments. Two computational schemes have been worked out exploiting different parallel approaches on hypercube architecture. A comparative analysis is reported. The experiments have been carried out on an nCUBE-2 supercomputer, housed at CNUCE in Pisa, supporting the Express parallel operating system. © 1995, Taylor & Francis Group, LLC. All rights reserved.
Millimeter-Scale and Billion-Atom Reactive Force Field Simulation on Sunway Taihulight
2020
Large-scale molecular dynamics (MD) simulations on supercomputers play an increasingly important role in many research areas. With the capability of simulating charge equilibration (QEq), bonds and so on, Reactive force field (ReaxFF) enables the precise simulation of chemical reactions. Compared to the first principle molecular dynamics (FPMD), ReaxFF has far lower requirements on computational resources so that it can achieve higher efficiencies for large-scale simulations. In this article, we present our efforts on scaling ReaxFF on the Sunway TaihuLight Supercomputer (TaihuLight). We have carefully redesigned the force analysis and neighbor list building steps. By applying fine-grained …
UPC++ for bioinformatics: A case study using genome-wide association studies
2014
Modern genotyping technologies are able to obtain up to a few million genetic markers (such as SNPs) of an individual within a few minutes of time. Detecting epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important but time-consuming operation since statistical computations have to be performed for each pair of measured markers. Therefore, a variety of HPC architectures have been used to accelerate these studies. In this work we present a parallel approach for multi-core clusters, which is implemented with UPC++ and takes advantage of the features available in the Partitioned Global Address Space and Object Oriented Programming models. Our solution is base…
Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS
2014
This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Computer Science. The final authenticated version is available online at: https://doi.org/10.1007/978-3-319-09873-9_57 [Abstract] High-throughput genotyping technologies allow the collection of up to a few million genetic markers (such as SNPs) of an individual within a few minutes of time. Detecting epistasis, such as 2-SNP interactions, in Genome-Wide Association Studies is an important but time consuming operation since statistical computations have to be performed for each pair of measured markers. In this work we present EpistSearch, a parallelized tool that, following the log-linear model appr…
Lattice quantum hadrodynamics on a CRAY Y-MP
1992
Quantum corrections to the mean-field equation of state for nuclear matter are estimated in a lattice simulation of quantum hadrodynamics on a CRAY Y-MP. In contrast with lattice quantum chromodynamics, where coordinate space methods are the standard, the calculations are carried out in momentum space and on nonhypercubic (irregular) lattices. The quantum corrections to the known, mean-field equation of state were found to be considerable. The time frame of the project and the large computational needs of the program required the use of powerful supercomputers, like the CRAY Y-MP, which are capable of performing at a very high computing speed by using both vector and parallel hardware, the …
PROLISEAN: A New Security Protocol for Programmable Matter
2021
The vision for programmable matter is to create a material that can be reprogrammed to have different shapes and to change its physical properties on demand. They are autonomous systems composed of a huge number of independent connected elements called particles. The connections to one another form the overall shape of the system. These particles are capable of interacting with each other and take decisions based on their environment. Beyond sensing, processing, and communication capabilities, programmable matter includes actuation and motion capabilities. It could be deployed in different domains and will constitute an intelligent component of the IoT. A lot of applications can derive fro…
Faster GPU-Accelerated Smith-Waterman Algorithm with Alignment Backtracking for Short DNA Sequences
2014
In this paper, we present a GPU-accelerated Smith-Waterman (SW) algorithm with Alignment Backtracking, called GSWAB, for short DNA sequences. This algorithm performs all-to-all pairwise alignments and retrieves optimal local alignments on CUDA-enabled GPUs. To facilitate fast alignment backtracking, we have investigated a tile-based SW implementation using the CUDA programming model. This tiled computing pattern enables us to more deeply explore the powerful compute capability of GPUs. We have evaluated the performance of GSWAB on a Kepler-based GeForce GTX Titan graphics card. The results show that GSWAB can achieve a performance of up to 56.8 GCUPS on large-scale datasets. Furthermore, ou…
Reconstruction of Low Energy Neutrino Events with GPUs at IceCube
2020
IceCube is a cubic kilometer neutrino observatory located at the South Pole that produces massive amounts of data by measuring individual Cherenkov photons from neutrino interaction events in the energy range from few GeV to several PeV. The actual reconstruction of neutrino events in the GeV range is computationally challenging due to the scarcity of data produced by single events. This can lead to run times of several weeks for the state-of-the-art reconstruction method – Pegleg – on CPUs for typical workloads of many ten-thousand events. We propose a GPU version of Pegleg that probes the likelihood space with several hypotheses in parallel while adapting the amount of parallel sampled hy…