Search results for "Cluster computing"
showing 10 items of 120 documents
Burrows Wheeler Transform on a Large Scale: Algorithms Implemented in Apache Spark
2021
With the rapid growth of Next Generation Sequencing (NGS) technologies, large amounts of "omics" data are collected daily and need to be processed. Indexing and compressing large sequence datasets are among the most important tasks in this context. Here we propose algorithms for computing the Burrows-Wheeler transform that rely on Big Data technologies, i.e., Apache Spark and Hadoop. Our algorithms are the first to distribute the index computation and not only the input dataset, which allows them to benefit fully from the available cloud resources.
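For readers unfamiliar with the transform named in this abstract, the following is a minimal single-machine sketch of the Burrows-Wheeler transform and its inverse. It is not the distributed Spark algorithm the paper proposes. The function names `bwt` and `inverse_bwt` and the `$` end-of-string sentinel are assumptions made for illustration, and the naive rotation sort is only practical for short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform via sorting all rotations.

    Assumes `sentinel` does not occur in `text`. Illustrative only:
    sorting full rotations costs far more than suffix-array methods.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    # Sort rotation start positions by the rotation they denote.
    order = sorted(range(len(s)), key=lambda i: s[i:] + s[:i])
    # The last character of the rotation starting at i is s[i - 1].
    return "".join(s[i - 1] for i in order)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Invert the transform using the first-column / last-column mapping."""
    n = len(last)
    # Stable sort of last-column positions by character: order[j] is the
    # position in `last` of the character that appears in row j of the
    # first column.
    order = sorted(range(n), key=lambda i: (last[i], i))
    first = [last[i] for i in order]
    # The row whose last character is the sentinel is the original text.
    row = last.index(sentinel)
    out = []
    for _ in range(n - 1):
        out.append(first[row])
        row = order[row]
    return "".join(out)


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)
    print(inverse_bwt(transformed))
```

The transform groups characters that share a following context, which is what makes the output useful for both compression and indexing.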
Concurrent Computing with Shared Replicated Memory
2019
The behavioural theory of concurrent systems states that any concurrent system can be captured by a behaviourally equivalent concurrent Abstract State Machine (cASM). While the theory in general assumes shared locations, it remains valid if different agents can only interact via messages, i.e., sharing is restricted to mailboxes. There may even be a strict separation between memory-managing agents and other agents that can only access the shared memory by sending query and update requests to the memory agents. This article is dedicated to an investigation of replicated data that is maintained by a memory management subsystem, whereas the replication neither appears in the requests nor in th…
Self-stabilizing Balls & Bins in Batches
2016
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modeled as static balls-into-bins processes, where $m$ balls (tasks) are to be distributed to $n$ bins (servers). In a seminal work, Azar et al. proposed the sequential strategy Greedy[$d$] for $n=m$. When thrown, a ball queries the load of $d$ random bins and is allocated to a least loaded of these. Azar et al. showed that $d=2$ yields an exponential improvement compared to $d=1$. Berenbrink et al. extended this to $m\gg n$, showing that the maximal load difference is independent of $m$ for $d=2$ (in contrast to $d=1$). W…
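The sequential strategy described in this abstract can be sketched as a small simulation. This is a hedged illustration, not the self-stabilizing batched process the paper studies: the function name `greedy_d` is hypothetical, the $d$ bins are assumed to be sampled independently and uniformly (so repeats are possible), and ties are broken by taking the first least-loaded candidate.

```python
import random


def greedy_d(n_balls: int, n_bins: int, d: int, rng: random.Random) -> list[int]:
    """Throw balls one at a time; each goes to a least-loaded of d sampled bins.

    Assumption: the d candidate bins are drawn independently and
    uniformly at random, with replacement.
    """
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(d)]
        # min() returns the first candidate with the smallest current load.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10_000
    for d in (1, 2):
        loads = greedy_d(n_balls=n, n_bins=n, d=d, rng=rng)
        print(f"d={d}: maximum load {max(loads)}")
```

Running the script for $n=m$ lets one compare the maximum load for $d=1$ and $d=2$ empirically. The exact numbers depend on the random seed.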
Investigating Low Level Protocols for Wireless Body Sensor Networks
2016
The rapid development of medical sensors has increased interest in Wireless Body Area Network (WBAN) applications, where physiological data from the human body and its environment is gathered, monitored, and analyzed so that proper measures can be taken. In WBANs, it is essential to design MAC protocols that ensure adequate Quality of Service (QoS), such as low delay and high scalability. This paper investigates Medium Access Control (MAC) protocols used in WBANs and compares their performance in a high-traffic environment. Such a scenario can arise in an emergency, for example, where physiological data collected from all sensors on the human body should be sent simultaneously to take appropria…
Efficient and accurate monitoring of the depth information in a Wireless Multimedia Sensor Network based surveillance
2017
Wireless Multimedia Sensor Network (WMSN) is a promising technology for capturing rich multimedia data such as audio and video, which can be useful for monitoring an environment under surveillance. However, many scenarios in real-time monitoring require 3D depth information. In this research work, we propose to use the disparity map computed from two or more images to monitor the depth information of an object or event under surveillance using a WMSN. Our system is based on distributed wireless sensors, which allows us to notably reduce the computational time needed for 3D depth reconstruction, thus making real-time solutions feasible. Each pa…
Unified System for Processing Real and Simulated Data in the ATLAS Experiment
2015
The physics goals of the next Large Hadron Collider run include high-precision tests of the Standard Model and searches for new physics. These goals require detailed comparison of data with computational models simulating the expected data behavior. To highlight the role that modeling and simulation play in future scientific discovery, we report on use cases and experience with a unified system built to process both real and simulated data of growing volume and variety.
Online shortest paths with confidence intervals for routing in a time varying random network
2018
The increase in the world's population and rising standards of living are leading to an ever-increasing number of vehicles on the roads, and with it ever-increasing difficulties in traffic management. Traffic management in transport networks can be optimized by using information and communication technologies referred to as Intelligent Transport Systems (ITS). This management problem is usually reformulated as finding the shortest path in a time-varying random graph. In this article, an online shortest path computation using stochastic gradient descent is proposed. This routing algorithm for ITS traffic management is based on the online Frank-Wolfe approach.…
The IceProd framework: distributed data processing for the IceCube neutrino observatory
2015
IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. This paper presents the first detailed description of IceProd, a lightweight distributed management system designed to meet these requirements. It is driven by a central database in order to manage mass production of simulations and analysis of data produced by the IceCube detector. IceProd runs as a separate layer on top of other middleware and can take advantage of a variety of c…
Real-time computation of parameter fitting and image reconstruction using graphical processing units
2016
In recent years graphical processing units (GPUs) have become a powerful tool in scientific computing. Their potential to speed up highly parallel applications brings the power of high performance computing to a wider range of users. However, programming these devices and integrating their use in existing applications is still a challenging task. In this paper we examined the potential of GPUs for two different applications. The first application, created at Paul Scherrer Institut (PSI), is used for parameter fitting during data analysis of μSR (muon spin rotation, relaxation and resonance) experiments. The second application, developed at ETH, is used for PET (Positron Emission T…
P2P-PL: A pattern language to design efficient and robust peer-to-peer systems
2017
Designing peer-to-peer (P2P) software systems is a challenging task because of their highly decentralized nature, which may cause unexpected emergent global behaviors. The last fifteen years have seen many P2P applications emerge and win favor with millions of users. From the success stories of applications like BitTorrent, Skype, and MyP2P we have learnt a number of useful design patterns. Thus, in this article we present a P2P pattern language (P2P-PL for short) which encompasses all the aspects that a fully effective and efficient P2P software system should provide, namely consistency of stored data, redundancy, load balancing, coping with asymmetric bandwidth, and decentralized security. The…