Search results for "distributed"
DESDEO : An Open Framework for Interactive Multiobjective Optimization
2018
We introduce a framework for interactive multiobjective optimization methods called DESDEO, released under an open source license. With the framework, we want to make interactive methods easily accessible for solving real-world problems. The framework follows an object-oriented software design paradigm, where functionalities have been divided into modular, self-contained components. The framework contains implementations of some interactive methods, but also components which can be utilized to implement more interactive methods and, thus, increase the applicability of the framework. To demonstrate how the framework can be used, we consider an example problem where the pollution of…
Subjective Logic-Based In-Network Data Processing for Trust Management in Collocated and Distributed Wireless Sensor Networks
2018
While analyzing an explosive amount of data collected in today’s wireless sensor networks (WSNs), the redundant information in the sensed data needs to be handled. In-network data processing is a technique which can eliminate or reduce such redundancy, leading to minimized resource consumption. On the other hand, trust management techniques establish trust relationships among nodes and detect unreliable nodes. In this paper, we propose two novel in-network data processing schemes for trust management in static WSNs. The first scheme targets networks where sensor nodes are closely collocated to report the same event. Considering both spatial and temporal correlations, this scheme generat…
Ensuring the Reliability of an Autonomous Vehicle
2017
In automotive applications, several components, offering different services, can be composed in order to handle one specific task (autonomous driving for example). Nevertheless, component composition is not straightforward and is subject to the occurrence of bugs resulting from components or services incompatibilities for instance. Hence, bugs detection in component-based systems at the design level is very important, particularly, when the developed system concerns automotive applications supporting critical services. In this paper, we propose a formal approach for modeling and verifying the reliability of an autonomous vehicle system, communicating continuously with of…
Deduplication Potential of HPC Applications’ Checkpoints
2016
HPC systems contain an increasing number of components, decreasing the mean time between failures. Checkpoint mechanisms help to overcome such failures for long-running applications. A viable solution to remove the resulting pressure from the I/O backends is to deduplicate the checkpoints. However, there is little knowledge about the potential to save I/Os for HPC applications by using deduplication within the checkpointing process. In this paper, we perform a broad study about the deduplication behavior of HPC application checkpointing and its impact on system design.
HPG pore: an efficient and scalable framework for nanopore sequencing data.
2016
The use of nanopore technologies is expected to spread in the future because they are portable and can sequence long fragments of DNA molecules without prior amplification. The first nanopore sequencer available, the MinION™ from Oxford Nanopore Technologies, is a USB-connected, portable device that allows real-time DNA analysis. In addition, other new instruments are expected to be released soon, which promise to outperform the current short-read technologies in terms of throughput. Despite the flood of data expected from this technology, the data analysis solutions currently available are only designed to manage small projects and are not scalable. Here we present HPG Pore, a toolkit for …
Next-generation sequencing: big data meets high performance computing
2017
The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and t…
A new parallel pipeline for DNA methylation analysis of long reads datasets
2017
Background: DNA methylation is an important mechanism of epigenetic regulation in development and disease. New generation sequencers allow genome-wide measurements of the methylation status by reading short stretches of the DNA sequence (Methyl-seq). Several software tools for methylation analysis have been proposed over recent years. However, the current trend is that the new sequencers and the ones expected for an upcoming future yield sequences of increasing length, making these software tools inefficient and obsolete. Results: In this paper, we propose a new software based on a strategy for methylation analysis of Methyl-seq sequencing data that requires much shorter execution times while…
SpCLUST: Towards a fast and reliable clustering for potentially divergent biological sequences
2019
This paper presents SpCLUST, a new C++ package that takes a list of sequences as input, aligns them with MUSCLE, computes their similarity matrix in parallel and then performs the clustering. SpCLUST extends a previously released software by integrating additional scoring matrices which enables it to cover the clustering of amino-acid sequences. The similarity matrix is now computed in parallel according to the master/slave distributed architecture, using MPI. Performance analysis, realized on two real datasets of 100 nucleotide sequences and 1049 amino-acid ones, shows that the resulting library substantially outperforms the original Python package. The proposed pac…
On the Use of Binary Trees for DNA Hydroxymethylation Analysis
2017
DNA methylation (mC) and hydroxymethylation (hmC) can have a significant effect on normal human development, health and disease status. Hydroxymethylation studies require specific treatment of DNA, as well as software tools for their analysis. In this paper, we propose a parallel software tool for analyzing the DNA hydroxymethylation data obtained by TAB-seq. The software is based on the use of binary trees for searching the different occurrences of methylation and hydroxymethylation in DNA samples. The binary trees make it possible to efficiently store and access the information about the methylation of each methylated/hydroxymethylated cytosine in the samples. Evaluation results show that the perf…
Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms
2018
Motivation: Information-theoretic and compositional/linguistic analyses of genomes have a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data available now in applications demands resorting to parallel and distributed computing. Indeed, those types of algorithms have been developed to collect k-mer statistics in…
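
The k-mer statistics mentioned in the last abstract can be illustrated with a minimal Python sketch. It is not the Hadoop implementation from that paper: the function names (kmer_counts, split_with_overlap, merge_counts) and the chunking scheme are assumptions made here for illustration only. The sketch counts how often each length-k window over {A,C,G,T} occurs, then shows that counting overlapping chunks independently and merging the partial counts gives the same result, which is the property a parallel or distributed version would rely on.

```python
from collections import Counter

ALPHABET = set("ACGT")

def kmer_counts(sequence: str, k: int) -> Counter:
    """Count every length-k window of `sequence` made only of A, C, G, T."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Windows containing other symbols (e.g. N) are skipped.
        if set(kmer) <= ALPHABET:
            counts[kmer] += 1
    return counts

def split_with_overlap(sequence: str, k: int, chunk_size: int) -> list:
    """Split into chunks that overlap by k-1 symbols (hypothetical helper).

    Each window starting inside a chunk's first `chunk_size` positions is
    fully contained in that chunk, so every window is counted exactly once.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [sequence[start:start + chunk_size + k - 1]
            for start in range(0, len(sequence), chunk_size)]

def merge_counts(partials) -> Counter:
    """Sum partial counts, as a reduce step would do."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

if __name__ == "__main__":
    seq = "ACGTACGTAC"
    direct = kmer_counts(seq, 2)
    merged = merge_counts(kmer_counts(c, 2) for c in split_with_overlap(seq, 2, 4))
    print(direct == merged)
```

In this sketch each chunk could be handed to a separate worker, since the per-chunk counts do not depend on each other and the merge step is a plain sum.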