Search results for "Tribute"
showing 10 items of 1455 documents
AnySeq: A High Performance Sequence Alignment Library based on Partial Evaluation
2020
Sequence alignments are fundamental to bioinformatics, which has resulted in a variety of optimized implementations. Unfortunately, the vast majority of them are hand-tuned and specific to certain architectures and execution models. This not only makes them challenging to understand and extend, but also difficult to port to other platforms. We present AnySeq, a novel library for computing different types of pairwise alignments of DNA sequences. Our approach combines high performance with an intuitively understandable implementation, which is achieved through the concept of partial evaluation. Using the AnyDSL compiler framework, AnySeq enables the compilation of algorithmic variants that ar…
Finding optimal finite biological sequences over finite alphabets: the OptiFin toolbox
2017
In this paper, we present a toolbox for a specific optimization problem that frequently arises in bioinformatics or genomics. In this optimization problem, the state space is a set of words of specified length over a finite alphabet. Each word is associated with a score. The overall objective is to find the words which have the lowest possible score. This type of general optimization problem is encountered in, e.g., 3D conformation optimization for protein structure prediction, or largest core genes subset discovery based on the best supported phylogenetic tree for a set of species. In order to solve this problem, we propose a toolbox that can be easily launched usin…
A Big Data Approach for Sequences Indexing on the Cloud via Burrows Wheeler Transform
2020
Indexing sequence data is important in the context of Precision Medicine, where large amounts of "omics" data have to be collected and analyzed daily in order to categorize patients and identify the most effective therapies. Here we propose an algorithm for the computation of the Burrows Wheeler transform relying on Big Data technologies, i.e., Apache Spark and Hadoop. Our approach is the first that distributes the index computation and not only the input dataset, making it possible to benefit fully from the available cloud resources.
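For readers unfamiliar with the transform being indexed here, the following is a minimal single-machine sketch of the textbook Burrows Wheeler transform and its inverse. It is not the distributed Spark/Hadoop algorithm from the abstract; the function names `bwt` and `inverse_bwt` and the `$` sentinel are illustrative assumptions, and the quadratic-memory rotation sort is only suitable for short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows Wheeler transform: last column of the sorted rotations.

    Assumption: `sentinel` does not occur in `text` and sorts before
    every character of `text` (true for "$" versus ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Build every cyclic rotation of s, sort them, keep the last characters.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending and re-sorting columns."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]
```

Production indexes typically derive the transform from a suffix array rather than from materialized rotations; the sketch only illustrates the definition.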
A gap analysis of Internet-of-Things platforms
2016
We are experiencing an abundance of Internet-of-Things (IoT) middleware solutions that provide connectivity for sensors and actuators to the Internet. To gain a widespread adoption, these middleware solutions, referred to as platforms, have to meet the expectations of different players in the IoT ecosystem, including device providers, application developers, and end-users, among others. In this article, we evaluate a representative sample of these platforms, both proprietary and open-source, on the basis of their ability to meet the expectations of different IoT users. The evaluation is thus more focused on how ready and usable these platforms are for IoT ecosystem players, rather than on t…
FIRST
2018
Thanks to the collective action of participating smartphone users, mobile crowdsensing allows data collection at a scale and pace that was once impossible. The biggest challenge to overcome in mobile crowdsensing is that participants may exhibit malicious or unreliable behavior, thus compromising the accuracy of the data collection process. Therefore, it becomes imperative to design algorithms to accurately classify between reliable and unreliable sensing reports. To address this crucial issue, we propose a novel Framework for optimizing Information Reliability in Smartphone-based participaTory sensing (FIRST) that leverages mobile trusted participants (MTPs) to securely assess the reliabil…
Constrained Role Mining
2013
Role Based Access Control (RBAC) is a very popular access control model, long investigated and widely deployed in the security architecture of different enterprises. To implement RBAC, roles must first be identified within the considered organization. The process of (automatically) defining the roles in a bottom-up way, starting from the permissions assigned to each user, is usually called role mining. In the literature, the role mining problem has been formally analyzed and several techniques have been proposed in order to obtain a set of valid roles. Recently, the problem of defining different kinds of constraints on the number and the size of the roles included in the resu…
Parallel In-Memory Evaluation of Spatial Joins
2019
The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parame…
Burrows Wheeler Transform on a Large Scale: Algorithms Implemented in Apache Spark
2021
With the rapid growth of Next Generation Sequencing (NGS) technologies, large amounts of "omics" data are collected daily and need to be processed. Indexing and compressing large sequence datasets are some of the most important tasks in this context. Here we propose algorithms for the computation of the Burrows Wheeler transform relying on Big Data technologies, i.e., Apache Spark and Hadoop. Our algorithms are the first ones that distribute the index computation and not only the input dataset, making it possible to benefit fully from the available cloud resources.
Concurrent Computing with Shared Replicated Memory
2019
The behavioural theory of concurrent systems states that any concurrent system can be captured by a behaviourally equivalent concurrent Abstract State Machine (cASM). While the theory in general assumes shared locations, it remains valid if different agents can only interact via messages, i.e., sharing is restricted to mailboxes. There may even be a strict separation between memory managing agents and other agents that can only access the shared memory by sending query and update requests to the memory agents. This article is dedicated to an investigation of replicated data that is maintained by a memory management subsystem, whereas the replication neither appears in the requests nor in th…
Self-stabilizing Balls & Bins in Batches
2016
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modeled as static balls-into-bins processes, where $m$ balls (tasks) are to be distributed to $n$ bins (servers). In a seminal work, Azar et al. proposed the sequential strategy Greedy[$d$] for $n=m$. When thrown, a ball queries the load of $d$ random bins and is allocated to a least loaded of these. Azar et al. showed that $d=2$ yields an exponential improvement compared to $d=1$. Berenbrink et al. extended this to $m\gg n$, showing that the maximal load difference is independent of $m$ for $d=2$ (in contrast to $d=1$). W…
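The sequential $d$-choice strategy described in the abstract above can be sketched as a short simulation. This is an illustrative sketch only, not the batched self-stabilizing process the paper goes on to study; the function name `greedy_d` and its parameters are hypothetical, and sampling the $d$ candidate bins independently with replacement is an assumption.

```python
import random


def greedy_d(num_balls: int, num_bins: int, d: int, rng: random.Random) -> list[int]:
    """Throw balls one at a time; each goes to a least loaded of d sampled bins.

    Assumption: the d candidate bins are drawn uniformly and independently
    (with replacement), and ties go to the first sampled candidate.
    """
    loads = [0] * num_bins
    for _ in range(num_balls):
        # Query the current load of d randomly chosen bins.
        candidates = [rng.randrange(num_bins) for _ in range(d)]
        # Allocate the ball to a least loaded candidate.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return loads


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    n = 10_000
    for d in (1, 2):
        loads = greedy_d(num_balls=n, num_bins=n, d=d, rng=rng)
        print(f"d={d}: maximum load {max(loads)}")
```

Running the script for $d=1$ and $d=2$ with $n=m$ gives a rough empirical feel for the gap in maximum load that the abstract attributes to Azar et al.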