Search results for "Hash function"
showing 7 items of 27 documents
Classical and Quantum Computations with Restricted Memory
2018
Automata and branching programs are well-known models of computation with restricted memory. These models have been the focus of a large number of researchers over the last decades. Streaming algorithms are a modern model of computation with restricted memory. In this paper, we present recent results on the comparative computational power of quantum and classical models of branching programs and streaming algorithms.
Evaluation of a hash-compress-encrypt pipeline for storage system applications
2015
Great efforts are made to store data in a secure, reliable, and authentic way in large storage systems. Specialized, system-specific clients help to achieve these goals. Nevertheless, standard tools for hashing, compressing, and encrypting data are often arranged in transparent pipelines. We analyze the potential of Unix shell pipelines with several high-speed and high-compression algorithms that can be used to achieve data security, reduction, and authenticity. Furthermore, we compare the pipelines of standard tools against a custom-built pipeline implemented in C++ and show that there is great potential for performance improvement.
RNACache: Fast Mapping of RNA-Seq Reads to Transcriptomes Using MinHashing
2021
The alignment of reads to a transcriptome is an important initial step in a variety of bioinformatics RNA-seq pipelines. As traditional alignment-based tools suffer from high runtimes, alternative, alignment-free methods have recently gained increasing importance. We present a novel approach to the detection of local similarities between transcriptomes and RNA-seq reads based on context-aware minhashing. We introduce RNACache, a three-step processing pipeline consisting of minhashing of k-mers, match-based (online) filtering, and coverage-based filtering in order to identify truly expressed transcript isoforms. Our performance evaluation shows that RNACache produces transcriptomic mappings …
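As a rough illustration of the k-mer minhashing step mentioned in this abstract, the following is a minimal sketch of plain MinHash over k-mers. It is not RNACache's context-aware variant; the function names, the choice of BLAKE2b as the hash, and the parameters `k` and `num_hashes` are assumptions made for the example only.

```python
import hashlib


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def seeded_hash(kmer: str, seed: int) -> int:
    """64-bit hash of a k-mer; the seed selects one of several hash functions."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq: str, k: int = 5, num_hashes: int = 64) -> list[int]:
    """For each seeded hash function, keep the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(seeded_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching signature positions estimates k-mer set similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


if __name__ == "__main__":
    read = "ACGTACGTTGCAACGTAGCTAGCTAGGATCC"
    transcript = "TTACGTACGTTGCAACGTAGCTAGCTAGGATCCAA"
    sim = estimate_jaccard(minhash_signature(read), minhash_signature(transcript))
    print(f"estimated k-mer Jaccard similarity: {sim:.2f}")
```

Comparing short fixed-length signatures instead of full k-mer sets is what makes this kind of alignment-free similarity detection cheap; a real mapper would add indexing and filtering on top.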
Freenet-like GUIDs for implementing xanalogical hypertext
2002
We discuss the use of Freenet-like content hash GUIDs as a primitive for implementing the Xanadu model in a peer-to-peer framework. Our current prototype is able to display the implicit connection (transclusion) between two different references to the same permanent ID. We discuss the next layers required in the implementation of the Xanadu model on a world-wide peer-to-peer network.
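The idea of a content hash GUID can be sketched as follows: the identifier is derived from the bytes themselves, so two independent references to the same content resolve to the same permanent ID. This is only an illustrative sketch; the `content_guid` and `ContentStore` names and the use of SHA-256 are assumptions, not details taken from the paper or from Freenet.

```python
import hashlib


def content_guid(data: bytes) -> str:
    """Derive an identifier purely from the content (SHA-256 is an assumption)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class ContentStore:
    """Hypothetical in-memory store keyed by content hash."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        guid = content_guid(data)
        self._blocks[guid] = data  # identical content maps to the same key
        return guid

    def get(self, guid: str) -> bytes:
        data = self._blocks[guid]
        # The GUID doubles as an integrity check on retrieval.
        if content_guid(data) != guid:
            raise ValueError("stored block does not match its GUID")
        return data


if __name__ == "__main__":
    store = ContentStore()
    span = b"the same quoted passage"
    id_in_doc_a = store.put(span)  # reference made from one document
    id_in_doc_b = store.put(span)  # reference made from another document
    print(id_in_doc_a == id_in_doc_b)  # both references share one permanent ID
```

Because both documents end up holding the same ID, a viewer can detect the shared content by comparing identifiers alone.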
Locality-sensitive hashing enables signal classification in high-throughput mass spectrometry raw data at scale
2021
Mass spectrometry is an important experimental technique in the field of proteomics. However, analysis of certain mass spectrometry data faces a combination of two challenges: first, even a single experiment produces a large amount of multi-dimensional raw data and, second, signals of interest are not single peaks but patterns of peaks that span the different dimensions. The rapidly growing amount of mass spectrometry data increases the demand for scalable solutions. Existing approaches for signal detection are usually not well suited for processing large amounts of data in parallel or rely on strong assumptions concerning the signals' properties. In this study, it is shown that locali…
Perfect Hashing Structures for Parallel Similarity Searches
2015
Seed-based heuristics have proved to be efficient for studying similarity between genetic databases with billions of base pairs. This paper focuses on algorithms and data structures for the filtering phase in seed-based heuristics, with an emphasis on efficient parallel GPU/manycore implementation. We propose a 2-stage index structure which is based on neighborhood indexing and perfect hashing techniques. This structure performs a filtering phase over the neighborhood regions around the seeds in constant time and avoids random memory accesses and branch divergences as much as possible. Moreover, it fits particularly well on parallel SIMD processors, because it requ…
How to Improve the Reliability of Chord?
2008
In this paper we focus on the Chord P2P protocol and study unexpected departures of nodes from the system. Each such departure may result in the loss of information, and in classical versions of the protocol the probability of losing some information is proportional to the quantity of information stored in the system. This effect can be partially mitigated by keeping multiple copies (replicas) of information in the protocol. The replication mechanism has been proposed by many authors. We present a detailed analysis of one variant of blind replication and show that this solution only partially solves the problem. Next we propose two less obvious modifications of the Chord protoco…