Search results for "ALGORITHMS"
showing 10 items of 1716 documents
Quantum versus Classical Online Streaming Algorithms with Advice
2018
We consider online algorithms with respect to the competitive ratio. Here, we investigate quantum and classical one-way automata with non-constant size of memory (streaming algorithms) as a model for online algorithms. We construct problems that can be solved by quantum online streaming algorithms better than by classical ones in a case of logarithmic or sublogarithmic size of memory, even if classical online algorithms get advice bits. Furthermore, we show that a quantum online algorithm with a constant number of qubits can be better than any deterministic online algorithm with a constant number of advice bits and unlimited computational power.
Quantum versus Classical Online Streaming Algorithms with Logarithmic Size of Memory
2023
We consider online algorithms with respect to the competitive ratio. Here, we investigate quantum and classical one-way automata with non-constant size of memory (streaming algorithms) as a model for online algorithms. We construct problems that can be solved by quantum online streaming algorithms better than by classical ones in a case of logarithmic or sublogarithmic size of memory.
Computing the original eBWT faster, simpler, and with less memory
2021
Mantaci et al. [TCS 2007] defined the eBWT to extend the definition of the BWT to a collection of strings, however, since this introduction, it has been used more generally to describe any BWT of a collection of strings and the fundamental property of the original definition (i.e., the independence from the input order) is frequently disregarded. In this paper, we propose a simple linear-time algorithm for the construction of the original eBWT, which does not require the preprocessing of Bannai et al. [CPM 2021]. As a byproduct, we obtain the first linear-time algorithm for computing the BWT of a single string that uses neither an end-of-string symbol nor Lyndon rotations. We combine our ne…
Substring Complexity in Sublinear Space
2020
Shannon's entropy is a definitive lower bound for statistical compression. Unfortunately, no such clear measure exists for the compressibility of repetitive strings. Thus, ad-hoc measures are employed to estimate the repetitiveness of strings, e.g., the size $z$ of the Lempel-Ziv parse or the number $r$ of equal-letter runs of the Burrows-Wheeler transform. A more recent one is the size $\gamma$ of a smallest string attractor. Unfortunately, Kempa and Prezza [STOC 2018] showed that computing $\gamma$ is NP-hard. Kociumaka et al. [LATIN 2020] considered a new measure that is based on the function $S_T$ counting the cardinalities of the sets of substrings of each length of $T$, also known as …
A Constructive Arboricity Approximation Scheme
2018
The arboricity $\Gamma$ of a graph is the minimum number of forests its edge set can be partitioned into. Previous approximation schemes were nonconstructive, i.e., they only approximated the arboricity as a value without computing a corresponding forest partition. This is because they operate on the related pseudoforest partitions or the dual problem of finding dense subgraphs. We propose an algorithm for converting a partition of $k$ pseudoforests into a partition of $k+1$ forests in $O(mk\log k + m \log n)$ time with a data structure by Brodal and Fagerberg that stores graphs of arboricity $k$. A slightly better bound can be given when perfect hashing is used. When applied to a pseudofor…
Burrows Wheeler Transform on a Large Scale: Algorithms Implemented in Apache Spark
2021
With the rapid growth of Next Generation Sequencing (NGS) technologies, large amounts of "omics" data are daily collected and need to be processed. Indexing and compressing large sequences datasets are some of the most important tasks in this context. Here we propose algorithms for the computation of Burrows Wheeler transform relying on Big Data technologies, i.e., Apache Spark and Hadoop. Our algorithms are the first ones that distribute the index computation and not only the input dataset, allowing to fully benefit of the available cloud resources.
Self-stabilizing Balls & Bins in Batches
2016
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modeled as static balls into bins processes, where $m$ balls (tasks) are to be distributed to $n$ bins (servers). In a seminal work, Azar et al. proposed the sequential strategy \greedy{d} for $n=m$. When thrown, a ball queries the load of $d$ random bins and is allocated to a least loaded of these. Azar et al. showed that $d=2$ yields an exponential improvement compared to $d=1$. Berenbrink et al. extended this to $m\gg n$, showing that the maximal load difference is independent of $m$ for $d=2$ (in contrast to $d=1$). W…
Multi-label Methods for Prediction with Sequential Data
2017
The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If labels indices are considered as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation inves…
Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization With Medical Applications
2019
Medical applications challenge today's text categorization techniques by demanding both high accuracy and ease-of-interpretation. Although deep learning has provided a leap ahead in accuracy, this leap comes at the sacrifice of interpretability. To address this accuracy-interpretability challenge, we here introduce, for the first time, a text categorization approach that leverages the recently introduced Tsetlin Machine. In all brevity, we represent the terms of a text as propositional variables. From these, we capture categories using simple propositional formulae, such as: if "rash" and "reaction" and "penicillin" then Allergy. The Tsetlin Machine learns these formulae from a labelled tex…
Ockham's Razor in Memetic Computing: Three Stage Optimal Memetic Exploration
2012
Memetic computing is a subject in computer science which considers complex structures as the combination of simple agents, memes, whose evolutionary interactions lead to intelligent structures capable of problem-solving. This paper focuses on memetic computing optimization algorithms and proposes a counter-tendency approach for algorithmic design. Research in the field tends to go in the direction of improving existing algorithms by combining different methods or through the formulation of more complicated structures. Contrary to this trend, we instead focus on simplicity, proposing a structurally simple algorithm with emphasis on processing only one solution at a time. The proposed algorit…