Search results for "data structure"
showing 10 items of 441 documents
Learning from Data to Speed-up Sorted Table Search Procedures: Methodology and Practical Guidelines
2020
Sorted Table Search Procedures are the quintessential query-answering tool, with widespread usage that now includes also Web Applications, e.g, Search Engines (Google Chrome) and ad Bidding Systems (AppNexus). Speeding them up, at very little cost in space, is still a quite significant achievement. Here we study to what extend Machine Learning Techniques can contribute to obtain such a speed-up via a systematic experimental comparison of known efficient implementations of Sorted Table Search procedures, with different Data Layouts, and their Learned counterparts developed here. We characterize the scenarios in which those latter can be profitably used with respect to the former, accounting …
On the Complexity of Solving Subtraction Games
2018
We study algorithms for solving Subtraction games, which sometimes are referred to as one-heap Nim games. We describe a quantum algorithm which is applicable to any game on DAG, and show that its query compexity for solving an arbitrary Subtraction game of $n$ stones is $O(n^{3/2}\log n)$. The best known deterministic algorithms for solving such games are based on the dynamic programming approach. We show that this approach is asymptotically optimal and that classical query complexity for solving a Subtraction game is generally $\Theta(n^2)$. This paper perhaps is the first explicit "quantum" contribution to algorithmic game theory.
Lightweight LCP construction for very large collections of strings
2016
The longest common prefix array is a very advantageous data structure that, combined with the suffix array and the Burrows-Wheeler transform, allows to efficiently compute some combinatorial properties of a string useful in several applications, especially in biological contexts. Nowadays, the input data for many problems are big collections of strings, for instance the data coming from "next-generation" DNA sequencing (NGS) technologies. In this paper we present the first lightweight algorithm (called extLCP) for the simultaneous computation of the longest common prefix array and the Burrows-Wheeler transform of a very large collection of strings having any length. The computation is reali…
Finding k -dissimilar paths with minimum collective length
2018
Shortest path computation is a fundamental problem in road networks. However, in many real-world scenarios, determining solely the shortest path is not enough. In this paper, we study the problem of finding k-Dissimilar Paths with Minimum Collective Length (kDPwML), which aims at computing a set of paths from a source s to a target t such that all paths are pairwise dissimilar by at least \theta and the sum of the path lengths is minimal. We introduce an exact algorithm for the kDPwML problem, which iterates over all possible s-t paths while employing two pruning techniques to reduce the prohibitively expensive computational cost. To achieve scalability, we also define the much smaller set …
Improving table compression with combinatorial optimization
2002
We study the problem of compressing massive tables within the partition-training paradigm introduced by Buchsbaum et al. [SODA'00], in which a table is partitioned by an off-line training procedure into disjoint intervals of columns, each of which is compressed separately by a standard, on-line compressor like gzip. We provide a new theory that unifies previous experimental observations on partitioning and heuristic observations on column permutation, all of which are used to improve compression rates. Based on the theory, we devise the first on-line training algorithms for table compression, which can be applied to individual files, not just continuously operating sources; and also a new, …
Some complexity and approximation results for coupled-tasks scheduling problem according to topology
2016
International audience; We consider the makespan minimization coupled-tasks problem in presence of compatibility constraints with a specified topology. In particular, we focus on stretched coupled-tasks, i.e. coupled-tasks having the same sub-tasks execution time and idle time duration. We study several problems in framework of classic complexity and approximation for which the compatibility graph is bipartite (star, chain,. . .). In such a context, we design some efficient polynomial-time approximation algorithms for an intractable scheduling problem according to some parameters.
Algorithms for Computing Abelian Periods of Words
2012
Constantinescu and Ilie (Bulletin EATCS 89, 167--170, 2006) introduced the notion of an \emph{Abelian period} of a word. A word of length $n$ over an alphabet of size $\sigma$ can have $\Theta(n^{2})$ distinct Abelian periods. The Brute-Force algorithm computes all the Abelian periods of a word in time $O(n^2 \times \sigma)$ using $O(n \times \sigma)$ space. We present an off-line algorithm based on a $\sel$ function having the same worst-case theoretical complexity as the Brute-Force one, but outperforming it in practice. We then present on-line algorithms that also enable to compute all the Abelian periods of all the prefixes of $w$.
On Combinatorial Generation of Prefix Normal Words
2014
A prefix normal word is a binary word with the property that no substring has more 1s than the prefix of the same length. This class of words is important in the context of binary jumbled pattern matching. In this paper we present an efficient algorithm for exhaustively listing the prefix normal words with a fixed length. The algorithm is based on the fact that the language of prefix normal words is a bubble language, a class of binary languages with the property that, for any word w in the language, exchanging the first occurrence of 01 by 10 in w results in another word in the language. We prove that each prefix normal word is produced in O(n) amortized time, and conjecture, based on expe…
Pattern detection in ordered graphs
2023
A popular way to define or characterize graph classes is via forbidden subgraphs or forbidden minors. These characterizations play a key role in graph theory, but they rarely lead to efficient algorithms to recognize these classes. In contrast, many essential graph classes can be recognized efficiently thanks to characterizations of the following form: there must exist an ordering of the vertices such that some ordered pattern does not appear, where a pattern is basically an ordered subgraph. These pattern characterizations have been studied for decades, but there have been recent efforts to better understand them systematically. In this paper, we focus on a simple problem at the core of th…
Algorithms for Anti-Powers in Strings
2018
Abstract A string S [ 1 , n ] is a power (or tandem repeat) of order k and period n / k if it can be decomposed into k consecutive equal-length blocks of letters. Powers and periods are fundamental to string processing, and algorithms for their efficient computation have wide application and are heavily studied. Recently, Fici et al. (Proc. ICALP 2016) defined an anti-power of order k to be a string composed of k pairwise-distinct blocks of the same length ( n / k , called anti-period). Anti-powers are a natural converse to powers, and are objects of combinatorial interest in their own right. In this paper we initiate the algorithmic study of anti-powers. Given a string S, we describe an op…