AUTHOR
Sergio Salemi
Text Compression Using Antidictionaries
We give a new text compression scheme based on Forbidden Words ("antidictionary"). We prove that our algorithms attain the entropy for balanced binary sources. They run in linear time. Moreover, one of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful to search compressed data and allows parallel compression. Our algorithms can also be presented as "compilers" that create compressors dedicated to any previously fixed source. The techniques used in this paper are from Information Theory and Finite Automata.
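The idea behind the scheme above can be illustrated with a minimal Python sketch. It is not the linear-time, automaton-based algorithm of the paper: the antidictionary is built by brute force, and the function names and the `max_len` bound are assumptions made for this illustration. A bit is dropped from the output whenever the antidictionary forbids the opposite bit after the text read so far, and the decoder re-derives the dropped bits from the same antidictionary and the original length.

```python
def minimal_forbidden_words(text, max_len):
    """Brute-force antidictionary: binary words v with len(v) <= max_len that
    do not occur in text, while v[:-1] and v[1:] both occur (illustrative only)."""
    n = len(text)
    factors = {text[i:j] for i in range(n)
               for j in range(i, min(n, i + max_len) + 1)}
    antidict = set()
    for u in factors:
        if len(u) < max_len:
            for c in "01":
                v = u + c
                if v not in factors and v[1:] in factors:
                    antidict.add(v)
    return antidict


def forced_bit(prefix, antidict, max_len):
    """Return the only bit allowed after prefix, or None if both are allowed."""
    for k in range(min(len(prefix), max_len - 1), -1, -1):
        u = prefix[len(prefix) - k:]          # suffix of length k
        for c in "01":
            if u + c in antidict:             # c is forbidden here
                return "1" if c == "0" else "0"
    return None


def compress(text, antidict, max_len):
    """Keep only the bits that the antidictionary cannot predict."""
    return "".join(b for i, b in enumerate(text)
                   if forced_bit(text[:i], antidict, max_len) is None)


def decompress(code, length, antidict, max_len):
    """Rebuild the text from the kept bits, the length, and the antidictionary."""
    out, kept = [], iter(code)
    while len(out) < length:
        f = forced_bit("".join(out), antidict, max_len)
        out.append(f if f is not None else next(kept))
    return "".join(out)
```

For example, with `text = "01" * 8` and `max_len = 3`, the antidictionary is `{"00", "11"}`, every bit after the first is forced, and the compressed output is the single bit `"0"`. The sketch assumes the decoder also knows the antidictionary and the original length.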
A Periodicity Theorem on Words and Applications
We prove a periodicity theorem on words that has strong analogies with the Critical Factorization theorem and we show three applications of it.
Overlap free words on two symbols
On the trace product and some families of languages closed under partial commutation
Binary Patterns in Infinite Binary Words
In this paper we study the set P(w) of binary patterns that can occur in one infinite binary word w, comparing it with the set F(w) of factors of the word. Since the set P(w) can be considered as an extension of the set F(w), we first investigate how large this extension is, by introducing the parameter Δ(w) that corresponds to the cardinality of the difference set P(w) \ F(w). Some nontrivial results about this parameter are obtained in the case of the Thue-Morse and the Fibonacci words. Since, in most cases, the parameter Δ(w) is infinite, we introduce the pattern complexity of w, which corresponds to the complexity of the language P(w). As a main result, we prove that there exist infini…
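To make the notion of a pattern occurring in a word concrete, here is a small Python sketch for finite words. It assumes the usual definition (a pattern p occurs in w if some non-erasing morphism h sends p to a factor of w); the paper's exact definition may impose further conditions, and the function names are illustrative.

```python
from itertools import product


def factors(word):
    """All nonempty factors (contiguous substrings) of a finite word."""
    return {word[i:j] for i in range(len(word))
            for j in range(i + 1, len(word) + 1)}


def pattern_occurs(pattern, word):
    """True if some non-erasing morphism maps pattern onto a factor of word.
    Brute force: every pattern letter is assigned a nonempty factor of word."""
    variables = sorted(set(pattern))
    facs = factors(word)
    for images in product(facs, repeat=len(variables)):
        h = dict(zip(variables, images))
        if "".join(h[v] for v in pattern) in facs:
            return True
    return False


def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word (parity of the binary digit sum)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))
```

With the identity morphism, every factor of a word over {0, 1} is also one of its patterns, which is why P(w) extends F(w). On a prefix of the Thue-Morse word, the pattern `"00"` (a square) occurs while `"000"` (a cube) does not.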
A generalization of Sardinas and Patterson's algorithm to z-codes
This paper concerns the theory of z-codes. The main contribution is an extension of the algorithm of Sardinas and Patterson for deciding whether a finite set of words X is a z-code. To improve the efficiency of this test, we have found a tight upper bound on the length of the shortest words that might have a double z-factorization over X. Some remarks on the complexity of the algorithm are also given. Moreover, a slight modification of this algorithm allows us to compute the z-deciphering delay of X.
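For orientation, the classical Sardinas-Patterson test for ordinary codes can be sketched in Python as follows. This is only the well-known starting point, not the z-code extension described in the abstract, whose details are not given here; the function name is illustrative.

```python
def is_uniquely_decodable(code):
    """Classical Sardinas-Patterson test for a finite set of nonempty words."""
    code = set(code)
    if "" in code:
        return False  # a set containing the empty word is never a code

    def residuals(a, b):
        # Nonempty words w such that x + w = y for some x in a, y in b.
        return {y[len(x):] for x in a for y in b
                if y != x and y.startswith(x)}

    current = residuals(code, code)   # dangling suffixes between codewords
    seen = set()
    while current:
        if current & code:            # a dangling suffix is itself a codeword
            return False
        key = frozenset(current)
        if key in seen:               # the sequence of sets cycles: no ambiguity
            return True
        seen.add(key)
        current = residuals(code, current) | residuals(current, code)
    return True
```

For instance, `{"0", "01", "10"}` is rejected because `"010"` factorizes both as `0·10` and as `01·0`, whereas `{"0", "01", "11"}` is accepted. The loop terminates because every set it builds consists of suffixes of codewords, so only finitely many sets are possible.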
On aperiodic trace languages
Some Decision Results on Nonrepetitive Words
The paper addresses some generalizations of the Thue problem, such as: given a word u, does there exist an infinite nonrepetitive overlap-free (or square-free) word having u as a prefix? A solution to this, as well as to related problems, is given for the case of overlap-free words on a binary alphabet.
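The objects involved can be illustrated with a short Python sketch. It does not implement the decision procedure of the paper: `extendable` is a bounded brute-force search that only checks a necessary condition (u can be prolonged by k letters without creating an overlap), and all function names are assumptions made for this illustration.

```python
def has_overlap(w):
    """True if w contains a factor of length 2p + 1 with period p (an overlap)."""
    n = len(w)
    for p in range(1, n):
        run = 0  # consecutive positions i with w[i] == w[i + p]
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run > p:
                return True
    return False


def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word, a classical overlap-free word."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def extendable(u, k):
    """Bounded search: can u be extended by k binary letters, overlap-free?
    Failing this test rules out an infinite overlap-free extension of u;
    passing it does not prove that one exists."""
    if has_overlap(u):
        return False
    if k == 0:
        return True
    return any(extendable(u + c, k - 1) for c in "01")
```

For example, `has_overlap("01010")` is true (the word has the form axaxa with a = 0 and x = 1), while no prefix of the Thue-Morse word contains an overlap.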
Star-free trace languages
Generalizing a classical result of Schützenberger to free partially commutative monoids, we prove that the family of star-free trace languages coincides with the family of aperiodic trace languages.
Words and Patterns
In this paper some new ideas, problems and results on patterns are proposed. In particular, motivated by questions concerning avoidability, we first study the set of binary patterns that can occur in one infinite binary word, comparing it with the set of factors of the word. This suggests a classification of infinite words in terms of the "difference" between the set of its patterns and the set of its factors. The fact that each factor in an infinite word can give rise to several distinct patterns leads us to study the set of patterns of a single finite word. This set, endowed with a natural order relation, defines a poset: we investigate the relationships between the structure of such a poset…