Search results for "Combinatorics on words"
showing 9 items of 49 documents
On The Maximum Number of Abelian Squares in a Word
2014
Strings (aka sequences or words) form the most basic and natural data structure. They occur whenever information is electronically transmitted (as bit streams), when natural language text is spoken or written down (as words over, for example, the Latin alphabet), in the process of heredity transmission in living cells (through DNA sequences) or the protein synthesis (assequence of amino acids), and in many more different contexts
Words, Trees and Automata Minimization
2013
In this paper we explore some connections between some combinatorial properties of words and the study of extremal cases of the automata minimization process. An intermediate role is played by the notion od word trees for which some properties of words are generalized. In particular, we describe an infinite family of binary automata, called word automata and constructed by using standard sturmian words and more specifically Fibonacci words, that represent the extremal case of some well known automata minimization algorithms, such as Moore’s and Hopcroft’s methods. As well as giving an overview of the main results in this context, the main purpose of this paper is to prove that, even if a re…
A New Class of Searchable and Provably Highly Compressible String Transformations
2019
The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the "myriad virtues" of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search direc…
Special factors and the combinatorics of suffix and factor automata
2011
AbstractThe suffix automaton (resp. factor automaton) of a finite word w is the minimal deterministic automaton recognizing the set of suffixes (resp. factors) of w. We study the relationships between the structure of the suffix and factor automata and classical combinatorial parameters related to the special factors of w. We derive formulae for the number of states of these automata. We also characterize the languages LSA and LFA of words having respectively suffix automaton and factor automaton with the minimal possible number of states.
The Shuffle Product: New Research Directions
2015
In this paper we survey some recent researches concerning the shuffle operation that arise both in Formal Languages and in Combinatorics on Words.
The Burrows-Wheeler Transform between Data Compression and Combinatorics on Words
2013
The Burrows-Wheeler Transform (BWT) is a tool of fundamental importance in Data Compression and, recently, has found many applications well beyond its original purpose. The main goal of this paper is to highlight the mathematical and combinatorial properties on which the outstanding versatility of the $BWT$ is based, i.e. its reversibility and the clustering effect on the output. Such properties have aroused curiosity and fervent interest in the scientific world both for theoretical aspects and for practical effects. In particular, in this paper we are interested both to survey the theoretical research issues which, by taking their cue from Data Compression, have been developed in the conte…
On the suffix automaton with mismatches
2007
International audience; In this paper we focus on the construction of the minimal deterministic finite automaton S_k that recognizes the set of suffixes of a word w up to k errors. We present an algorithm that makes use of S_k in order to accept in an efficient way the language of all suffixes of w up to k errors in every window of size r, where r is the value of the repetition index of w. Moreover, we give some experimental results on some well-known words, like prefixes of Fibonacci and Thue-Morse words, and we make a conjecture on the size of the suffix automaton with mismatches.
BALANCE PROPERTIES AND DISTRIBUTION OF SQUARES IN CIRCULAR WORDS
2010
We study balance properties of circular words over alphabets of size greater than two. We give some new characterizations of balanced words connected to the Kawasaki-Ising model and to the notion of derivative of a word. Moreover we consider two different generalizations of the notion of balance, and we find some relations between them. Some of our results can be generalized to non periodic infinite words as well.
Functional Information, Biomolecular Messages and Complexity of BioSequences and Structures
2010
In the quest for a mathematical measure able to capture and shed light on the dual notions of information and complexity in biosequences, Hazen et al. have introduced the notion of Functional Information (FI for short). It is also the result of earlier considerations and findings by Szostak and Carothers et al. Based on the experiments by Charoters et al., regarding FI in RNA binding activities, we decided to study the relation existing between FI and classic measures of complexity applied on protein-DNA interactions on a genome-wide scale. Using classic complexity measures, i.e, Shannon entropy and Kolmogorov Complexity as both estimated by data compression, we found that FI applied to pro…