0000000001330950

AUTHOR

Filippo Mignosi

showing 47 related works from this author

Text Compression Using Antidictionaries

1999

International audience; We give a new text compression scheme based on Forbidden Words ("antidictionary"). We prove that our algorithms attain the entropy for balanced binary sources. They run in linear time. Moreover, one of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful to search compressed data and allows parallel compression. Our algorithms can also be presented as "compilers" that create compressors dedicated to any previously fixed source. The techniques used in this paper are from Information Theory and Finite Automata.

Theoretical computer scienceFinite-state machineComputer science[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]010102 general mathematicsforbidden wordData_CODINGANDINFORMATIONTHEORY0102 computer and information sciencesInformation theory01 natural sciencesfinite automatonParallel compressionpattern matching010201 computation theory & mathematicsEntropy (information theory)Pattern matching0101 mathematicsTime complexityAlgorithmdata compressioninformation theoryData compression
researchProduct

Periodicity, morphisms, and matrices

2003

In 1965, Fine and Wilf proved the following theorem: if (fn)n≥0 and (gn)n≥0 are periodic sequences of real numbers, of period lengths h and k, respectively, and fn = gn for 0 ≤ n > h + k - gcd(h,k), then fn = gn for all n ≥ 0. Furthermore, the constant h + k - gcd(h,k) is best possible. In this paper, we consider some variations on this theorem. In particular, we study the case where fn ≤ gn, instead of fn = gn. We also obtain generalizations to more than two periods.We apply our methods to a previously unsolved conjecture on iterated morphisms, the decreasing length conjecture: if h : Σ* → Σ* is a morphism with |Σ|= n, and w is a word with |w| < |h(w)| < |h2(w)| < ... < |hk(w)|, then k ≤ n.

PeriodicityConjectureGeneral Computer Science010102 general mathematicsSturmian wordSturmian wordIterated morphism0102 computer and information sciences01 natural sciencesTheoretical Computer ScienceCombinatoricsMorphism010201 computation theory & mathematicsMatrix algebraIterated function0101 mathematicsWord (group theory)Real numberMathematicsComputer Science(all)Theoretical Computer Science
researchProduct

Automata and forbidden words

1998

Abstract Let L ( M ) be the (factorial) language avoiding a given anti-factorial language M . We design an automaton accepting L ( M ) and built from the language M . The construction is effective if M is finite. If M is the set of minimal forbidden words of a single word ν, the automaton turns out to be the factor automaton of ν (the minimal automaton accepting the set of factors of ν). We also give an algorithm that builds the trie of M from the factor automaton of a single word. It yields a nontrivial upper bound on the number of minimal forbidden words of a word.

TheoryofComputation_COMPUTATIONBYABSTRACTDEVICES[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Büchi automaton0102 computer and information sciences02 engineering and technologyω-automaton01 natural sciencesTheoretical Computer ScienceCombinatoricsDeterministic automaton0202 electrical engineering electronic engineering information engineeringTwo-way deterministic finite automatonNondeterministic finite automatonMathematicsPowerset constructionLevenshtein automaton020206 networking & telecommunicationsComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Nonlinear Sciences::Cellular Automata and Lattice GasesComputer Science ApplicationsTheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES010201 computation theory & mathematicsSignal ProcessingProbabilistic automatonComputer Science::Programming LanguagesComputer Science::Formal Languages and Automata TheoryInformation Systems
researchProduct

The rightmost equal-cost position problem.

2013

LZ77-based compression schemes compress the input text by replacing factors in the text with an encoded reference to a previous occurrence formed by the couple (length, offset). For a given factor, the smallest is the offset, the smallest is the resulting compression ratio. This is optimally achieved by using the rightmost occurrence of a factor in the previous text. Given a cost function, for instance the minimum number of bits used to represent an integer, we define the Rightmost Equal-Cost Position (REP) problem as the problem of finding one of the occurrences of a factor whose cost is equal to the cost of the rightmost one. We present the Multi-Layer Suffix Tree data structure that, for…

FOS: Computer and information sciencesOffset (computer science)Computer scienceSuffix treeComputer Science - Information Theorylaw.inventionCombinatoricslawLog-log plotComputer Science - Data Structures and AlgorithmsCompression schemetext compressiondictionary text compressionData Structures and Algorithms (cs.DS)LZ77 compressiondata compressionLossless compressionfull text indexSuffix Tree Data StructuresSettore INF/01 - InformaticaInformation Theory (cs.IT)Data structurePrefixCompression ratioCompression scheme; Constant time; Suffix Tree Data StructuresAlgorithmData compressionConstant time
researchProduct

Variations on a Theorem of Fine &amp; Wilf

2001

In 1965, Fine & Wilf proved the following theorem: if (fn)n≥0 and (gn)n≥0 are periodic sequences of real numbers, of periods h and k respectively, and fn = gn for 0 ≤ n ≤ h+k-gcd(h, k), then fn = gn for all n ≥ 0. Furthermore, the constant h + k - gcd(h, k) is best possible. In this paper we consider some variations on this theorem. In particular, we study the case where fn ≤ gn instead of fn = gn. We also obtain a generalization to more than two periods.

CombinatoricsNumber theoryPeriodic sequenceArithmeticPeriod lengthMathematicsReal number
researchProduct

Forbidden Factors and Fragment Assembly

2001

In this paper methods and results related to the notion of minimal forbidden words are applied to the fragment assembly problem. The fragment assembly problem can be formulated, in its simplest form, as follows: reconstruct a word w from a given set I of substrings (fragments ) of a word w . We introduce an hypothesis involving the set of fragments I and the maximal length m(w) of the minimal forbidden factors of w . Such hypothesis allows us to reconstruct uniquely the word w from the set I in linear time. We prove also that, if w is a word randomly generated by a memoryless source with identical symbol probabilities, m(w) is logarithmic with respect to the size of w . This result shows th…

CombinatoricsSet (abstract data type)Fragment (logic)LogarithmDeterministic automatonSymbol (programming)General MathematicsTime complexitySoftwareWord (computer architecture)SubstringComputer Science ApplicationsMathematicsRAIRO - Theoretical Informatics and Applications
researchProduct

Languages with mismatches

2007

AbstractIn this paper we study some combinatorial properties of a class of languages that represent sets of words occurring in a text S up to some errors. More precisely, we consider sets of words that occur in a text S with k mismatches in any window of size r. The study of this class of languages mainly focuses both on a parameter, called repetition index, and on the set of the minimal forbidden words of the language of factors of S with errors. The repetition index of a string S is defined as the smallest integer such that all strings of this length occur at most in a unique position of the text S up to errors. We prove that there is a strong relation between the repetition index of S an…

Combinatorics on wordsApproximate string matchingGeneral Computer ScienceRepetition (rhetorical device)String (computer science)Search engine indexingComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Approximate string matchingData structureTheoretical Computer ScienceCombinatoricsSet (abstract data type)Formal languagesCombinatorics on words Formal languages Approximate string matching IndexingIndexingWord (group theory)MathematicsInteger (computer science)Computer Science(all)Theoretical Computer Science
researchProduct

A Note on a Conjecture of Duval and Sturmian Words

2002

We prove a long standing conjecture of Duval in the special case of Sturmian words. Mathematics Subject Classication. ??????????????. Let U be a nonempty word on a nite alphabet A: A nonempty word B dierent from U is called a border of U if B is both a prex and sux of U: We say U is bordered if U admits a border, otherwise U is said to be unbordered. For example, U = 011001011 is bordered by the factor 011; while 00010001001 is unbordered. An integer 1 k n is a period of a word U = U1 :::U n if and only if for all 1 i n k we have Ui = Ui+k. It is easy to see that k is a period of U if and only if the prex B of U of length n k is a border of U or is empty. Let (U) denote the smallest period …

CombinatoricsMorphismConjectureIntegerGeneral MathematicsSturmian wordAlphabetSoftwareWord (group theory)Computer Science ApplicationsMathematics
researchProduct

Dictionary-symbolwise flexible parsing

2012

AbstractLinear-time optimal parsing algorithms are rare in the dictionary-based branch of the data compression theory. A recent result is the Flexible Parsing algorithm of Matias and Sahinalp (1999) that works when the dictionary is prefix closed and the encoding of dictionary pointers has a constant cost. We present the Dictionary-Symbolwise Flexible Parsing algorithm that is optimal for prefix-closed dictionaries and any symbolwise compressor under some natural hypothesis. In the case of LZ78-like algorithms with variable costs and any, linear as usual, symbolwise compressor we show how to implement our parsing algorithm in linear time. In the case of LZ77-like dictionaries and any symbol…

Theoretical computer scienceComputer science[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS][INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]Data_CODINGANDINFORMATIONTHEORY0102 computer and information sciences02 engineering and technologycomputer.software_genre01 natural sciencesDirected acyclic graphTheoretical Computer ScienceConstant (computer programming)020204 information systemsEncoding (memory)Optimal parsing0202 electrical engineering electronic engineering information engineeringDiscrete Mathematics and CombinatoricsStringologySymbolwise text compressionTime complexityLossless compressionParsingSettore INF/01 - InformaticaDictionary-based compressionOptimal Parsing Lossless Data Compression DAGDirected acyclic graphPrefixComputational Theory and MathematicsText compression010201 computation theory & mathematicsAlgorithmcomputerBottom-up parsingData compressionJournal of Discrete Algorithms
researchProduct

A NEW COMPLEXITY FUNCTION FOR WORDS BASED ON PERIODICITY

2013

Motivated by the extension of the critical factorization theorem to infinite words, we study the (local) periodicity function, i.e. the function that, for any position in a word, gives the size of the shortest square centered in that position. We prove that this function characterizes any binary word up to exchange of letters. We then introduce a new complexity function for words (the periodicity complexity) that, for any position in the word, gives the average value of the periodicity function up to that position. The new complexity function is independent from the other commonly used complexity measures as, for instance, the factor complexity. Indeed, whereas any infinite word with bound…

Average-case complexityDiscrete mathematicsFibonacci numberSettore INF/01 - InformaticaGeneral Mathematicscomplexity functionComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Function (mathematics)periodicitycritical factorization theoremCombinatoricsComplexity indexCombinatorics on wordsBounded functionComplexity functionComputer Science::Formal Languages and Automata TheoryWord (computer architecture)Combinatorics on wordMathematicsInternational Journal of Algebra and Computation
researchProduct

Minimal forbidden words and factor automata

1998

International audience; Let L(M) be the (factorial) language avoiding a given antifactorial language M. We design an automaton accepting L(M) and built from the language M. The construction is eff ective if M is finite. If M is the set of minimal forbidden words of a single word v, the automaton turns out to be the factor automaton of v (the minimal automaton accepting the set of factors of v). We also give an algorithm that builds the trie of M from the factor automaton of a single word. It yields a non-trivial upper bound on the number of minimal forbidden words of a word.

TheoryofComputation_COMPUTATIONBYABSTRACTDEVICESfailure functionfactor code[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Büchi automatonComputerApplications_COMPUTERSINOTHERSYSTEMS[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciencesavoiding a wordω-automaton01 natural sciencesfactorial languageReversible cellular automatonCombinatoricsDeterministic automatonanti-factorial languageNondeterministic finite automaton0101 mathematicsMathematicsfactor automatonPowerset constructionLevenshtein automaton010102 general mathematicsforbidden wordComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)16. Peace & justiceNonlinear Sciences::Cellular Automata and Lattice GasesTheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES010201 computation theory & mathematicsProbabilistic automatonPhysics::Accelerator PhysicsComputer Science::Programming LanguagesHigh Energy Physics::ExperimentComputer Science::Formal Languages and Automata Theory
researchProduct

"Indexing structures for approximate string matching

2003

In this paper we give the first, to our knowledge, structures and corresponding algorithms for approximate indexing, by considering the Hamming distance, having the following properties. i) Their size is linear times a polylog of the size of the text on average. ii) For each pattern x, the time spent by our algorithms for finding the list occ(x) of all occurrences of a pattern x in the text, up to a certain distance, is proportional on average to |x| + |occ(x)|, under an additional but realistic hypothesis.

CombinatoricsCombinatorics on wordsPattern recognition (psychology)Search engine indexingAutomata theoryHamming distanceString searching algorithmApproximate string matchingTime complexityMathematics
researchProduct

A trie-based approach for compacting automata

2004

International audience; We describe a new technique for reducing the number of nodes and symbols in automata based on tries. The technique stems from some results on anti-dictionaries for data compression and does not need to retain the input string, differently from other methods based on compact automata. The net effect is that of obtaining a lighter automaton than the directed acyclic word graph (DAWG) of Blumer et al., as it uses less nodes, still with arcs labeled by single characters.

automataComputer scienceSuffix tree[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]suffix tree0102 computer and information sciences02 engineering and technologyω-automaton01 natural sciencesindex text compressionlaw.inventionlawfactor and suffixTrie0202 electrical engineering electronic engineering information engineeringAutomata and formal languagesPattern matchingDirected acyclic word graphString (computer science)Directed graphDirected acyclic graphMobile automatonAutomaton010201 computation theory & mathematics020201 artificial intelligence & image processingAlgorithmComputer Science::Formal Languages and Automata Theory
researchProduct

If a DOL language is k-power free then it is circular

1993

We prove that if a DOL language is k-power free then it is circular. By using this result we are able to give an algorithm which decides whether, fixed an integer k≥1, a DOL language is k-power free; we are also able to give a new simpler proof of a result, previously obtained by Ehrenfeucht and Rozenberg, that states that it is decidable whether a DOL language is k-power free for some integer k≥1.

Discrete mathematicsPigeonhole principleFractional powerMathematicsInteger (computer science)Power (physics)Decidability
researchProduct

The Expressibility of Languages and Relations by Word Equations

1997

Classically, several properties and relations of words, such as being a power of a same word, can be expressed by using word equations. This paper is devoted to study in general the expressive power of word equations. As main results we prove theorems which allow us to show that certain properties of words are not expressible as components of solutions of word equations. In particular, the primitiveness and the equal length are such properties, as well as being any word over a proper subalphabet.

business.industryComputer scienceFormal languageComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Artificial intelligenceArithmeticbusinesscomputer.software_genrecomputerComputer Science::Formal Languages and Automata TheoryNatural language processingWord (computer architecture)
researchProduct

Abelian-Square-Rich Words

2017

An abelian square is the concatenation of two words that are anagrams of one another. A word of length $n$ can contain at most $\Theta(n^2)$ distinct factors, and there exist words of length $n$ containing $\Theta(n^2)$ distinct abelian-square factors, that is, distinct factors that are abelian squares. This motivates us to study infinite words such that the number of distinct abelian-square factors of length $n$ grows quadratically with $n$. More precisely, we say that an infinite word $w$ is {\it abelian-square-rich} if, for every $n$, every factor of $w$ of length $n$ contains, on average, a number of distinct abelian-square factors that is quadratic in $n$; and {\it uniformly abelian-sq…

FOS: Computer and information sciencesGeneral Computer ScienceDiscrete Mathematics (cs.DM)Formal Languages and Automata Theory (cs.FL)Abelian squareComputer Science - Formal Languages and Automata Theory0102 computer and information sciences02 engineering and technology68R1501 natural sciencesSquare (algebra)Theoretical Computer ScienceCombinatorics0202 electrical engineering electronic engineering information engineeringFOS: MathematicsMathematics - CombinatoricsAbelian groupQuotientMathematicsDiscrete mathematicsComputer Science (all)Sturmian wordSturmian wordFunction (mathematics)Thue–Morse word010201 computation theory & mathematicsBounded functionThue-Morse wordExponentAbelian square; Sturmian word; Thue-Morse word; Theoretical Computer Science; Computer Science (all)020201 artificial intelligence & image processingCombinatorics (math.CO)Word (group theory)Computer Science::Formal Languages and Automata TheoryComputer Science - Discrete Mathematics
researchProduct

STURMIAN WORDS AND AMBIGUOUS CONTEXT-FREE LANGUAGES

1990

If x is a rational number, 0&lt;x≤1, then A(x)c is a context-free language, where A(x) is the set of factors of the infinite Sturmian words with asymptotic density of 1’s smaller than or equal to x. We also prove a “gap” theorem i.e. A(x) can never be an unambiguous co-context-free language. The “gap” theorem is established by proving that the counting generating function of A(x) is transcendental. We show some links between Sturmian words, combinatorics and number theory.

CombinatoricsDiscrete mathematicsRational numberCombinatorics on wordsNumber theoryContext-free languageComputer Science (miscellaneous)Generating functionSturmian wordNatural densityTranscendental numberMathematicsInternational Journal of Foundations of Computer Science
researchProduct

On lazy representations and Sturmian graphs

2011

In this paper we establish a strong relationship between the set of lazy representations and the set of paths in a Sturmian graph associated with a real number α. We prove that for any non-negative integer i the unique path weighted i in the Sturmian graph associated with α represents the lazy representation of i in the Ostrowski numeration system associated with α. Moreover, we provide several properties of the representations of the natural integers in this numeration system.

Discrete mathematicsCombinatoricsOstrowski numerationIntegernumeration systems Sturmian graphs continued fractionsSettore INF/01 - InformaticaGraphMathematicsReal number
researchProduct

On the number of factors of Sturmian words

1991

Abstract We prove that for m ⩾1, card( A m ) = 1+∑ m i =1 ( m − i +1) ϕ ( i ) where A m is the set of factors of length m of all the Sturmian words and ϕ is the Euler function. This result was conjectured by Dulucq and Gouyou-Beauchamps (1987) who proved that this result implies that the language (∪ m ⩾0 A m ) c is inherently ambiguous. We also give a combinatorial version of the Riemann hypothesis.

Set (abstract data type)Euler functionCombinatoricssymbols.namesakeRiemann hypothesisGeneral Computer ScienceSturmian wordsymbolsComputer Science(all)Theoretical Computer ScienceMathematicsTheoretical Computer Science
researchProduct

From Nerode's congruence to Suffix Automata with mismatches

2009

AbstractIn this paper we focus on the minimal deterministic finite automaton Sk that recognizes the set of suffixes of a word w up to k errors. As first result we give a characterization of the Nerode’s right-invariant congruence that is associated with Sk. This result generalizes the classical characterization described in [A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. Chen, J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoretical Computer Science, 40, 1985, 31–55]. As second result we present an algorithm that makes use of Sk to accept in an efficient way the language of all suffixes of w up to k errors in every window of size r of a text, where r is the…

General Computer ScienceOpen problem[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciences02 engineering and technologyString searching algorithm01 natural sciencesTheoretical Computer ScienceCombinatoricsDeterministic automatonSuffix automata0202 electrical engineering electronic engineering information engineeringCombinatorics on words Indexing Suffix Automata Languages with mismatches Approximate string matchingMathematicsDiscrete mathematicsCombinatorics on wordsApproximate string matchingSettore INF/01 - InformaticaLanguages with mismatchesComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)PrefixCombinatorics on wordsDeterministic finite automaton010201 computation theory & mathematicsSuffix automatonIndexing020201 artificial intelligence & image processingSuffixComputer Science::Formal Languages and Automata TheoryComputer Science(all)
researchProduct

Linear-size suffix tries

2016

Suffix trees are highly regarded data structures for text indexing and string algorithms [MCreight 76, Weiner 73]. For any given string w of length n = | w | , a suffix tree for w takes O ( n ) nodes and links. It is often presented as a compacted version of a suffix trie for w, where the latter is the trie (or digital search tree) built on the suffixes of w. Here the compaction process replaces each maximal chain of unary nodes with a single arc. For this, the suffix tree requires that the labels of its arcs are substrings encoded as pointers to w (or equivalent information). On the contrary, the arcs of the suffix trie are labeled by single symbols but there can be Θ ( n 2 ) nodes and lin…

Compressed suffix arrayGeneral Computer ScienceSuffix tree[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Generalized suffix tree0102 computer and information sciences02 engineering and technologyData_CODINGANDINFORMATIONTHEORYText indexing01 natural sciencesY-fast trielaw.inventionLongest common substring problemTheoretical Computer ScienceCombinatoricsSuffix treelawFactor and suffix automata0202 electrical engineering electronic engineering information engineeringData_FILESArithmeticFactor and suffix automata; Pattern matching; Suffix tree; Text indexing; Theoretical Computer Science; Computer Science (all)Pattern matchingMathematicsSettore INF/01 - InformaticaX-fast trieComputer Science (all)LCP array010201 computation theory & mathematics020201 artificial intelligence & image processingFM-index
researchProduct

On Sturmian Graphs

2007

AbstractIn this paper we define Sturmian graphs and we prove that all of them have a certain “counting” property. We show deep connections between this counting property and two conjectures, by Moser and by Zaremba, on the continued fraction expansion of real numbers. These graphs turn out to be the underlying graphs of compact directed acyclic word graphs of central Sturmian words. In order to prove this result, we give a characterization of the maximal repeats of central Sturmian words. We show also that, in analogy with the case of Sturmian words, these graphs converge to infinite ones.

Discrete mathematicsApplied MathematicsCDAWGsContinued fractionsSturmian wordSturmian wordsCharacterization (mathematics)RepeatsDirected acyclic graphCombinatoricsIndifference graphSturmian words CDAWGs Continued fractions RepeatsChordal graphComputer Science::Discrete MathematicsDiscrete Mathematics and CombinatoricsContinued fractionWord (group theory)Computer Science::Formal Languages and Automata TheoryReal numberMathematics
researchProduct

Generalizations of the periodicity Theorem of Fine and Wilf

2005

We provide three generalizations to the two-dimensional case of the well known periodicity theorem by Fine and Wilf [4] for strings (the one-dimensional case). The first and the second generalizations can be further extended to hold in the more general setting of Cayley graphs of groups. Weak forms of two of our results have been developed for the design of efficient algorithms for two-dimensional pattern matching [2, 3, 6].

Normal subgroupDiscrete mathematicsCombinatoricsVertex-transitive graphCayley graphEfficient algorithmPattern matchingMathematics
researchProduct

Sturmian graphs and integer representations over numeration systems

2012

AbstractIn this paper we consider a numeration system, originally due to Ostrowski, based on the continued fraction expansion of a real number α. We prove that this system has deep connections with the Sturmian graph associated with α. We provide several properties of the representations of the natural integers in this system. In particular, we prove that the set of lazy representations of the natural integers in this numeration system is regular if and only if the continued fraction expansion of α is eventually periodic. The main result of the paper is that for any number i the unique path weighted i in the Sturmian graph associated with α represents the lazy representation of i in the Ost…

Discrete mathematicsContinued fractionsApplied MathematicsNumeration systemsSturmian graphsGraphCombinatoricsOstrowski numerationIntegerIf and only ifnumeration systems Sturmian graphs continued fractions.Numeration systems; SUBWORD GRAPHS; WORDSDiscrete Mathematics and CombinatoricsSUBWORD GRAPHSContinued fractionWORDSMathematicsReal number
researchProduct

On the suffix automaton with mismatches

2007

International audience; In this paper we focus on the construction of the minimal deterministic finite automaton S_k that recognizes the set of suffixes of a word w up to k errors. We present an algorithm that makes use of S_k in order to accept in an efficient way the language of all suffixes of w up to k errors in every window of size r, where r is the value of the repetition index of w. Moreover, we give some experimental results on some well-known words, like prefixes of Fibonacci and Thue-Morse words, and we make a conjecture on the size of the suffix automaton with mismatches.

approximate string matchingFibonacci numberlanguages with mismatches[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Generalized suffix treeBüchi automatonComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)0102 computer and information sciences02 engineering and technology01 natural sciencesCombinatoricsPrefixCombinatorics on wordsDeterministic finite automaton010201 computation theory & mathematics0202 electrical engineering electronic engineering information engineeringSuffix automaton020201 artificial intelligence & image processingsuffix automatacombinatorics on wordsComputer Science::Data Structures and Algorithmscombinatorics on words suffix automata languages with mismatches approximate string matchingWord (computer architecture)Computer Science::Formal Languages and Automata TheoryMathematics
researchProduct

Forbidden words in symbolic dynamics

2000

AbstractWe introduce an equivalence relation≃between functions from N to N. By describing a symbolic dynamical system in terms of forbidden words, we prove that the≃-equivalence class of the function that counts the minimal forbidden words of a system is a topological invariant of the system. We show that the new invariant is independent from previous ones, but it is not characteristic. In the case of sofic systems, we prove that the≃-equivalence of the corresponding functions is a decidable question. As a more special application, we show, by using the new invariant, that two systems associated to Sturmian words having “different slope” are not conjugate.

Discrete mathematicsApplied Mathematicsautomata and formal languages010102 general mathematics[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Symbolic dynamics[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciencesFunction (mathematics)16. Peace & justice01 natural sciencesDecidabilitysymbolic dynamics010201 computation theory & mathematicsEquivalence relationcombinatoric on words0101 mathematicsInvariant (mathematics)Dynamical system (definition)Equivalence (measure theory)Computer Science::Formal Languages and Automata TheoryWord (group theory)ComputingMilieux_MISCELLANEOUSMathematics
researchProduct

Words and forbidden factors

2002

AbstractGiven a finite or infinite word v, we consider the set M(v) of minimal forbidden factors of v. We show that the set M(v) is of fundamental importance in determining the structure of the word v. In the case of a finite word w we consider two parameters that are related to the size of M(w): the first counts the minimal forbidden factors of w and the second gives the length of the longest minimal forbidden factor of w. We derive sharp upper and lower bounds for both parameters. We prove also that the second parameter is related to the minimal period of the word w. We are further interested to the algorithmic point of view. Indeed, we design linear time algorithm for the following two p…

CombinatoricsGeneral Computer ScienceGeneral problemFree monoidFormal languageSturmian wordWord problem (mathematics)AutomorphismTime complexityUpper and lower boundsMathematicsTheoretical Computer ScienceComputer Science(all)Theoretical Computer Science
researchProduct

On a Conjecture on Bidimensional Words

2003

We prove that, given a double sequence w over the alphabet A (i.e. a mapping from Z2 to A), if there exists a pair (n0, m0) ∈ Z2 such that pw(n0, m0) < 1/100n0m0, then w has a periodicity vector, where pw is the complexity function in rectangles of w.

Discrete mathematicsConjectureGeneral Computer ScienceExistential quantificationTheoretical Computer ScienceCombinatoricsCombinatorics on wordsFormal languageComplexity functionPattern matchingAlphabetDouble sequenceComputer Science(all)Mathematics
researchProduct

Abelian Powers and Repetitions in Sturmian Words

2016

Richomme, Saari and Zamboni (J. Lond. Math. Soc. 83: 79-95, 2011) proved that at every position of a Sturmian word starts an abelian power of exponent $k$ for every $k > 0$. We improve on this result by studying the maximum exponents of abelian powers and abelian repetitions (an abelian repetition is an analogue of a fractional power) in Sturmian words. We give a formula for computing the maximum exponent of an abelian power of abelian period $m$ starting at a given position in any Sturmian word of rotation angle $\alpha$. vAs an analogue of the critical exponent, we introduce the abelian critical exponent $A(s_\alpha)$ of a Sturmian word $s_\alpha$ of angle $\alpha$ as the quantity $A(s_\a…

FOS: Computer and information sciencesFibonacci numberGeneral Computer ScienceDiscrete Mathematics (cs.DM)Formal Languages and Automata Theory (cs.FL)[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Computer Science - Formal Languages and Automata Theory0102 computer and information sciences01 natural sciencesTheoretical Computer ScienceCombinatoricsFOS: MathematicsMathematics - Combinatorics[INFO]Computer Science [cs]Number Theory (math.NT)0101 mathematicsAbelian groupContinued fractionFibonacci wordComputingMilieux_MISCELLANEOUSQuotientMathematicsMathematics - Number Theoryta111010102 general mathematicsComputer Science (all)Sturmian wordSturmian wordAbelian period; Abelian power; Critical exponent; Lagrange constant; Sturmian word; Theoretical Computer Science; Computer Science (all)Abelian periodLagrange constantCritical exponentAbelian power010201 computation theory & mathematicsBounded functionExponentCombinatorics (math.CO)Computer Science::Formal Languages and Automata TheoryComputer Science - Discrete Mathematics
researchProduct

Sturmian Graphs and a conjecture of Moser

2004

In this paper we define Sturmian graphs and we prove that all of them have a “counting” property. We show deep connections between this counting property and two conjectures, by Moser and by Zaremba, on the continued fraction expansion of real numbers. These graphs turn out to be the underlying graphs of CDAWGs of central Sturmian words. We show also that, analogously to the case of Sturmian words, these graphs converge to infinite ones.

Discrete mathematicsConjectureProperty (philosophy)Data structuresData structureCombinatoricsPhilosophy of languagecompressed suffixComputer Science::Discrete MathematicsContinued fractionComputer Science::Formal Languages and Automata TheoryAlgorithmsReal numberMathematics
researchProduct

On Fine and Wilf's theorem for bidimensional words

2003

AbstractGeneralizations of Fine and Wilf's Periodicity Theorem are obtained for the case of bidimensional words using geometric arguments. The domains considered constitute a large class of convex subsets of R2 which include most parallelograms. A complete discussion is provided for the parallelogram case.

CombinatoricsLarge classDiscrete mathematicsGeneral Computer ScienceGeneralizationRegular polygonParallelogramWord (group theory)MathematicsTheoretical Computer ScienceComputer Science(all)Theoretical Computer Science
researchProduct

Automated Synthesis of Application-layer Connectors from Automata-based Specifications

2019

Abstract Ubiquitous and Pervasive Computing, and the Internet of Things, promote dynamic interaction among heterogeneous systems. To achieve this vision, interoperability among heterogeneous systems represents a key enabler, and mediators are often built to solve protocol mismatches. Many approaches propose the synthesis of mediators. Unfortunately, a rigorous characterization of the concept of interoperability is still lacking, hence making hard to assess their applicability and soundness. In this paper, we provide a framework for the synthesis of mediators that allows us to: (i) characterize the conditions for the mediator existence and correctness; and (ii) establish the applicability bo…

CorrectnessUbiquitous computingGeneral Computer ScienceComputer Networks and CommunicationsComputer scienceDistributed computingInteroperability0102 computer and information sciences02 engineering and technology01 natural sciencesHeterogeneous ApplicationsTheoretical Computer Science020204 information systems0202 electrical engineering electronic engineering information engineeringProtocol MismatchesCommunication & CoordinationProtocol (object-oriented programming)Automated Mediator SynthesisSoundnessApplied MathematicsAutomated Mediator Synthesis Interoperability Protocols Heterogeneous Applications Communication & Coordination Protocol MismatchesInteroperabilityApplication layerAutomatonComputational Theory and Mathematics010201 computation theory & mathematicsKey (cryptography)Protocols
researchProduct

A multidimensional critical factorization theorem

2005

AbstractThe Critical Factorization Theorem is one of the principal results in combinatorics on words. It relates local periodicities of a word to its global periodicity. In this paper we give a multidimensional extension of it. More precisely, we give a new proof of the Critical Factorization Theorem, but in a weak form, where the weakness is due to the fact that we loose the tightness of the local repetition order. In exchange, we gain the possibility of extending our proof to the multidimensional case. Indeed, this new proof makes use of the Theorem of Fine and Wilf, that has several classical generalizations to the multidimensional case.

PeriodicityGeneral Computer ScienceRepetition (rhetorical device)Combinatorics on wordsExtension (predicate logic)Bruck–Ryser–Chowla theoremTheoretical Computer ScienceAlgebrasymbols.namesakeCombinatorics on wordsFactorizationMultidimensional wordsWeierstrass factorization theoremsymbolsOrder (group theory)Word (computer architecture)MathematicsComputer Science(all)Theoretical Computer Science
researchProduct

Languages with mismatches and an application to approximate indexing

2005

In this paper we describe a factorial language, denoted by L(S, k,r), that contains all words that occur in a string 5 up to k mismatches every r symbols. Then we give some combinatorial properties of a parameter, called repetition index and denoted by R(S,k,r), defined as the smallest integer h ? 1 such that all strings of this length occur at most in a unique position of the text S up to k mismatches every r symbols. We prove that R(S, k, r) is a non-increasing function of r and a non-decreasing function of k and that the equation r = R(S, k, r) admits a unique solution. The repetition index plays an important role in the construction of an indexing data structure based on a trie that rep…

FactorialCombinatorics on wordsString (computer science)Function (mathematics)formal languagesmatching indexingCombinatoricsCombinatorics on wordsIntegerapproximate stringPosition (vector)TrieAlgorithmWord (group theory)Mathematics
researchProduct

Word assembly through minimal forbidden words

2006

AbstractWe give a linear-time algorithm to reconstruct a finite word w over a finite alphabet A of constant size starting from a finite set of factors of w verifying a suitable hypothesis. We use combinatorics techniques based on the minimal forbidden words, which have been introduced in previous papers. This improves a previous algorithm which worked under the assumption of stronger hypothesis.

General Computer ScienceFragment assemblyFactor automaton[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS][INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciences02 engineering and technology01 natural sciencesMinimal forbidden wordTheoretical Computer ScienceCombinatorics0202 electrical engineering electronic engineering information engineeringFinite setComputingMilieux_MISCELLANEOUSCombinatorics on wordMathematicsShortest superstringCombinatorics on wordsRepetition index16. Peace & justice010201 computation theory & mathematics020201 artificial intelligence & image processingAlphabetConstant (mathematics)Word (computer architecture)Computer Science::Formal Languages and Automata TheoryComputer Science(all)
researchProduct

Minimal forbidden words and symbolic dynamics

1996

We introduce a new complexity measure of a factorial formal language L: the growth rate of the set of minimal forbidden words. We prove some combinatorial properties of minimal forbidden words. As main result we prove that the growth rate of the set of minimal forbidden words for L is a topological invariant of the dynamical system defined by L.

Discrete mathematicsFactorial010102 general mathematics[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Symbolic dynamicsComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]0102 computer and information sciencesInvariant (physics)16. Peace & justice01 natural sciencesCombinatorics010201 computation theory & mathematicsTheoryofComputation_LOGICSANDMEANINGSOFPROGRAMSInformation complexityFormal language0101 mathematicsComputer Science::Formal Languages and Automata TheoryComputingMilieux_MISCELLANEOUSMathematicsofComputing_DISCRETEMATHEMATICSMathematics
researchProduct

Fine and Wilf's Theorem for Three periods and a Generalization of Sturmian Words

1999

AbstractWe extend the theorem of Fine and Wilf to words having three periods. We then define the set 3-PER of words of maximal length for which such result does not apply. We prove that the set 3-PER and the sequences of complexity 2n + 1, introduced by Arnoux and Rauzy to generalize Sturmian words, have the same set of factors.

Discrete mathematicsPeriodicityEuclid's algorithmCombinatorics on wordsGeneral Computer ScienceGeneralizationSturmian wordSturmian wordsTheoretical Computer ScienceCombinatoricsSet (abstract data type)Combinatorics on wordsWord lengthComputer Science(all)Mathematics
researchProduct

On the longest common factor problem

2008

The Longest Common Factor (LCF) of a set of strings is a well studied problem having a wide range of applications in Bioinformatics: from microarrays to DNA sequences analysis. This problem has been solved by Hui (2000) who uses a famous constant-time solution to the Lowest Common Ancestor (LCA) problem in trees coupled with use of suffix trees. A data structure for the LCA problem, although linear in space and construction time, introduces a multiplicative constant in both space and time that reduces the range of applications in many biological applications. In this article we present a new method for solving the LCF problem using the suffix tree structure with an auxiliary array that take…

Discrete mathematicsSettore INF/01 - InformaticaSuffix tree[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS]Generalized suffix treeDAWGsuffix tree[INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]Data structureLongest common substring problemlaw.inventionCombinatoricsSet (abstract data type)Range (mathematics)lawLongest Common Factor ProblemSuffixLowest common ancestorMathematics
researchProduct

Forbidden Factors and Fragment Assembly

2002

In this paper we approach the fragment assembly problem by using the notion of minimal forbidden factors introduced in previous paper. Denoting by M(w) the set of minimal forbidden factors of a word w, we first focus on the evaluation of the size of elements in M(w) and on designing of an algorithm to recover the word w from M(w). Actually we prove that for a word w randomly generated by a memoryless source with identical symbol probabilities, the maximal length m(w) of words in M(w) is logarithmic and that the reconstruction algorithm runs in linear time. These results have an interesting application to the fragment assembly problem, i.e. reconstruct a word w from a given set I of substrin…

Set (abstract data type)CombinatoricsLogarithmFragment (logic)Reconstruction algorithmFocus (optics)AlgorithmTime complexitySubstringWord (computer architecture)Mathematics
researchProduct

Abelian Repetitions in Sturmian Words

2012

We investigate abelian repetitions in Sturmian words. We exploit a bijection between factors of Sturmian words and subintervals of the unitary segment that allows us to study the periods of abelian repetitions by using classical results of elementary Number Theory. We prove that in any Sturmian word the superior limit of the ratio between the maximal exponent of an abelian repetition of period $m$ and $m$ is a number $\geq\sqrt{5}$, and the equality holds for the Fibonacci infinite word. We further prove that the longest prefix of the Fibonacci infinite word that is an abelian repetition of period $F_j$, $j&gt;1$, has length $F_j(F_{j+1}+F_{j-1} +1)-2$ if $j$ is even or $F_j(F_{j+1}+F_{j-1}…

FOS: Computer and information sciencesFibonacci numberDiscrete Mathematics (cs.DM)Formal Languages and Automata Theory (cs.FL)Computer Science - Formal Languages and Automata TheoryG.2.168R15FOS: MathematicsCombinatorics on words Sturmian wordMathematics - CombinatoricsAbelian groupFibonacci wordMathematicsDiscrete mathematicsMathematics::CombinatoricsSturmian wordCombinatorics on wordsNumber theoryF.2.2; F.4.3; G.2.1F.4.3ExponentCombinatorics (math.CO)F.2.2Word (group theory)Computer Science::Formal Languages and Automata TheoryComputer Science - Discrete Mathematics
researchProduct

Characteristic Sturmian words are extremal for the Critical Factorization Theorem

2012

We prove that characteristic Sturmian words are extremal for the Critical Factorization Theorem (CFT) in the following sense. If p x ( n ) denotes the local period of an infinite word x at point n , we prove that x is a characteristic Sturmian word if and only if p x ( n ) is smaller than or equal to n + 1 for all n ≥ 1 and it is equal to n + 1 for infinitely many integers n . This result is extremal with respect to the \{CFT\} since a consequence of the \{CFT\} is that, for any infinite recurrent word x, either the function p x is bounded, and in such a case x is periodic, or p x ( n ) ≥ n + 1 for infinitely many integers n . As a byproduct of the techniques used in the paper we extend a r…

Critical Factorization TheoremDiscrete mathematicsPeriodicitySettore INF/01 - InformaticaCombinatorics on wordsGeneral Computer ScienceSturmian wordSturmian wordsFunction (mathematics)Critical point (mathematics)Theoretical Computer ScienceCombinatoricsCombinatorics on wordssymbols.namesakeBounded functionWeierstrass factorization theoremsymbolsFibonacci wordWord (group theory)MathematicsComputer Science(all)Theoretical Computer Science
researchProduct

On the number of Arnoux–Rauzy words

2002

Discrete mathematicsAlgebraAlgebra and Number TheoryMathematicsActa Arithmetica
researchProduct

Minimal forbidden patterns of multi-dimensional shifts

2005

We study whether the entropy (or growth rate) of minimal forbidden patterns of symbolic dynamical shifts of dimension 2 or more, is a conjugacy invariant. We prove that the entropy of minimal forbidden patterns is a conjugacy invariant for uniformly semi-strongly irreducible shifts. We prove a weaker invariant in the general case.

General Mathematics[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS][INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS]020206 networking & telecommunications0102 computer and information sciences02 engineering and technology01 natural sciencesCombinatoricsConjugacy class010201 computation theory & mathematics0202 electrical engineering electronic engineering information engineeringMulti dimensionalComputingMilieux_MISCELLANEOUSMathematics
researchProduct

Words with the Maximum Number of Abelian Squares

2015

An abelian square is the concatenation of two words that are anagrams of one another. A word of length n can contain \(\varTheta (n^2)\) distinct factors that are abelian squares. We study infinite words such that the number of abelian square factors of length n grows quadratically with n.

Quadratic growthComputer Science (all)ConcatenationComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Computer Science (all); Theoretical Computer ScienceSquare (algebra)Theoretical Computer ScienceCombinatoricsAnagramsIrrational numberGolden ratioAbelian groupComputer Science::Formal Languages and Automata TheoryWord (group theory)Mathematics
researchProduct

Fragment assembly through minimal forbidden words

2004

researchProduct

On numeration systems and Sturmian graphs

2005

researchProduct

Approximate string matching: indexing and the k-mismatch problem

2004

researchProduct