On prefix normal words and prefix normal forms
Joe Sawada, Péter Burcsi, Zsuzsanna Lipták, Gabriele Fici, Frank Ruskey
subject
FOS: Computer and information sciences; Prefix code; Prefix normal word; Pre-necklace; Discrete Mathematics (cs.DM); General Computer Science; Formal Languages and Automata Theory (cs.FL); Binary number; Computer Science - Formal Languages and Automata Theory; Context (language use); Binary language; Lyndon words; 0102 computer and information sciences; 02 engineering and technology; Prefix grammar; prefix normal forms; Kraft's inequality; Characterization (mathematics); Lyndon word; 01 natural sciences; Prefix normal form; enumeration; Theoretical Computer Science; FOS: Mathematics; 0202 electrical engineering electronic engineering information engineering; Mathematics - Combinatorics; Mathematics; Discrete mathematics; binary jumbled pattern matching; Settore INF/01 - Informatica; Computer Science (all); pre-necklaces; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); prefix normal words; Prefix; 010201 computation theory & mathematics; 020201 artificial intelligence & image processing; Combinatorics (math.CO); binary languages; Computer Science::Formal Languages and Automata Theory; Word (group theory); Computer Science - Discrete Mathematics
description
A $1$-prefix normal word is a binary word with the property that no factor has more $1$s than the prefix of the same length; a $0$-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of $1$s and $0$s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We give enumeration results on $\textit{pnw}(n)$, the number of prefix normal words of length $n$, showing that, for sufficiently large $n$, \[ 2^{n-4 \sqrt{n \lg n}} \le \textit{pnw}(n) \le 2^{n - \lg n + 1}. \] For fixed density (number of $1$s), we show that the ordinary generating function of the number of prefix normal words of length $n$ and density $d$ is a rational function. Finally, we give experimental results on $\textit{pnw}(n)$, discuss further properties, and state open problems.
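The definition of a $1$-prefix normal word can be checked directly by brute force. The sketch below is only an illustration under that definition, not the method used in the paper: the function names `is_prefix_normal` and `pnw` are hypothetical, words are represented as Python strings over `"0"` and `"1"`, and the exhaustive count is exponential in $n$, so it is practical only for small lengths.

```python
from itertools import product


def is_prefix_normal(w: str) -> bool:
    """Return True if no factor of w has more 1s than the prefix of the same length.

    Direct check of the definition, using prefix sums of the number of 1s.
    """
    n = len(w)
    ones = [0]  # ones[i] = number of 1s in w[:i]
    for c in w:
        ones.append(ones[-1] + (c == "1"))
    for k in range(1, n + 1):            # factor length
        for i in range(n - k + 1):       # factor start position
            if ones[i + k] - ones[i] > ones[k]:
                return False
    return True


def pnw(n: int) -> int:
    """Count 1-prefix normal words of length n by exhaustive enumeration."""
    return sum(is_prefix_normal("".join(bits)) for bits in product("01", repeat=n))


if __name__ == "__main__":
    print(is_prefix_normal("1101"))  # True
    print(is_prefix_normal("1011"))  # False: factor "11" beats prefix "10"
    print([pnw(n) for n in range(1, 5)])  # [2, 3, 5, 8]
```

For example, `1011` fails the test because its factor `11` contains two $1$s while its length-2 prefix `10` contains only one. The stated bounds on $\textit{pnw}(n)$ hold for sufficiently large $n$, so this small enumeration is not meant to confirm them.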
year | journal |
---|---|
2016-11-28 | Theoretical Computer Science |