Search results for "Natural language"
showing 10 items of 650 documents
Using discourse segmentation to account for the polyfunctionality of discourse markers: The case of well
2021
A large number of studies describe the many different functions of polyfunctional discourse markers like well in different contexts and from different theoretical perspectives. In the current paper, we propose to systematize the many different uses identified based on their position with respect to the discourse units they are associated with. Not only can previous findings on well be integrated into a single coherent representation of its uses and functions, but the positions with respect to the discourse units can also be associated with specific functions, thus shedding light on how the polyfunctionality of well is brought about.
On the Shuffle of Star-Free Languages
2012
Motivated by the general problem of characterizing families of languages closed under shuffle, we investigate some conditions under which the shuffle of two star-free languages is star-free. Some of the special cases considered here give rise to new problems in combinatorics on words.
Verbal sets and cyclic coverings
2010
We consider groups $G$ such that the set of all values of a fixed word $w$ in $G$ is covered by a finite set of cyclic subgroups. Fernandez-Alcober and Shumyatsky studied such groups in the case when $w$ is the word $[x_1, x_2]$, and proved that in this case the corresponding verbal subgroup $G'$ is either cyclic or finite. Answering a question asked by them, we show that this is far from being the general rule. However, we prove a weaker form of their result in the case when $w$ is either a lower commutator word or a non-commutator word, showing that in the given hypothesis the verbal subgroup $w(G)$ must be finite-by-cyclic. Even this weaker conclusion is not universally valid: it fails …
Combinatorics of Finite Words and Suffix Automata
2009
The suffix automaton of a finite word is the minimal deterministic automaton accepting the language of its suffixes. The states of the suffix automaton are the classes of an equivalence relation defined on the set of factors. We explore the relationship between the combinatorial properties of a finite word and the structural properties of its suffix automaton. We give formulas for expressing the total number of states and the total number of edges of the suffix automaton in terms of special factors of the word.
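The definition above can be tried out on small words. The following is a minimal sketch that uses the standard online construction with suffix links, which is not necessarily the approach taken in the paper; the names `build_suffix_automaton`, `accepts`, and `count_edges` are illustrative assumptions. It builds the automaton, marks the accepting states, and counts states and edges.

```python
def build_suffix_automaton(word):
    """Build the suffix automaton of `word` with the standard online algorithm.

    Returns (transitions, terminal): `transitions[s]` maps a letter to the
    next state, state 0 is initial, and `terminal` is the set of accepting states.
    """
    transitions = [{}]   # outgoing edges of each state
    link = [-1]          # suffix links
    length = [0]         # length of the longest factor reaching each state
    last = 0
    for ch in word:
        cur = len(transitions)
        transitions.append({})
        length.append(length[last] + 1)
        link.append(-1)
        p = last
        while p != -1 and ch not in transitions[p]:
            transitions[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = transitions[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q: the clone keeps the shorter factors.
                clone = len(transitions)
                transitions.append(dict(transitions[q]))
                length.append(length[p] + 1)
                link.append(link[q])
                while p != -1 and transitions[p].get(ch) == q:
                    transitions[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Accepting states: follow suffix links from the state of the whole word.
    terminal = set()
    p = last
    while p != -1:
        terminal.add(p)
        p = link[p]
    return transitions, terminal


def accepts(automaton, candidate):
    """Return True if `candidate` is a suffix of the word the automaton was built from."""
    transitions, terminal = automaton
    state = 0
    for ch in candidate:
        if ch not in transitions[state]:
            return False
        state = transitions[state][ch]
    return state in terminal


def count_edges(automaton):
    transitions, _ = automaton
    return sum(len(edges) for edges in transitions)
```

For example, `accepts(build_suffix_automaton("abcbc"), "cbc")` is true because "cbc" is a suffix of "abcbc", while the factor "ab" is read by the automaton but not accepted.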
Universal Lyndon Words
2014
A word w over an alphabet Σ is a Lyndon word if there exists an order defined on Σ for which w is lexicographically smaller than all of its conjugates (other than itself). We introduce and study universal Lyndon words, which are words over an n-letter alphabet that have length n! and such that all the conjugates are Lyndon words. We show that universal Lyndon words exist for every n and exhibit combinatorial and structural properties of these words. We then define particular prefix codes, which we call Hamiltonian lex-codes, and show that every Hamiltonian lex-code is in bijection with the set of the shortest unrepeated prefixes of the conjugates of a universal Lyndon word. This allows us t…
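The two definitions in this abstract can be checked by brute force on short words. The sketch below is only an illustration under that assumption (it tries all n! orders, so it is practical only for very small alphabets); the function names are hypothetical and the code is not taken from the paper. It assumes `alphabet` lists distinct letters and that the word uses only those letters.

```python
from itertools import permutations
from math import factorial


def conjugates(word):
    """All rotations of `word`, starting with the word itself."""
    return [word[i:] + word[:i] for i in range(len(word))]


def is_lyndon(word, order):
    """True if `word` is strictly smaller than every other rotation of itself
    in the lexicographic order induced by `order` (smallest letter first)."""
    rank = {letter: i for i, letter in enumerate(order)}

    def key(u):
        return [rank[c] for c in u]

    key_word = key(word)
    return all(key_word < key(word[i:] + word[:i]) for i in range(1, len(word)))


def is_universal_lyndon(word, alphabet):
    """True if `word` has length n! and each conjugate is a Lyndon word
    for at least one order on the n-letter `alphabet`."""
    if len(word) != factorial(len(alphabet)):
        return False
    orders = list(permutations(alphabet))
    return all(
        any(is_lyndon(conj, order) for order in orders)
        for conj in conjugates(word)
    )
```

For instance, `is_universal_lyndon("123132", "123")` holds: each of the six conjugates is the smallest rotation for one of the six orders on {1, 2, 3}. A non-primitive word such as "123123" fails, since it equals one of its own proper rotations.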
Minimal forbidden words and symbolic dynamics
1996
We introduce a new complexity measure of a factorial formal language L: the growth rate of the set of minimal forbidden words. We prove some combinatorial properties of minimal forbidden words. As our main result, we prove that the growth rate of the set of minimal forbidden words for L is a topological invariant of the dynamical system defined by L.
On the Star Height of Rational Languages
1994
Two problems concerning the star height of a rational language are investigated: the star height one problem and the relationships between the unambiguity of an expression and its star height. For this purpose we consider the class of factorial, transitive and rational (FTR) languages. From the algebraic point of view an FTR language is the set of factors of a rational submonoid M. Two subclasses of FTR languages are introduced: renewal languages, corresponding to the case of M finitely generated, and unambiguous renewal languages, corresponding to the case of M finitely generated and free. We prove that an FTR language has star height one if and only if it is renewal. This gives a simple de…
On block pumpable languages
2016
Ehrenfeucht, Parikh and Rozenberg gave an interesting characterisation of the regular languages called the block pumping property. When requiring this property only with respect to members of the language but not with respect to nonmembers, one gets the notion of block pumpable languages. It is shown that block pumpable languages are a more general concept than regular languages and that they are an interesting notion in their own right: they are closed under intersection, union and homomorphism by transducers; they admit multiple pumping; they have either polynomial or exponential growth.
Balancing and clustering of words in the Burrows–Wheeler transform
2011
Compression algorithms based on the Burrows–Wheeler transform (BWT) take advantage of the fact that the word output of BWT shows a local similarity and then turns out to be highly compressible. The aim of the present paper is to study such “clustering effect” by using notions and methods from Combinatorics on Words. The notion of balance of a word plays a central role in our investigation. Empirical observations suggest that balance is actually the combinatorial property of the input word that ensures optimal BWT compression. Moreover, it is reasonable to assume that the more balanced the input word is, the more local similarity we have after BWT (and therefore the better the compression is).…
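The clustering effect mentioned here can be observed on a toy input. The sketch below assumes the textbook definition of the transform (sort all rotations of the word and read off the last column, with no end-of-word sentinel) and uses the number of maximal runs of equal letters as a rough stand-in for local similarity; both function names are illustrative, and the quadratic rotation sort is only suitable for short words.

```python
def bwt(word):
    """Burrows-Wheeler transform by sorting all rotations (no sentinel)."""
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


def count_runs(word):
    """Number of maximal blocks of equal consecutive letters in `word`."""
    if not word:
        return 0
    return 1 + sum(1 for a, b in zip(word, word[1:]) if a != b)


if __name__ == "__main__":
    source = "banana"
    transformed = bwt(source)
    # Fewer runs in the output means more clustering of equal letters.
    print(transformed, count_runs(source), count_runs(transformed))
```

On "banana" the transform is "nnbaaa": the input has six runs of length one, while the output has only three runs.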
Automata and differentiable words
2011
We exhibit the construction of a deterministic automaton that, given k > 0, recognizes the (regular) language of k-differentiable words. Our approach follows a scheme of Crochemore et al. based on minimal forbidden words. We extend this construction to the case of $C^\infty$-words, i.e., words differentiable arbitrarily many times. We thus obtain an infinite automaton for representing the set of $C^\infty$-words. We derive a classification of $C^\infty$-words induced by the structure of the automaton. Then, we introduce a new framework for dealing with $\infty$-words, based on a three-letter alphabet. This allows us to define a compacted version of the automaton, that we use to prove that ev…