Search results for "Natural Language"
showing 10 items of 192 documents
Asymptotic bit frequency in Fibonacci words
2021
It is known that binary words containing no $k$ consecutive 1s are enumerated by $k$-step Fibonacci numbers. In this note we discuss the expected value of a random bit in a random word of length $n$ having this property.
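The enumeration claim in this abstract can be checked on small cases. The sketch below is not taken from the note: the function names are hypothetical, the recurrence uses the initial values a(m) = 2^m for m < k, and "expected value of a random bit" is read here as the average fraction of 1s over all valid words of length n, which is an assumption about the note's setup.

```python
from itertools import product


def count_no_k_ones(n: int, k: int) -> int:
    """Brute-force count of binary words of length n with no k consecutive 1s."""
    forbidden = "1" * k
    return sum(
        1 for bits in product("01", repeat=n) if forbidden not in "".join(bits)
    )


def k_step_fibonacci(n: int, k: int) -> int:
    """Same count via a(n) = a(n-1) + ... + a(n-k), with a(m) = 2**m for m < k."""
    a = [2 ** m for m in range(k)]
    for m in range(k, n + 1):
        a.append(sum(a[m - k:m]))
    return a[n]


def mean_bit_frequency(n: int, k: int) -> float:
    """Average fraction of 1s over all valid words of length n (n >= 1)."""
    forbidden = "1" * k
    total_ones = 0
    total_words = 0
    for bits in product("01", repeat=n):
        word = "".join(bits)
        if forbidden not in word:
            total_ones += word.count("1")
            total_words += 1
    return total_ones / (n * total_words)
```

Both counting functions are exponential or linear only in the obvious way; the brute-force versions are meant for small n as a sanity check, not for studying asymptotics.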
On the suffix automaton with mismatches
2007
In this paper we focus on the construction of the minimal deterministic finite automaton S_k that recognizes the set of suffixes of a word w up to k errors. We present an algorithm that uses S_k to efficiently accept the language of all suffixes of w up to k errors in every window of size r, where r is the value of the repetition index of w. Moreover, we give some experimental results on some well-known words, such as prefixes of Fibonacci and Thue-Morse words, and we make a conjecture on the size of the suffix automaton with mismatches.
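As a reference point for the language described in this abstract, here is a brute-force membership test. It is not the automaton construction from the paper, and it assumes that "errors" means mismatches (substitutions, i.e. Hamming distance), as the title suggests; the function name is hypothetical.

```python
def is_suffix_with_mismatches(u: str, w: str, k: int) -> bool:
    """Return True if u differs from the suffix of w of length len(u)
    in at most k positions (Hamming distance <= k)."""
    if len(u) > len(w):
        return False
    suffix = w[len(w) - len(u):]
    mismatches = sum(1 for a, b in zip(u, suffix) if a != b)
    return mismatches <= k
```

For example, with the Fibonacci-word prefix `abaababa`, the word `bba` is accepted for k = 1 (it differs from the suffix `aba` in one position) but rejected for k = 0.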
Pattern languages with and without erasing
1994
The paper deals with the problems related to finding a pattern common to all words in a given set. We restrict our attention to patterns expressible by the use of variables ranging over words. Two essentially different cases result, depending on whether or not the empty word belongs to the range. We investigate equivalence and inclusion problems, patterns descriptive for a set, as well as some complexity issues. The inclusion problem between two pattern languages turns out to be of fundamental theoretical importance because many problems in the classical combinatorics of words can be reduced to it.
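The difference between the erasing and non-erasing cases can be illustrated with a small backtracking matcher. This is only a sketch under assumed conventions that do not come from the paper: a pattern is a string in which uppercase letters are variables and lowercase letters are terminals, and `matches` is a hypothetical helper name. The search is exponential in the worst case.

```python
def matches(pattern: str, word: str, erasing: bool = False, binding=None) -> bool:
    """Decide whether word is obtained from pattern by substituting a word for
    each variable (uppercase letter), consistently across occurrences.
    With erasing=True a variable may be replaced by the empty word."""
    binding = {} if binding is None else binding
    if not pattern:
        return word == ""
    head, rest = pattern[0], pattern[1:]
    if head.islower():  # terminal letter: must match literally
        return word.startswith(head) and matches(rest, word[1:], erasing, binding)
    if head in binding:  # variable already bound: reuse its value
        value = binding[head]
        return word.startswith(value) and matches(
            rest, word[len(value):], erasing, binding
        )
    shortest = 0 if erasing else 1
    for length in range(shortest, len(word) + 1):
        binding[head] = word[:length]  # try this substitution
        if matches(rest, word[length:], erasing, binding):
            return True
        del binding[head]  # undo and try a longer value
    return False
```

With the pattern `XaY`, the word `a` belongs to the erasing language (both variables map to the empty word) but not to the non-erasing one, which shows how the two cases diverge.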
The Expressibility of Languages and Relations by Word Equations
1997
Classically, several properties and relations of words, such as being powers of the same word, can be expressed by using word equations. This paper is devoted to a general study of the expressive power of word equations. As main results we prove theorems which allow us to show that certain properties of words are not expressible as components of solutions of word equations. In particular, primitiveness and equal length are such properties, as is being a word over a proper subalphabet.
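The classical example mentioned in this abstract is the equation xy = yx, whose solutions are exactly the pairs of words that are powers of a common word. The sketch below checks this equivalence exhaustively on short words; the function names are hypothetical and the check is an illustration, not a proof.

```python
from itertools import product
from math import gcd


def common_root(x: str, y: str):
    """Return a word z such that x and y are both powers of z, or None.
    If such a z exists, the prefix of length gcd(len(x), len(y)) works."""
    g = gcd(len(x), len(y))
    if g == 0:  # both words are empty
        return ""
    z = (x + y)[:g]
    if z * (len(x) // g) == x and z * (len(y) // g) == y:
        return z
    return None


def commute_iff_common_root(max_len: int = 4, alphabet: str = "ab") -> bool:
    """Check on all nonempty words up to max_len that xy == yx holds
    exactly when x and y are powers of a common word."""
    words = [
        "".join(p)
        for n in range(1, max_len + 1)
        for p in product(alphabet, repeat=n)
    ]
    return all(
        (x + y == y + x) == (common_root(x, y) is not None)
        for x in words
        for y in words
    )
```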
Some applications of a theorem of Shirshov to language theory
1983
Some applications of a theorem of Shirshov to language theory are given: characterization of regular languages, characterization of bounded languages, and a sufficient condition for a language to be Parikh-bounded.
A word prediction methodology for automatic sentence completion
2015
Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…
Balance properties and distribution of squares in circular words
2010
We study balance properties of circular words over alphabets of size greater than two. We give some new characterizations of balanced words connected to the Kawasaki-Ising model and to the notion of derivative of a word. Moreover, we consider two different generalizations of the notion of balance, and we find some relations between them. Some of our results can be generalized to non-periodic infinite words as well.
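For readers unfamiliar with the notion, the following sketch tests the standard balance condition on a circular word: for every letter and every length, the number of occurrences of that letter in any two circular factors of that length differs by at most 1. The function name is hypothetical, and the generalizations studied in the paper are not covered.

```python
def is_circular_balanced(w: str) -> bool:
    """Check the balance condition over all circular factors of w."""
    n = len(w)
    doubled = w + w  # every circular factor of w is a factor of ww
    for length in range(1, n):
        for letter in set(w):
            counts = [doubled[i:i + length].count(letter) for i in range(n)]
            if max(counts) - min(counts) > 1:
                return False
    return True
```

For instance, `abac` passes the test, while `aabc` fails because its circular factors `aa` and `bc` contain two and zero occurrences of `a`.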
The Syllogistic with Unity
2011
We extend the language of the classical syllogisms with the sentence-forms “At most 1 p is a q” and “More than 1 p is a q”. We show that the resulting logic does not admit a finite set of syllogism-like rules whose associated derivation relation is sound and complete, even when reductio ad absurdum is allowed.
Where lol is: function and position of lol used as a discourse marker in YouTube comments
2020
Lol is probably one of the most popular words in computer-mediated communication. It is generally taken to be the acronym of “laughing out loud”, but it is not always used to indicate a humorous response; rather, it is multifunctional. Drawing on previous studies of the different functions of lol, this paper explores a possible correlation between the position and function of non-lexicalized lol in the specific context of YouTube comments. The hypothesis is that the function of lol largely depends on its position: clause-initial lol is not used with the same functions as clause-final lol. The data for the study come from the comment threads of three popular YouTube videos posted in 2017, 20…
On-line construction of a small automaton for a finite set of words
2012
In this paper we describe a "light" algorithm for the on-line construction of a small automaton recognising a finite set of words. The algorithm runs in linear time. We obtained good experimental results on real dictionaries, on biological sequences, and on the sets of suffixes (resp. factors) of a set of words, which show that our automaton is close to the minimal one. For the suffixes of a text, we propose a modified construction that leads to an even smaller automaton. We also give linear algorithms for the insertion and deletion of a word in a finite set, directly from the constructed automaton.
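To make the setting concrete, the sketch below builds the simplest deterministic automaton for a finite set of words, a trie, one word at a time. This is a baseline only and is not the "light" algorithm of the paper, which aims at an automaton much closer to the minimal one; the class and method names are hypothetical.

```python
class TrieAutomaton:
    """Deterministic automaton for a finite set of words, built on-line.
    State 0 is the initial state; each inserted word costs time linear
    in its length."""

    def __init__(self):
        self.transitions = [{}]  # transitions[state][letter] -> next state
        self.accepting = set()

    def insert(self, word: str) -> None:
        state = 0
        for letter in word:
            nxt = self.transitions[state].get(letter)
            if nxt is None:  # create a fresh state for an unseen prefix
                nxt = len(self.transitions)
                self.transitions.append({})
                self.transitions[state][letter] = nxt
            state = nxt
        self.accepting.add(state)

    def accepts(self, word: str) -> bool:
        state = 0
        for letter in word:
            state = self.transitions[state].get(letter)
            if state is None:
                return False
        return state in self.accepting

    def size(self) -> int:
        """Number of states, including the initial one."""
        return len(self.transitions)
```

A trie shares common prefixes but not common suffixes, so its size can be far from minimal; reducing that gap while keeping the construction on-line is what the abstract describes.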