Search results for "Natural Language"
showing 10 items of 192 documents
An LP-based hyperparameter optimization model for language modeling
2018
To find hyperparameters for a machine learning model, algorithms such as grid search or random search are run over the space of possible values of the model's hyperparameters. These search algorithms select the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find per…
Pattern statistics in faro words and permutations
2021
We study the distribution and the popularity of some patterns in $k$-ary faro words, i.e. words over the alphabet $\{1, 2, \ldots, k\}$ obtained by interlacing the letters of two nondecreasing words of lengths differing by at most one. We present a bijection between these words and dispersed Dyck paths (i.e. Motzkin paths with all level steps on the $x$-axis) with a given number of peaks. We show how the bijection maps statistics of consecutive patterns of faro words into linear combinations of other pattern statistics on paths. Then, we deduce enumerative results by providing multivariate generating functions for the distribution and the popularity of patterns of length at most three. Fina…
On prefix normal words and prefix normal forms
2016
A $1$-prefix normal word is a binary word with the property that no factor has more $1$s than the prefix of the same length; a $0$-prefix normal word is defined analogously. These words arise in the context of indexed binary jumbled pattern matching, where the aim is to decide whether a word has a factor with a given number of $1$s and $0$s (a given Parikh vector). Each binary word has an associated set of Parikh vectors of the factors of the word. Using prefix normal words, we provide a characterization of the equivalence class of binary words having the same set of Parikh vectors of their factors. We prove that the language of prefix normal words is not context-free and is strictly contai…
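The definition of a $1$-prefix normal word given above can be checked directly. The following is a minimal brute-force sketch, not taken from the paper: the function name `is_prefix_normal` is hypothetical, and the quadratic scan over all factors is chosen for clarity rather than efficiency.

```python
def is_prefix_normal(w: str) -> bool:
    """Return True if the binary word w is 1-prefix normal.

    w is 1-prefix normal when no factor (contiguous substring) contains
    more 1s than the prefix of w of the same length.
    Hypothetical helper for illustration; runs in O(n^2) time.
    """
    n = len(w)
    # ones[i] = number of 1s in w[:i]
    ones = [0]
    for c in w:
        ones.append(ones[-1] + (1 if c == "1" else 0))

    for m in range(1, n + 1):
        in_prefix = ones[m]
        # Compare every factor of length m against the prefix of length m.
        for i in range(n - m + 1):
            if ones[i + m] - ones[i] > in_prefix:
                return False
    return True
```

For instance, `is_prefix_normal("110100")` holds, while `is_prefix_normal("101101")` fails because the factor `11` has two 1s and the prefix `10` has only one.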
Open and Closed Prefixes of Sturmian Words
2013
A word is closed if it contains a proper factor that occurs both as a prefix and as a suffix but does not have internal occurrences, otherwise it is open. We deal with the sequence of open and closed prefixes of Sturmian words and prove that this sequence characterizes every finite or infinite Sturmian word up to isomorphisms of the alphabet. We then characterize the combinatorial structure of the sequence of open and closed prefixes of standard Sturmian words. We prove that every standard Sturmian word, after swapping its first letter, can be written as an infinite product of squares of reversed standard words.
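The closed/open distinction defined above can be tested with a short sketch. It is an illustration under stated assumptions, not the authors' method: the name `is_closed` is hypothetical, and words of length at most one are treated as closed by convention (an assumption, since the abstract does not address them). Only the longest border needs checking, because any shorter border reappears inside the word as a prefix of the longest border's suffix occurrence.

```python
def is_closed(w: str) -> bool:
    """Return True if w is closed, False if it is open.

    A word is closed if some proper factor occurs as both a prefix and a
    suffix and has no internal occurrences.
    Assumption: words of length <= 1 are treated as closed.
    """
    n = len(w)
    if n <= 1:
        return True  # convention assumed for this sketch

    # Look for the longest nonempty proper border (prefix that is also a suffix).
    for k in range(n - 1, 0, -1):
        u = w[:k]
        if w.endswith(u):
            # Count all (possibly overlapping) occurrences of u in w.
            occurrences = sum(1 for i in range(n - k + 1) if w.startswith(u, i))
            # Closed exactly when u occurs only as the prefix and the suffix.
            return occurrences == 2
    return False  # no nonempty border: the word is open
```

For example, `abab` is closed (its border `ab` occurs only at the two ends), while `abaa` is open (its only border `a` also occurs internally).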
Minimal forbidden factors of circular words
2017
Minimal forbidden factors are a useful tool for investigating properties of words and languages. Two factorial languages are distinct if and only if they have different (antifactorial) sets of minimal forbidden factors. There exist algorithms for computing the minimal forbidden factors of a word, as well as of a regular factorial language. Conversely, Crochemore et al. [IPL, 1998] gave an algorithm that, given the trie recognizing a finite antifactorial language $M$, computes a DFA recognizing the language whose set of minimal forbidden factors is $M$. In the same paper, they showed that the obtained DFA is minimal if the input trie recognizes the minimal forbidden factors of a single word.…
Recognizable Picture Languages
1992
The purpose of this paper is to propose a new notion of recognizability for picture (two-dimensional) languages extending the characterization of one-dimensional recognizable languages in terms of local languages and alphabetic mappings. We first introduce the family of local picture languages (denoted by LOC) and, in particular, prove the undecidability of the emptiness problem. Then we define the new family of recognizable picture languages (denoted by REC). We study some combinatorial and language theoretic properties of REC such as ambiguity, closure properties or undecidability results. Finally we compare the family REC with the classical families of languages recognized by four-way a…
From Nerode's congruence to Suffix Automata with mismatches
2009
In this paper we focus on the minimal deterministic finite automaton $S_k$ that recognizes the set of suffixes of a word $w$ up to $k$ errors. As a first result, we give a characterization of Nerode's right-invariant congruence associated with $S_k$. This result generalizes the classical characterization described in [A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M. Chen, J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoretical Computer Science, 40, 1985, 31–55]. As a second result, we present an algorithm that uses $S_k$ to efficiently accept the language of all suffixes of $w$ up to $k$ errors in every window of size $r$ of a text, where $r$ is the…
"Table 24" of "Search for magnetic monopoles and stable high-electric-charge objects in 13 TeV proton-proton collisions with the ATLAS detector"
2019
Total selection efficiency (i.e., the fraction of MC HECOs surviving the trigger and offline selection criteria) as a function of transverse kinetic energy $E^\text{kin}_\text{T}=E_\text{kin}\sin\theta$ and pseudorapidity $|\eta|$ for HECOs of charge $|z|=20$ of mass 1500 GeV.
"Table 47" of "Search for magnetic monopoles and stable high-electric-charge objects in 13 TeV proton-proton collisions with the ATLAS detector"
2019
Total selection efficiency (i.e., the fraction of MC HECOs surviving the trigger and offline selection criteria) as a function of transverse kinetic energy $E^\text{kin}_\text{T}=E_\text{kin}\sin\theta$ and pseudorapidity $|\eta|$ for HECOs of charge $|z|=80$ of mass 1000 GeV.
"Table 34" of "Search for magnetic monopoles and stable high-electric-charge objects in 13 TeV proton-proton collisions with the ATLAS detector"
2019
Total selection efficiency (i.e., the fraction of MC HECOs surviving the trigger and offline selection criteria) as a function of transverse kinetic energy $E^\text{kin}_\text{T}=E_\text{kin}\sin\theta$ and pseudorapidity $|\eta|$ for HECOs of charge $|z|=40$ of mass 2500 GeV.