Search results for "Suffix"
showing 10 items of 75 documents
Suffixes, Conjugates and Lyndon Words
2013
In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by $\plex$ and $\pom$, that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the $\pom$ order between two primitive wo…
r-Indexing the eBWT
2021
The extended Burrows Wheeler Transform (\(\mathrm {eBWT}\)) was introduced by Mantaci et al. [TCS 2007] to extend the definition of the \(\mathrm {BWT}\) to a collection of strings. In our prior work [SPIRE 2021], we give a linear-time algorithm for the \(\mathrm {eBWT}\) that preserves the fundamental property of the original definition (i.e., the independence from the input order). The algorithm combines a modification of the Suffix Array Induced Sorting (SAIS) algorithm [IEEE Trans Comput 2011] with Prefix Free Parsing [AMB 2019; JCB 2020]. In this paper, we show how this construction algorithm leads to r-indexing the \(\mathrm {eBWT}\), i.e., run-length encoded \(\mathrm {eBWT}\) and \(…
Suffix Automata and Standard Sturmian Words
2007
Blumer et al. showed (cf. [3,2]) that the suffix automaton of a word w must have at least |w|+1 states and at most 2|w|-1 states. In this paper we characterize the language L of all binary words w whose minimal suffix automaton S(w) has exactly |w| + 1 states; they are precisely all prefixes of standard Sturmian words. In particular, we give an explicit construction of suffix automaton of words that are palindromic prefixes of standard words. Moreover, we establish a necessary and sufficient condition on S(w) which ensures that if w ∈ L and a ∈ {0, 1} then wa ∈ L. By using such a condition, we show how to construct the automaton S(wa) from S(w). More generally, we provide a simple construct…
Quantifying English and Polish Lolitas: A Corpus-Driven Stylistic Comparison
2013
The study presented in this article, which is a fragment of a larger study of translational and non- translational texts (Grabowski 2012), falls within the scope of descriptive translation studies (DTS) and corpus linguistics, with particular emphasis on the study of translation universals, on the example of English-original (written in 1955) and two independent Polish translations of the novel Lolita by V. Nabokov (by Stiller in 1991 and Klobukowski in 1997). According to Baker (1995: 243), universal features of translation or translation universals, constitute specific textual characteristics (e.g. lexical, grammatical or stylistic) typical of translated texts, irrespective of languages i…
La sufixació en la premsa valenciana del segle XIX
2018
El Mole, una de les primeres i més rellevants manifestacions periodístiques en llengua catalana, va partir d’un model lingüístic popular i modern, basat en la literatura popularista contemporània, que els redactors van procurar combinar amb un registre més formal segons el tipus d’article. L’actualitat del periòdic els duu a reflectir el llenguatge de l’època, però també a practicar la formació de noves paraules a través dels recursos interns de l’idioma, com la derivació. En aquest article estudiem els sufixos que apareixen en El Mole, i veiem com expressen en part la realitat lingüística del català del segle XIX en la seua varietat valenciana, i en part la creativitat dels…
Approximation through suffixation:-ḍḍu/-a in Sicilian
2023
This article investigates the use of the Sicilian suffix -ḍḍu/-afor the expression of ap-proximation. On the basis of a survey of a corpus of ethnotexts and the outputs of a translation questionnaire, we propose that approximation is a core value in the semantic network of the suffix, expressing a certain distance fromthe default values conveyed by the base. In terms of prototypi-cality, this distance may occur from the categorial centre (internal approximation): the suffix mod-ifies the semantics of the base but does not alter the categorial status of the referent. Alternatively, the suffix may impact on categorial membership tout court (external approximation), questioning the categorial …
Suffix array and Lyndon factorization of a text
2014
Abstract The main goal of this paper is to highlight the relationship between the suffix array of a text and its Lyndon factorization. It is proved in [15] that one can obtain the Lyndon factorization of a text from its suffix array. Conversely, here we show a new method for constructing the suffix array of a text that takes advantage of its Lyndon factorization. The surprising consequence of our results is that, in order to construct the suffix array, the local suffixes inside each Lyndon factor can be separately processed, allowing different implementative scenarios, such as online, external and internal memory, or parallel implementations. Based on our results, the algorithm that we prop…
Special factors and the combinatorics of suffix and factor automata
2011
AbstractThe suffix automaton (resp. factor automaton) of a finite word w is the minimal deterministic automaton recognizing the set of suffixes (resp. factors) of w. We study the relationships between the structure of the suffix and factor automata and classical combinatorial parameters related to the special factors of w. We derive formulae for the number of states of these automata. We also characterize the languages LSA and LFA of words having respectively suffix automaton and factor automaton with the minimal possible number of states.
Spatiotemporal Dynamics of the Processing of Spoken Inflected and Derived Words:A Combined EEG and MEG Study
2011
The spatiotemporal dynamics of the neural processing of spoken morphologically complex words are still an open issue. In the current study, we investigated the time course and neural sources of spoken inflected and derived words using simultaneously recorded electroencephalography (EEG) and magnetoencephalography (MEG) responses. Ten participants (native speakers) listened to inflected, derived, and monomorphemic Finnish words and judged their acceptability. EEG and MEG responses were time-locked to both the stimulus onset and the critical point (suffix onset for complex words, uniqueness point for monomorphemic words). The ERP results showed that inflected words elicited a larger left-late…
On-line Construction of Two-Dimensional Suffix Trees
1999
AbstractWe say that a data structure is builton-lineif, at any instant, we have the data structure corresponding to the input we have seen up to that instant. For instance, consider the suffix tree of a stringx[1,n]. An algorithm building iton-lineis such that, when we have read the firstisymbols ofx[1,n], we have the suffix tree forx[1,i]. We present a new technique, which we refer to asimplicit updates, based on which we obtain: (a) an algorithm for theon-lineconstruction of the Lsuffix tree of ann×nmatrixA—this data structure is the two-dimensional analog of the suffix tree of a string; (b) simple algorithms implementing primitive operations forLZ1-typeon-line losslessimage compression m…