Search results for "Suffix"

showing 10 items of 75 documents

Suffixes, Conjugates and Lyndon Words

2013

In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by $\plex$ and $\pom$, that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the $\pom$ order between two primitive wo…

MultisetReduction (recursion theory)BWT; Lyndon factorization; Suffix ArrayString (computer science)Suffix arrayLyndon words Lyndon factorization BWT Suffix array EBWT Circular words ConjugacyLexicographical orderlaw.inventionSuffix ArrayCombinatoricsBWTLyndon factorizationlawOrder (group theory)Symbol (formal)Word (group theory)Mathematics
researchProduct

r-Indexing the eBWT

2021

The extended Burrows Wheeler Transform (\(\mathrm {eBWT}\)) was introduced by Mantaci et al. [TCS 2007] to extend the definition of the \(\mathrm {BWT}\) to a collection of strings. In our prior work [SPIRE 2021], we give a linear-time algorithm for the \(\mathrm {eBWT}\) that preserves the fundamental property of the original definition (i.e., the independence from the input order). The algorithm combines a modification of the Suffix Array Induced Sorting (SAIS) algorithm [IEEE Trans Comput 2011] with Prefix Free Parsing [AMB 2019; JCB 2020]. In this paper, we show how this construction algorithm leads to r-indexing the \(\mathrm {eBWT}\), i.e., run-length encoded \(\mathrm {eBWT}\) and \(…

Physicsstring compressionBurrows–Wheeler transformSettore INF/01 - InformaticaSearch engine indexingSuffix arrayOrder (ring theory)Burrows-Wheeler-Transform r-index string compression extended BWT compressed indexingBurrows-Wheeler-Transformlaw.inventionCombinatoricsr-indexcompressed indexinglawIndexingextended BWT
researchProduct

Suffix Automata and Standard Sturmian Words

2007

Blumer et al. showed (cf. [3,2]) that the suffix automaton of a word w must have at least |w|+1 states and at most 2|w|-1 states. In this paper we characterize the language L of all binary words w whose minimal suffix automaton S(w) has exactly |w| + 1 states; they are precisely all prefixes of standard Sturmian words. In particular, we give an explicit construction of suffix automaton of words that are palindromic prefixes of standard words. Moreover, we establish a necessary and sufficient condition on S(w) which ensures that if w ∈ L and a ∈ {0, 1} then wa ∈ L. By using such a condition, we show how to construct the automaton S(wa) from S(w). More generally, we provide a simple construct…

PrefixCombinatoricsSettore INF/01 - InformaticaLevenshtein automatonSimple (abstract algebra)PalindromeSuffix automatonSuffix AutomataArithmeticSuffixWord (group theory)AutomatonMathematics
researchProduct

Quantifying English and Polish Lolitas: A Corpus-Driven Stylistic Comparison

2013

The study presented in this article, which is a fragment of a larger study of translational and non- translational texts (Grabowski 2012), falls within the scope of descriptive translation studies (DTS) and corpus linguistics, with particular emphasis on the study of translation universals, on the example of English-original (written in 1955) and two independent Polish translations of the novel Lolita by V. Nabokov (by Stiller in 1991 and Klobukowski in 1997). According to Baker (1995: 243), universal features of translation or translation universals, constitute specific textual characteristics (e.g. lexical, grammatical or stylistic) typical of translated texts, irrespective of languages i…

PrefixHistorySentence lengthDescriptive statisticsCorpus linguisticsTranslation studiesWord typeSuffixProblem of universalsLinguistics
researchProduct

La sufixació en la premsa valenciana del segle XIX

2018

El Mole, una de les primeres i més rellevants manifestacions periodístiques en llengua catalana, va partir d’un model lingüístic popular i modern, basat en la literatura popularista contemporània, que els redactors van procurar combinar amb un registre més formal segons el tipus d’article. L’actualitat del periòdic els duu a reflectir el llenguatge de l’època, però també a practicar la formació de noves paraules a través dels recursos interns de l’idioma, com la derivació. En aquest article estudiem els sufixos que apareixen en El Mole, i veiem com expressen en part la realitat lingüística del català del segle XIX en la seua varietat valenciana, i en part la creativitat dels…

PremsaPress; 19th century; Valencian country; Journalistic language; Word formation; SuffixationLlenguatge periodísticPremsa; Segle XIX; País Valencià; Llenguatge periodístic; Formació de paraules; sufixacióGeneral MedicineSegle XIXPaís ValenciàsufixacióFormació de paraulesAnuari de filologia. Estudis de lingüística
researchProduct

Approximation through suffixation:-ḍḍu/-a in Sicilian

2023

This article investigates the use of the Sicilian suffix -ḍḍu/-afor the expression of ap-proximation. On the basis of a survey of a corpus of ethnotexts and the outputs of a translation questionnaire, we propose that approximation is a core value in the semantic network of the suffix, expressing a certain distance fromthe default values conveyed by the base. In terms of prototypi-cality, this distance may occur from the categorial centre (internal approximation): the suffix mod-ifies the semantics of the base but does not alter the categorial status of the referent. Alternatively, the suffix may impact on categorial membership tout court (external approximation), questioning the categorial …

Settore L-FIL-LET/12 - Linguistica ItalianaSicilian evaluative suffixes approximation attenuation high specificity diminution pragmatic nuancesSettore L-LIN/01 - Glottologia E Linguistica
researchProduct

Suffix array and Lyndon factorization of a text

2014

Abstract The main goal of this paper is to highlight the relationship between the suffix array of a text and its Lyndon factorization. It is proved in [15] that one can obtain the Lyndon factorization of a text from its suffix array. Conversely, here we show a new method for constructing the suffix array of a text that takes advantage of its Lyndon factorization. The surprising consequence of our results is that, in order to construct the suffix array, the local suffixes inside each Lyndon factor can be separately processed, allowing different implementative scenarios, such as online, external and internal memory, or parallel implementations. Based on our results, the algorithm that we prop…

Sorting suffixes; BWT; Suffix array; Lyndon word; Lyndon factorizationCompressed suffix arraySettore INF/01 - InformaticaSorting suffixesGeneralized suffix treeSuffix arrayOrder (ring theory)Construct (python library)Lyndon wordSorting suffixeTheoretical Computer Sciencelaw.inventionBWTLyndon factorizationComputational Theory and MathematicsFactorizationlawSuffix arrayFactor (programming language)Internal memoryDiscrete Mathematics and CombinatoricsArithmeticcomputerMathematicscomputer.programming_languageJournal of Discrete Algorithms
researchProduct

Special factors and the combinatorics of suffix and factor automata

2011

AbstractThe suffix automaton (resp. factor automaton) of a finite word w is the minimal deterministic automaton recognizing the set of suffixes (resp. factors) of w. We study the relationships between the structure of the suffix and factor automata and classical combinatorial parameters related to the special factors of w. We derive formulae for the number of states of these automata. We also characterize the languages LSA and LFA of words having respectively suffix automaton and factor automaton with the minimal possible number of states.

Special factorGeneral Computer ScienceSpecial factorsFactor automatonBüchi automatonω-automatonTheoretical Computer ScienceCombinatoricsDeterministic automatonTwo-way deterministic finite automatonNondeterministic finite automatonComputer Science::Data Structures and AlgorithmsCombinatorics on wordStandard Sturmian wordsMathematicsDiscrete mathematicsCombinatorics on wordsDAWGPushdown automatonComputer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing)Nonlinear Sciences::Cellular Automata and Lattice GasesSuffix automatonProbabilistic automatonSuffix automatonComputer Science::Formal Languages and Automata TheoryComputer Science(all)Theoretical Computer Science
researchProduct

Spatiotemporal Dynamics of the Processing of Spoken Inflected and Derived Words:A Combined EEG and MEG Study

2011

The spatiotemporal dynamics of the neural processing of spoken morphologically complex words are still an open issue. In the current study, we investigated the time course and neural sources of spoken inflected and derived words using simultaneously recorded electroencephalography (EEG) and magnetoencephalography (MEG) responses. Ten participants (native speakers) listened to inflected, derived, and monomorphemic Finnish words and judged their acceptability. EEG and MEG responses were time-locked to both the stimulus onset and the critical point (suffix onset for complex words, uniqueness point for monomorphemic words). The ERP results showed that inflected words elicited a larger left-late…

Speech recognitionElectroencephalographyStimulus (physiology)Lexiconcomputer.software_genre050105 experimental psychologylcsh:RC321-57103 medical and health sciencesBehavioral Neuroscience0302 clinical medicineMorphememorphologymedicine0501 psychology and cognitive sciencesauditorylcsh:Neurosciences. Biological psychiatry. NeuropsychiatryBiological PsychiatryOriginal ResearchTemporal cortexMEGmedicine.diagnostic_testbusiness.industry05 social sciencesderivedMagnetoencephalographyPsychiatry and Mental healthNeuropsychology and Physiological PsychologyNeurologyTime courselexiconArtificial intelligenceSuffixinfectedbusinessPsychologycomputer030217 neurology & neurosurgeryNatural language processingERPNeuroscience
researchProduct

On-line Construction of Two-Dimensional Suffix Trees

1999

AbstractWe say that a data structure is builton-lineif, at any instant, we have the data structure corresponding to the input we have seen up to that instant. For instance, consider the suffix tree of a stringx[1,n]. An algorithm building iton-lineis such that, when we have read the firstisymbols ofx[1,n], we have the suffix tree forx[1,i]. We present a new technique, which we refer to asimplicit updates, based on which we obtain: (a) an algorithm for theon-lineconstruction of the Lsuffix tree of ann×nmatrixA—this data structure is the two-dimensional analog of the suffix tree of a string; (b) simple algorithms implementing primitive operations forLZ1-typeon-line losslessimage compression m…

Statistics and ProbabilityCompressed suffix arrayNumerical AnalysisControl and OptimizationAlgebra and Number TheoryTheoretical computer scienceApplied MathematicsGeneral MathematicsSuffix treeString (computer science)Generalized suffix treelaw.inventionLongest common substring problemTree (data structure)lawSuffixAlgorithmFM-indexMathematicsJournal of Complexity
researchProduct