AUTHOR

Sabrina Mantaci

showing 33 related works from this author

Burrows–Wheeler transform and Sturmian words

2003

Burrows–Wheeler transform; Signal Processing; Formal language; Sturmian word; Arithmetic; Word (computer architecture); Computer Science Applications; Information Systems; Theoretical Computer Science; Mathematics; Information Processing Letters

A New Combinatorial Approach to Sequence Comparison

2008

In this paper we introduce a new alignment-free method for comparing sequences which is combinatorial by nature and does not use any compressor nor any information-theoretic notion. Such a method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in the context of Data Compression. The new extended transformation takes as input a multiset of sequences and produces as output a string obtained by a suitable rearrangement of the characters of all the input sequences. By using such a transformation we give a general method for comparing sequences that takes into account how much the characters coming from the different input sequences are mixed in the output…

Multiset; Theoretical computer science; Burrows–Wheeler transform; Settore INF/01 - Informatica; Computer science; Sequence comparison; String (computer science); Context (language use); Extension (predicate logic); Comparison; Information theory; Genome; Transformation (function); Categorization; Computational Theory and Mathematics; Phylogenetics; Theory of computation; Algorithm; Mathematics; Data compression

Transducers for the bidirectional decoding of prefix codes

2010

We construct a transducer for the bidirectional decoding of words encoded by the method introduced by Girod (1999) in [5], and we prove that it is bideterministic and that it can be used for both left-to-right and right-to-left decoding. We also give a similar construction for a transducer that decodes in both directions words encoded by a generalization of Girod’s encoding method. We prove that it has the same properties as the previous transducer. In addition, we show that it has a single initial/final state and that it is minimal.

Prefix code; General Computer Science; Settore INF/01 - Informatica; Generalization; Computer science; Girod’s encoding; Transducers; Theoretical Computer Science; Prefix; Transducer; Prefix codes; Algorithm; Decoding methods; Word (computer architecture); Computer Science (all)
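
The abstract above does not restate the encoding itself, so the following is only a minimal sketch under a stated assumption: that Girod's scheme XORs the stream of codewords, padded with L zeros, against the stream of reversed codewords delayed by L bits, where L is the maximal codeword length. The function names and the toy prefix code are hypothetical, and the sketch uses plain loops rather than the transducer built in the paper.

```python
def girod_encode(symbols, code):
    """Assumed Girod-style encoding: (c1...cn 0^L) XOR (0^L rev(c1)...rev(cn))."""
    L = max(len(c) for c in code.values())
    forward = "".join(code[s] for s in symbols)          # c1 c2 ... cn
    mirrored = "".join(code[s][::-1] for s in symbols)   # each codeword reversed
    top = forward + "0" * L
    bottom = "0" * L + mirrored
    return "".join("1" if a != b else "0" for a, b in zip(top, bottom))


def girod_decode_forward(z, code):
    """Left-to-right decoding: each decoded codeword unmasks the next L bits."""
    L = max(len(c) for c in code.values())
    inverse = {c: s for s, c in code.items()}
    mirrored = []   # bits of the delayed (reversed-codeword) stream known so far
    current = ""    # bits of the codeword currently being read
    decoded = []
    for i in range(len(z) - L):
        mask = "0" if i < L else mirrored[i - L]
        current += "1" if z[i] != mask else "0"
        if current in inverse:
            decoded.append(inverse[current])
            mirrored.extend(reversed(current))
            current = ""
    return decoded


def girod_decode_backward(z, code):
    """Right-to-left decoding: under the assumed scheme, the reversed stream
    is the encoding of the reversed message, so the forward decoder applies."""
    return girod_decode_forward(z[::-1], code)[::-1]


# Hypothetical toy prefix code, used only for illustration.
toy_code = {"a": "0", "b": "10", "c": "11"}
message = list("abcab")
encoded = girod_encode(message, toy_code)
```

With this toy code, both `girod_decode_forward(encoded, toy_code)` and `girod_decode_backward(encoded, toy_code)` return the original message, and the encoded stream is only L bits longer than the plain concatenation of codewords.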

An algorithm for the solution of tree equations

1997

We consider the problem of solving equations over k-ary trees. Here an equation is a pair of labeled α-ary trees, where α is a function associating an arity to each label. A solution to an equation is a morphism from α-ary trees to k-ary trees that maps the left and right hand side of the equation to the same k-ary tree.

Combinatorics; Morphism; Binary tree; Branch and bound; Search algorithm; Tree (set theory); Function (mathematics); Arity; Computer Science::Information Theory; Mathematics; Equation solving

Suffixes, Conjugates and Lyndon Words

2013

In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by $<_{lex}$ and $\prec_{\omega}$, that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the $\prec_{\omega}$ order between two primitive wo…

Multiset; Reduction (recursion theory); BWT; Lyndon factorization; Suffix array; String (computer science); Lyndon words; EBWT; Circular words; Conjugacy; Lexicographical order; Combinatorics; Order (group theory); Symbol (formal); Word (group theory); Mathematics

Distance measures for biological sequences: Some recent approaches

2008

Sequence comparison has become an essential tool in modern molecular biology. In fact, in biomolecular sequences high similarity usually implies significant functional or structural similarity. Traditional approaches use techniques based on sequence alignment, which can measure character-level differences. However, recent developments in whole-genome sequencing technology give rise to the need for similarity measures able to capture rearrangements involving large segments contained in the sequences. This paper illustrates different methods recently introduced for the alignment-free comparison of biological sequences. The goal of the paper is both to highlight t…

Whole genome sequencing; Computer science; Applied Mathematics; Sequence alignment; Machine learning; Bioinformatics; Measure (mathematics); Genome; Distance measures; Similitude; Theoretical Computer Science; Artificial Intelligence; Similarity (psychology); Metric (mathematics); Software; International Journal of Approximate Reasoning

DEFECT THEOREMS FOR TREES

2000

We generalize different notions of a rank of a set of words to sets of trees. We prove that almost all of those ranks can be used to formulate a defect theorem. However, as we show, the prefix rank forms an exception.

Discrete mathematics; Prefix; Combinatorics; Set (abstract data type); Combinatorics on words; Algebra and Number Theory; Computational Theory and Mathematics; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; Rank (graph theory); Computer Science::Formal Languages and Automata Theory; Information Systems; Theoretical Computer Science; Mathematics; Developments In Language Theory

Preface

2015

This special issue of Mathematical Structures in Computer Science is devoted to the fourteenth Italian Conference on Theoretical Computer Science (ICTCS), held at the University of Palermo, Italy, from 9th to 11th September 2013. ICTCS is the conference of the Italian Chapter of the European Association for Theoretical Computer Science and covers a wide spectrum of topics in Theoretical Computer Science, ranging from computational complexity to logic, from algorithms and data structures to programming languages, from combinatorics on words to distributed computing. For this reason, the contributions included here come from very different areas of Theoretical Computer Science. In fact this special…

Mathematics (miscellaneous); Computer Science Applications; 1707 Computer Vision and Pattern Recognition

BALANCE PROPERTIES AND DISTRIBUTION OF SQUARES IN CIRCULAR WORDS

2010

We study balance properties of circular words over alphabets of size greater than two. We give some new characterizations of balanced words connected to the Kawasaki-Ising model and to the notion of derivative of a word. Moreover we consider two different generalizations of the notion of balance, and we find some relations between them. Some of our results can be generalized to non periodic infinite words as well.

Combinatorics on words; circular words; balanced words; Combinatorics; Settore INF/01 - Informatica; Computer Science (miscellaneous); Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Computer Science::Formal Languages and Automata Theory; Mathematics; International Journal of Foundations of Computer Science

Some Investigations on Similarity Measures Based on Absent Words

2019

In this paper we investigate similarity measures based on minimal absent words, introduced by Chairungsee and Crochemore in [1]. They make use of a length-weighted index on a sample set corresponding to the symmetric difference M(x)ΔM(y) of the minimal absent words M(x) and M(y) of two sequences x and y, respectively. We first propose a variant of this measure by choosing as a sample set a proper subset (x, y) of M(x)ΔM(y), which appears to be more appropriate for distinguishing x and y. From the algebraic point of view, we prove that (x, y) is the base of the ideal generated by M(x)ΔM(y). We then remark that such measures are able to recognize whether the sequences x and y share a common s…

sequence comparison; Algebra and Number Theory; Settore INF/01 - Informatica; Computer science; Pattern recognition; similarity measures; Minimal absent words; Theoretical Computer Science; Computational Theory and Mathematics; Similarity (network science); Artificial intelligence; Information Systems; Fundamenta Informaticae

A Generalization of Girod’s Bidirectional Decoding Method to Codes with a Finite Deciphering Delay

2012

In this paper we generalize an encoding method due to Girod (cf. [6]) using prefix codes, that allows a bidirectional decoding of the encoded messages. In particular we generalize it to any finite alphabet A, to any operation defined on A, to any code with finite deciphering delay and to any key x ∈ A+, of a length depending on the deciphering delay. We moreover define, as in [4], a deterministic transducer for this generalized method. We prove that, for a fixed code X ∈ A* with finite deciphering delay and a fixed key x ∈ A*, the transducers associated to different operations are isomorphic as unlabelled graphs. We also prove that, for a fixed code X with finite deciphering delay, transducers asso…

Discrete mathematics; Prefix code; Strongly connected component; Settore INF/01 - Informatica; Generalization; 020206 networking & telecommunications; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Prefix; 010201 computation theory & mathematics; Encoding (memory); 0202 electrical engineering electronic engineering information engineering; Code (cryptography); Alphabet; Girod's encoding; codes; finite deciphering delay; Decoding methods; Mathematics

An extension of the Burrows-Wheeler Transform

2007

We describe and highlight a generalization of the Burrows–Wheeler Transform (bwt) to a multiset of words. The extended transformation, denoted by ebwt, is reversible. Moreover, it allows one to define a bijection between the words over a finite alphabet A and the finite multisets of conjugacy classes of primitive words in A∗. Besides its mathematical interest, the extended transform can be useful for applications in the context of string processing. In the last part of this paper we illustrate one such application, providing a similarity measure between sequences based on ebwt.

Discrete mathematics; Multiset; Similarity (geometry); General Computer Science; Burrows–Wheeler transform; Generalization; Alignment-free distance measure; Sequence comparison; Context (language use); Similarity measure; Theoretical Computer Science; Conjugacy class; Bijection; Computer Science::Formal Languages and Automata Theory; Computer Science (all); Mathematics
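
A naive sketch of the extended transform may help fix ideas. It assumes that the input words are primitive and that conjugates are ordered by comparing their infinite powers; by the Fine and Wilf theorem it is enough to compare prefixes of length twice the longest word. The function name `ebwt` mirrors the abstract, but the quadratic construction below is an illustration, not the procedure used in the paper, and it omits the index set needed for inversion.

```python
def ebwt(words):
    """Last letters of all conjugates of all words, sorted by infinite power.

    Assumes every word is primitive. Conjugates with equal infinite powers
    (coming from repeated words) keep their input order.
    """
    limit = 2 * max(len(w) for w in words)
    rows = []
    for w in words:
        for i in range(len(w)):
            conjugate = w[i:] + w[:i]
            repeats = -(-limit // len(conjugate))        # ceiling division
            key = (conjugate * repeats)[:limit]          # prefix of conjugate^omega
            rows.append((key, conjugate[-1]))
    rows.sort(key=lambda row: row[0])                    # stable sort on the key
    return "".join(last for _, last in rows)
```

On a single word the sketch reduces to the ordinary transform, for instance `ebwt(["banana"])` gives `nnbaaa`, while `ebwt(["ab", "abb"])` gives `bbaba`.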

Suffix array and Lyndon factorization of a text

2014

The main goal of this paper is to highlight the relationship between the suffix array of a text and its Lyndon factorization. It is proved in [15] that one can obtain the Lyndon factorization of a text from its suffix array. Conversely, here we show a new method for constructing the suffix array of a text that takes advantage of its Lyndon factorization. The surprising consequence of our results is that, in order to construct the suffix array, the local suffixes inside each Lyndon factor can be separately processed, allowing different implementation scenarios, such as online, external and internal memory, or parallel implementations. Based on our results, the algorithm that we prop…

Sorting suffixes; BWT; Suffix array; Lyndon word; Lyndon factorization; Compressed suffix array; Settore INF/01 - Informatica; Generalized suffix tree; Order (ring theory); Construct (python library); Theoretical Computer Science; Computational Theory and Mathematics; Factorization; Factor (programming language); Internal memory; Discrete Mathematics and Combinatorics; Arithmetic; Mathematics; Journal of Discrete Algorithms
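
For readers unfamiliar with the Lyndon factorization mentioned above, the sketch below computes it with Duval's classical linear-time algorithm. This is a standard technique chosen here for illustration; it is not the suffix-array construction proposed in the paper, and the function name is arbitrary.

```python
def lyndon_factorization(s):
    """Duval's algorithm: factor s into a non-increasing sequence of Lyndon words."""
    n = len(s)
    i = 0
    result = []
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i          # strictly larger letter: the whole run is one Lyndon word
            else:
                k += 1         # equal letter: keep matching against the period
            j += 1
        # Emit the repeated Lyndon factor of length j - k.
        while i <= k:
            result.append(s[i:i + j - k])
            i += j - k
    return result
```

For instance, `lyndon_factorization("banana")` returns `["b", "an", "an", "a"]`; concatenating the factors always gives back the input.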

Inducing the Lyndon Array

2019

In this paper we propose a variant of the induced suffix sorting algorithm by Nong (TOIS, 2013) that computes simultaneously the Lyndon array and the suffix array of a text in $O(n)$ time using $\sigma + O(1)$ words of working space, where $n$ is the length of the text and $\sigma$ is the alphabet size. Our result improves the previous best space requirement for linear time computation of the Lyndon array. In fact, all the known linear algorithms for Lyndon array computation use suffix sorting as a preprocessing step and use $O(n)$ words of working space in addition to the Lyndon array and suffix array. Experimental results with real and synthetic datasets show that our algorithm is not onl…

FOS: Computer and information sciences; 050101 languages & linguistics; Computer science; Computation; Induced suffix sorting; 02 engineering and technology; Space (mathematics); Suffix sorting; Suffix array; Computer Science - Data Structures and Algorithms; 0202 electrical engineering electronic engineering information engineering; Data_FILES; Preprocessor; Data Structures and Algorithms (cs.DS); 0501 psychology and cognitive sciences; Computer Science::Data Structures and Algorithms; Time complexity; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Settore INF/01 - Informatica; 05 social sciences; Lightweight algorithm; Sigma; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Lightweight algorithms; Lyndon array; Working space; 020201 artificial intelligence & image processing; Algorithm; Computer Science::Formal Languages and Automata Theory
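
As a point of reference for what the Lyndon array stores, the following definition-level sketch computes it naively. It relies on the known characterization that the longest Lyndon factor starting at position i ends just before the next lexicographically smaller suffix. It runs in quadratic time or worse and is unrelated to the induced-sorting variant described in the abstract; the function name is illustrative.

```python
def lyndon_array(s):
    """Naive Lyndon array: entry i is the length of the longest Lyndon word
    starting at position i (0-based) of s."""
    n = len(s)
    lengths = []
    for i in range(n):
        j = i + 1
        # Skip suffixes that are larger than the suffix starting at i.
        while j < n and s[j:] > s[i:]:
            j += 1
        lengths.append(j - i)   # j is the next smaller suffix, or n if none exists
    return lengths
```

For the text `banana` this gives `[1, 2, 1, 2, 1, 1]`: for example, the longest Lyndon word starting at position 1 is `an`.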

A Generalization of Girod's Bidirectional Decoding Method to Codes with a Finite Deciphering Delay

2012

Girod’s encoding method has been introduced in order to efficiently decode from both directions messages encoded by using finite prefix codes. In the present paper, we generalize this method to finite codes with a finite deciphering delay. In particular, we show that our decoding algorithm can be realized by a deterministic finite transducer. We also investigate some properties of the underlying unlabeled graph.

Prefix code; Strongly connected component; Theoretical computer science; Generalization; deciphering delay; Data_CODINGANDINFORMATIONTHEORY; 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; [INFO.INFO-FL] Computer Science [cs]/Formal Languages and Automata Theory [cs.FL]; Encoding (memory); 0202 electrical engineering electronic engineering information engineering; Code (cryptography); Computer Science (miscellaneous); prefix (free) code; unlabeled graph; Mathematics; Code; [MATH.MATH-IT] Mathematics [math]/Information Theory [math.IT]; 020206 networking & telecommunications; Prefix; transducer; [INFO.INFO-IT] Computer Science [cs]/Information Theory [cs.IT]; 010201 computation theory & mathematics; Graph (abstract data type); Algorithm; Decoding methods

On fixed points of the Burrows-Wheeler transform

2017

The Burrows-Wheeler Transform is a well-known transformation widely used in Data Compression: important competitive compression software, such as Bzip (cf. [1]) and Szip (cf. [2]), and some indexing software, like the FM-index (cf. [3]), are deeply based on the Burrows-Wheeler Transform. The main advantage of using BWT for data compression consists in its feature of "clustering" together equal characters. In this paper we show the existence of fixed points of BWT, i.e., words on which BWT has no effect. We show a characterization of the permutations associated to BWT of fixed points and we give the explicit form of fixed points on a binary ordered alphabet {a, b} having at most four b's and th…

Discrete mathematics; Algebra and Number Theory; Burrows–Wheeler transform; Settore INF/01 - Informatica; Permutation; Permutations; 0102 computer and information sciences; 02 engineering and technology; Fixed point; 01 natural sciences; Theoretical Computer Science; Computational Theory and Mathematics; 010201 computation theory & mathematics; Fixed Points; 0202 electrical engineering electronic engineering information engineering; Information Systems; 020201 artificial intelligence & image processing; Mathematics
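
The notion of a fixed point can be explored on small cases with a brute-force search. The sketch below assumes the textbook definition of the transform (the last column of the lexicographically sorted conjugates, without an end marker) and the order a < b; `fixed_points` is a hypothetical helper that does not use the characterization proved in the paper.

```python
from itertools import product


def bwt(w):
    """Last column of the sorted list of conjugates (rotations) of w."""
    rotations = sorted(w[i:] + w[:i] for i in range(len(w)))
    return "".join(rotation[-1] for rotation in rotations)


def fixed_points(n, alphabet="ab"):
    """All words of length n over the alphabet that the transform leaves unchanged."""
    candidates = map("".join, product(alphabet, repeat=n))
    return [w for w in candidates if bwt(w) == w]
```

For length 3 the search finds `aaa`, `baa`, `bba` and `bbb`; for example, the sorted conjugates of `baa` are `aab`, `aba`, `baa`, whose last letters spell `baa` again.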

Burrows-Wheeler transform and Run-Length Enconding

2017

In this paper we study the clustering effect of the Burrows-Wheeler Transform (BWT) from a combinatorial viewpoint. In particular, given a word w we define the BWT-clustering ratio of w as the ratio between the number of clusters produced by BWT and the number of the clusters of w. The number of clusters of a word is measured by its Run-Length Encoding. We show that the BWT-clustering ratio ranges in ]0, 2]. Moreover, given a rational number $r \in\, ]0,2]$, it is possible to find infinitely many words having BWT-clustering ratio equal to r. Finally, we show how the words can be classified according to their BWT-clustering ratio. The behavior of such a parameter is studied for very well-…

Discrete mathematics; Rational number; Burrows–Wheeler transform; Computer science; Computer Science (all); 0102 computer and information sciences; 02 engineering and technology; 01 natural sciences; Clustering effect; Run-length encoding; Theoretical Computer Science; 010201 computation theory & mathematics; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Cluster analysis; Word (computer architecture)
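
The BWT-clustering ratio defined above is easy to evaluate on examples. The sketch below assumes the rotation-based transform without an end marker and counts clusters as maximal runs of equal letters in the linear word; the function names are illustrative and exact arithmetic is obtained with `fractions.Fraction`.

```python
from fractions import Fraction
from itertools import groupby


def bwt(w):
    """Last column of the sorted list of conjugates (rotations) of w."""
    rotations = sorted(w[i:] + w[:i] for i in range(len(w)))
    return "".join(rotation[-1] for rotation in rotations)


def run_count(w):
    """Number of maximal runs of equal letters, i.e. the length of the RLE of w."""
    return sum(1 for _ in groupby(w))


def clustering_ratio(w):
    """Runs of bwt(w) divided by runs of w, for a non-empty word w."""
    return Fraction(run_count(bwt(w)), run_count(w))
```

For `banana` the transform is `nnbaaa`, so six runs shrink to three and the ratio is 1/2, whereas `aab` is mapped to `baa` and keeps a ratio of 1.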

SORTING CONJUGATES AND SUFFIXES OF WORDS IN A MULTISET

2014

In this paper we are interested in the study of the combinatorial aspects related to the extension of the Burrows-Wheeler transform to a multiset of words. Such study involves the notion of suffixes and conjugates of words and is based on two different order relations, denoted by <lex and ≺ω, that, even if strictly connected, are quite different from the computational point of view. In particular, we introduce a method that only uses the <lex sorting among suffixes of a multiset of words in order to sort their conjugates according to ≺ω-order. In this study an important role is played by Lyndon words. This strategy could be used in applications, especially in the field of Bioinformatic…

Lyndon words; Burrows-Wheeler transform; Extended Burrows-Wheeler transform; Circular words; Conjugates; Suffixes; Sorting; Multiset; Theoretical computer science; Point (typography); Settore INF/01 - Informatica; Extension (predicate logic); Field (computer science); Computer Science (miscellaneous); Order (group theory); Arithmetic; Mathematics

On the decomposition of prefix codes

2017

In this paper we focus on the decomposition of rational and maximal prefix codes. We present an effective procedure that allows us to decide whether such a code is decomposable. In this case, the procedure also produces the factors of some of its decompositions. We also give partial results on the problem of deciding whether a rational maximal prefix code decomposes over a finite prefix code.

Block code; Prefix code; General Computer Science; Computer science; 0102 computer and information sciences; 02 engineering and technology; Prefix grammar; Kraft's inequality; 01 natural sciences; Theoretical Computer Science; Prefix codes; Finite automata; Composition of codes; 0202 electrical engineering electronic engineering information engineering; Discrete mathematics; Self-synchronizing code; Finite-state machine; Settore INF/01 - Informatica; Computer Science (all); Rational language; Linear code; Prefix; 010201 computation theory & mathematics; 020201 artificial intelligence & image processing; Computer Science::Formal Languages and Automata Theory

A combinatorial view on string attractors

2021

The notion of string attractor has recently been introduced in [Prezza, 2017] and studied in [Kempa and Prezza, 2018] to provide a unifying framework for known dictionary-based compressors. A string attractor for a word $w = w_1 w_2 \cdots w_n$ is a subset Γ of the positions $\{1, \ldots, n\}$, such that all distinct factors of w have an occurrence crossing at least one of the elements of Γ. In this paper we explore the notion of string attractor by focusing on its combinatorial properties. In particular, we show how the size of the smallest string attractor of a word varies when combinatorial operations are applied and we deduce that such a measure is not monotone. Moreover, we introduce a c…

General Computer Science; Settore INF/01 - Informatica; String (computer science); de Bruijn word; 0102 computer and information sciences; 02 engineering and technology; Characterization (mathematics); Burrows-Wheeler transform; 01 natural sciences; Measure (mathematics); Standard Sturmian word; Theoretical Computer Science; Combinatorics; Conjugacy class; Monotone polygon; String attractor; 010201 computation theory & mathematics; Attractor; Thue-Morse word; 0202 electrical engineering electronic engineering information engineering; Lempel-Ziv encoding; 020201 artificial intelligence & image processing; Word (group theory); Mathematics
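
The definition of a string attractor given above translates directly into a brute-force check. The sketch below uses 1-based positions as in the abstract; both function names are illustrative, and the exhaustive search for a smallest attractor is exponential, so it is meant only for very short words.

```python
from itertools import combinations


def is_string_attractor(w, gamma):
    """True if every distinct factor of w has an occurrence crossing a position in gamma."""
    n = len(w)
    positions = set(gamma)
    uncovered = {w[i:j] for i in range(n) for j in range(i + 1, n + 1)}
    for i in range(n):
        for j in range(i + 1, n + 1):
            # The occurrence w[i:j] spans the 1-based positions i + 1, ..., j.
            if any(i + 1 <= p <= j for p in positions):
                uncovered.discard(w[i:j])
    return not uncovered


def smallest_attractor_size(w):
    """Size of a smallest string attractor of w, found by exhaustive search."""
    n = len(w)
    for size in range(n + 1):
        for gamma in combinations(range(1, n + 1), size):
            if is_string_attractor(w, gamma):
                return size
```

For the word `abab`, the set {1, 2} is an attractor while {1} is not, because no occurrence of the letter `b` crosses position 1; the smallest attractor therefore has size 2.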

Balance Properties and Distribution of Squares in Circular Words

2008

We study balance properties of circular words over alphabets of size greater than two. We give some new characterizations of balanced words connected to the Kawasaki-Ising model and to the notion of derivative of a word. Moreover we consider two different generalizations of the notion of balance, and we find some relations between them. Some of our results can be generalised to non periodic infinite words as well.

Combinatorics; Balance (metaphysics); Distribution (number theory); Settore INF/01 - Informatica; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Combinatorics on words; Sturmian words; circular words; balanced words; Computer Science::Formal Languages and Automata Theory; Binary alphabet; Word (group theory); Mathematics

Measuring the clustering effect of BWT via RLE

2017

The Burrows–Wheeler Transform (BWT) is a reversible transformation on which several text compressors and many other tools used in Bioinformatics and Computational Biology are based. The BWT is not actually a compressor, but a transformation that performs a context-dependent permutation of the letters of the input text that often creates runs of equal letters (clusters) longer than the ones in the original text, usually referred to as the “clustering effect” of BWT. In particular, from a combinatorial point of view, great attention has been given to the case in which the BWT produces the fewest number of clusters (cf. [5], [16], [21], [23]). In this paper we are concerned about t…

0301 basic medicine; General Computer Science; Permutation; Computer Science (all); Binary number; 0102 computer and information sciences; Quantitative Biology::Genomics; 01 natural sciences; Upper and lower bounds; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; 030104 developmental biology; Transformation (function); BWT; 010201 computation theory & mathematics; Run-length encoding; Computer Science::Data Structures and Algorithms; Cluster analysis; Primitive root modulo n; Word (computer architecture); Mathematics

On the size of transducers for bidirectional decoding of prefix codes

2012

In a previous paper [L. Giambruno and S. Mantaci, Theoret. Comput. Sci. 411 (2010) 1785–1792] a bideterministic transducer is defined for the bidirectional deciphering of words by the method introduced by Girod [IEEE Commun. Lett. 3 (1999) 245–247]. Such a method is defined using prefix codes. Moreover a coding method, inspired by Girod’s, is introduced, and a transducer that allows both right-to-left and left-to-right decoding by this method is defined. It is also proved that this transducer is minimal. Here we consider the number of states of such a transducer, related to some features of the considered prefix code X. We find some bounds of such a number of states in relation wi…

Discrete mathematics; Prefix code; Block code; Settore INF/01 - Informatica; General Mathematics; Concatenated error correction code; List decoding; Serial concatenated convolutional codes; Sequential decoding; Linear code; Computer Science Applications; Prefix; bilateral decoding; Variable length code; transducers; Algorithm; Computer Science::Formal Languages and Automata Theory; Software; Mathematics

Equations on trees

1996

We introduce the notion of equation on trees, generalizing the corresponding notion for words, and we develop the first steps of a theory of tree equations. The main result of the paper states that, if a pair of trees is the solution of a tree equation with two indeterminates, then the two trees are both powers of the same tree. As an application, we show that a tree can be expressed in a unique way as a power of a primitive tree. This extends a basic result of combinatorics on words to trees. Some open problems are finally proposed.

Discrete mathematics; Tree (data structure); Combinatorics on words; Binary tree; Tree code; Mathematics

An extension of the Burrows-Wheeler Transform and applications to sequence comparison and data compression

2005

We introduce a generalization of the Burrows-Wheeler Transform (BWT) that can be applied to a multiset of words. The extended transformation, denoted by E, is reversible, but, unlike BWT, it is also surjective. The E transformation makes it possible to define a distance between two sequences, which we apply here to the problem of the whole mitochondrial genome phylogeny. Moreover we give some considerations about compressing a set of words by using the E transformation as preprocessing.

Discrete mathematics; Multiset; Burrows-Wheeler transform; Data Compression; Mitochondrial genome phylogeny; Multiplicity (mathematics); Surjective function; Conjugacy class; Sequence comparison; Preprocessor; Algorithm; Mathematics

Isometric Words Based on Swap and Mismatch Distance

2023

An edit distance is a metric between words that quantifies how two words differ by counting the number of edit operations needed to transform one word into the other. A word f is said to be isometric with respect to an edit distance if, for any pair of f-free words u and v, there exists a transformation of minimal length from u to v via the related edit operations such that all the intermediate words are also f-free. The adjective 'isometric' comes from the fact that, if the Hamming distance is considered (i.e., only mismatches), then isometric words are connected with definitions of isometric subgraphs of hypercubes. We consider the case of edit distance with swap and mismatch. We compare it…

FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Swap and mismatch distance; Isometric words; Overlap with errors

Comparing Sequences by the Burrows-Wheeler Transform

2004


Formal Languages and Automata: Models, Methods and Application. In Honour of the 70th Birthday of Antonio Restivo Preface

2017

Automata Theory

An extension of the Burrows-Wheeler Transform to k words

2005


The Burrows-Wheeler Transform: from data compression to combinatorics on words

2005


A new sequence distance measure based on the Burrows-Wheeler Transform

2005


The Burrows-Wheeler Transform: a new tool in Combinatorics on Words

2005


String attractors and combinatorics on words

2019

The notion of \emph{string attractor} has recently been introduced in [Prezza, 2017] and studied in [Kempa and Prezza, 2018] to provide a unifying framework for known dictionary-based compressors. A string attractor for a word $w=w[1]w[2]\cdots w[n]$ is a subset $\Gamma$ of the positions $\{1,\ldots,n\}$, such that all distinct factors of $w$ have an occurrence crossing at least one of the elements of $\Gamma$. While finding the smallest string attractor for a word is an NP-complete problem, it has been proved in [Kempa and Prezza, 2018] that dictionary compressors can be interpreted as algorithms approximating the smallest string attractor for a given word. In this paper we explore the noti…

FOS: Computer and information sciences; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Settore INF/01 - Informatica; Formal Languages and Automata Theory (cs.FL); De Bruijn word; Computer Science - Formal Languages and Automata Theory; Burrows-Wheeler transform; String attractor; Computer Science - Data Structures and Algorithms; Thue-Morse word; Lempel-Ziv encoding; Run-length encoding; Data Structures and Algorithms (cs.DS)