6533b870fe1ef96bd12d074e

RESEARCH PRODUCT

Burrows-Wheeler Transform on Purely Morphic Words

A. FrosiniI. ManciniS. RinaldiG. RomanaM. Sciortino

subject

BWT Morphism Equal-letter runsSettore INF/01 - Informatica

description

The study of the compressibility of repetitive sequences is an issue that is attracting great interest. We consider purely morphic words, which are highly repetitive sequences generated by iterating a morphism φ that admits a fixed point (denoted by φ^∞(a) ) starting from a given character a belonging to the finite alphabet A , i.e. φ^∞(a)=lim_{i→∞}φ^i(a) . Such morphisms are called prolongable on a . Here we focus on the compressibility via the Burrows-Wheeler Transform (BWT) of infinite families of finite sequences generated by morphisms. In particular, denoted by r(w) the number of equal-letter runs of a word w , we provide new upper bounds on r(bwt(φ^i(a))) , i.e. the number of equal-letter runs produced when BWT is applied on φ^i(a) . Such bounds depend on the factor complexity f_x(n) of the infinite word x=φ^∞(a) , that counts, for each n≥0 , the number of distinct factors of x having length n .

10.1109/dcc52660.2022.00063https://hdl.handle.net/10447/575150