Paweł Gawrychowski
Minimal Absent Words in Rooted and Unrooted Trees
We extend the theory of minimal absent words to (rooted and unrooted) trees, having edges labeled by letters from an alphabet \(\varSigma\) of cardinality \(\sigma\). We show that the set \(\text{MAW}(T)\) of minimal absent words of a rooted (resp. unrooted) tree T with n nodes has cardinality \(O(n\sigma)\) (resp. \(O(n^{2}\sigma)\)), and we show that these bounds are realized. Then, we exhibit algorithms to compute all minimal absent words in a rooted (resp. unrooted) tree in output-sensitive time \(O(n+|\text{MAW}(T)|)\) (resp. \(O(n^{2}+|\text{MAW}(T)|)\)), assuming an integer alphabet of size polynomial in n.
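For orientation, the classical notion that this abstract extends can be sketched for a plain string rather than a tree. The snippet assumes the usual string definition (a word is a minimal absent word if it does not occur in the text but all of its proper factors do); the function name `minimal_absent_words` and the brute-force enumeration are illustrative only and are not the output-sensitive algorithm of the paper.

```python
def minimal_absent_words(text, alphabet):
    """Brute-force minimal absent words of a string (illustrative, not efficient).

    A word w is reported if w does not occur in `text`, while both its longest
    proper prefix and its longest proper suffix do occur (which implies that
    every proper factor of w occurs).
    """
    n = len(text)
    # All factors of text, including the empty word.
    factors = {text[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    maws = set()
    for prefix in factors:          # prefix = w[:-1] occurs by construction
        for letter in alphabet:
            word = prefix + letter
            # word must be absent, and its longest proper suffix must occur
            if word not in factors and word[1:] in factors:
                maws.add(word)
    return maws


if __name__ == "__main__":
    print(sorted(minimal_absent_words("abaab", "ab")))
```

Running it on the string `abaab` over the alphabet `{a, b}` yields `aaa`, `aaba`, `bab` and `bb`. Adding a letter that never occurs in the text to the alphabet adds that single letter as a further minimal absent word.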
Substring Complexity in Sublinear Space
Shannon's entropy is a definitive lower bound for statistical compression. Unfortunately, no such clear measure exists for the compressibility of repetitive strings. Thus, ad-hoc measures are employed to estimate the repetitiveness of strings, e.g., the size $z$ of the Lempel-Ziv parse or the number $r$ of equal-letter runs of the Burrows-Wheeler transform. A more recent one is the size $\gamma$ of a smallest string attractor. Unfortunately, Kempa and Prezza [STOC 2018] showed that computing $\gamma$ is NP-hard. Kociumaka et al. [LATIN 2020] considered a new measure that is based on the function $S_T$ counting the cardinalities of the sets of substrings of each length of $T$, also known as …
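As a small illustration of the function $S_T$ described above, the following sketch counts, for each length $k$, the number of distinct substrings of $T$ of that length. The function name `substring_complexity` is a placeholder, and the naive set-based computation uses quadratic time and space, so it is only meant to make the definition concrete, not to reflect the sublinear-space setting of the title.

```python
def substring_complexity(text):
    """Return [S_T(1), ..., S_T(n)], where S_T(k) is the number of
    distinct substrings of length k in text (naive computation)."""
    n = len(text)
    return [
        len({text[i:i + k] for i in range(n - k + 1)})
        for k in range(1, n + 1)
    ]


if __name__ == "__main__":
    print(substring_complexity("abaab"))
```

For `abaab` the distinct substrings are `a, b` (length 1), `ab, ba, aa` (length 2), `aba, baa, aab` (length 3), `abaa, baab` (length 4) and the whole string (length 5), so the result is `[2, 3, 3, 2, 1]`.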