The notion of string attractor has been introduced by Kempa and Prezza (STOC 2018) in the context of Data Compression and it represents a set of positions of a finite word in which all of its factors can be “attracted”. The smallest size γ∗ of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure γ∗ have been studied in [Mantaci et al., TCS 2021]. Very recently, a complexity measure, called string attractor profile function, has been introduced for infinite words, by evaluating γ∗ on each prefix. Such a measure has been studied for automatic sequences and linearly recurrent infinite words in [Schaeffer and Shallit, arXiv 2021]. In this paper, we study the relationship between such a complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the recurrence. Furthermore, we introduce new string attractor-based complexity measures, in which the structure and the distribution of positions in a string attractor of the prefixes of infinite words are considered. We show that such measures provide a finer classification of some infinite families of words.
Restivo A., Romana G., Sciortino M. (2022). String Attractors and Infinite Words. In A. Castañeda, F. Rodríguez-Henríquez (a cura di), LATIN 2022: Theoretical Informatics (pp. 426-442) [10.1007/978-3-031-20624-5_26].
String Attractors and Infinite Words
Restivo A.;Romana G.;Sciortino M.
2022-01-01
Abstract
The notion of string attractor has been introduced by Kempa and Prezza (STOC 2018) in the context of Data Compression and it represents a set of positions of a finite word in which all of its factors can be “attracted”. The smallest size γ∗ of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure γ∗ have been studied in [Mantaci et al., TCS 2021]. Very recently, a complexity measure, called string attractor profile function, has been introduced for infinite words, by evaluating γ∗ on each prefix. Such a measure has been studied for automatic sequences and linearly recurrent infinite words in [Schaeffer and Shallit, arXiv 2021]. In this paper, we study the relationship between such a complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as the factor complexity and the recurrence. Furthermore, we introduce new string attractor-based complexity measures, in which the structure and the distribution of positions in a string attractor of the prefixes of infinite words are considered. We show that such measures provide a finer classification of some infinite families of words.File | Dimensione | Formato | |
---|---|---|---|
LATIN2022_String Attractors.pdf
Solo gestori archvio
Tipologia:
Versione Editoriale
Dimensione
403.51 kB
Formato
Adobe PDF
|
403.51 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.