In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by $\plex$ and $\pom$, that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the $\pom$ order between two primitive words by using a preliminary knowledge of the $\plex$ order of the corresponding Lyndon conjugates. Moreover, we propose an algorithm that efficiently sorts, according to the $\pom$ order, the list of conjugates of a multiset of Lyndon words. Finally, we show that the Lyndon factorization of a word helps the construction of its suffix array, allowing a reduction of the number of symbol comparisons needed to lexicographically sort the suffixes of the word.
Bonomo, S., Mantaci, S., Restivo, A., Rosone, G., Sciortino, M. (2013). Suffixes, Conjugates and Lyndon words. In Developments in Language Theory (pp.131-142). Springer [10.1007/978-3-642-38771-5_13].
Suffixes, Conjugates and Lyndon words
MANTACI, Sabrina;SCIORTINO, Marinella
2013-01-01
Abstract
In this paper we are interested in the study of the combinatorial aspects connecting three important constructions in the field of string algorithms: the suffix array, the Burrows-Wheeler transform (BWT) and the extended Burrows-Wheeler transform (EBWT). Such constructions involve the notions of suffixes and conjugates of words and are based on two different order relations, denoted by $\plex$ and $\pom$, that, even if strictly connected, are quite different from the computational point of view. In this study an important role is played by Lyndon words. In particular, we improve the upper bound on the number of symbol comparisons needed to establish the $\pom$ order between two primitive words by using a preliminary knowledge of the $\plex$ order of the corresponding Lyndon conjugates. Moreover, we propose an algorithm that efficiently sorts, according to the $\pom$ order, the list of conjugates of a multiset of Lyndon words. Finally, we show that the Lyndon factorization of a word helps the construction of its suffix array, allowing a reduction of the number of symbol comparisons needed to lexicographically sort the suffixes of the word.File | Dimensione | Formato | |
---|---|---|---|
BMRRS_LNCS_DLT_2013.pdf
Solo gestori archvio
Descrizione: Articolo principle
Dimensione
225.69 kB
Formato
Adobe PDF
|
225.69 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.