The Bijective Burrows-Wheeler Transform (BBWT) is a variant of the famous Burrows-Wheeler Transform (BWT) [Burrows and Wheeler, 1994]. The BBWT was introduced by Gil and Scott in 2012, and is based on the extended BWT of Mantaci et al. [TCS 2007] and on the Lyndon factorization of the input string. In the original paper, the compression achieved with the BBWT was shown to be competitive with that of the BWT, and the BBWT has been gaining interest in recent years. In this work, we present the first study of the number r_B of equal-letter runs of the BBWT, which is a measure of its compression power. We exhibit an infinite family of strings on which r_B of the string and of its reverse differ by a multiplicative factor of Θ(log n), where n is the length of the string. We also give several theoretical results on the BBWT, including a characterization of binary strings for which the BBWT has two runs. Finally, we present experimental results and statistics on r_B(s) and r_B(s^rev), as well as on the number of Lyndon factors in the Lyndon factorization of s and s^rev.
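To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how r_B(s) could be computed for a small string. It assumes the standard definition of the BBWT: factorize s into Lyndon factors, collect all rotations of all factors, sort them by comparing their infinite repetitions (omega-order), and concatenate the last characters. The function names `lyndon_factorization`, `bbwt`, and `count_runs` are illustrative, and the sort is a naive quadratic-space approach suitable only for short inputs.

```python
from itertools import groupby


def lyndon_factorization(s: str) -> list[str]:
    """Split s into non-increasing Lyndon factors using Duval's algorithm."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repeated Lyndon-word candidates.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit each full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def bbwt(s: str) -> str:
    """Naive BBWT: sort all rotations of all Lyndon factors in omega-order."""
    if not s:
        return ""
    factors = lyndon_factorization(s)
    rotations = [f[i:] + f[:i] for f in factors for i in range(len(f))]
    # Comparing prefixes of length |u| + |v| of u^omega and v^omega decides
    # their order, so a common width of twice the longest factor suffices.
    width = 2 * max(len(f) for f in factors)

    def omega_key(r: str) -> str:
        reps = -(-width // len(r))  # ceiling division
        return (r * reps)[:width]

    rotations.sort(key=omega_key)
    return "".join(r[-1] for r in rotations)


def count_runs(t: str) -> int:
    """Number of maximal equal-letter runs in t."""
    return sum(1 for _ in groupby(t))


if __name__ == "__main__":
    for s in ("banana", "banana"[::-1]):
        t = bbwt(s)
        print(s, lyndon_factorization(s), t, count_runs(t))
```

On the toy input `banana`, the Lyndon factors are `b`, `an`, `an`, `a`, while the reverse `ananab` factorizes as `an`, `an`, `ab`, so even on a six-letter string the transform and its run count can differ between a string and its reverse.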
Biagi, E., Cenzato, D., Lipták, Z., Romana, G. (2024). On the Number of Equal-Letter Runs of the Bijective Burrows-Wheeler Transform. Theoretical Computer Science. DOI: 10.1016/j.tcs.2024.115004.
On the Number of Equal-Letter Runs of the Bijective Burrows-Wheeler Transform
Romana, Giuseppe (co-first author)
2024-01-01
File | Size | Format | |
---|---|---|---|
article.pdf (access: archive managers only). Description: full text available at the following link: https://www.sciencedirect.com/science/article/pii/S0304397524006212?via=ihub. Type: Pre-print | 2.87 MB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.