On the Number of Equal-Letter Runs of the Bijective Burrows-Wheeler Transform

Romana, Giuseppe
Co-first author
2024-01-01

Abstract

The Bijective Burrows-Wheeler Transform (BBWT) is a variant of the famous BWT [Burrows and Wheeler, 1994]. The BBWT was introduced by Gil and Scott in 2012, and is based on the extended BWT of Mantaci et al. [TCS 2007] and on the Lyndon factorization of the input string. In the original paper, the compression achieved with the BBWT was shown to be competitive with that of the BWT, and the BBWT has been gaining interest in recent years. In this work, we present the first study of the number r_B of runs of the BBWT, which is a measure of its compression power. We exhibit an infinite family of strings on which r_B of the string and of its reverse differ by a multiplicative factor of Θ(log n), where n is the length of the string. We also give several theoretical results on the BBWT, including a characterization of binary strings for which the BBWT has two runs. Finally, we present experimental results and statistics on r_B(s) and r_B(s^rev), as well as on the number of Lyndon factors in the Lyndon factorization of s and s^rev.
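To make the quantities named in the abstract concrete, the following is a minimal Python sketch of one way r_B(s) could be computed for short strings. It assumes the standard textbook definition of the BBWT (Lyndon factorization, then all rotations of the factors sorted by comparing their infinite periodic repetitions, then the last character of each rotation). The function names `lyndon_factorization`, `bbwt`, and `count_runs` are hypothetical, and the naive sort with padded keys is an illustration only, not the method used in the article.

```python
from itertools import groupby


def lyndon_factorization(s: str) -> list[str]:
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # s[i:j] consists of copies of a Lyndon word of length j - k,
        # possibly followed by a proper prefix of it; emit the full copies.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def bbwt(s: str) -> str:
    """Naive BBWT sketch (assumed definition, quadratic space; small inputs only)."""
    n = len(s)
    rotations = [
        f[r:] + f[:r]
        for f in lyndon_factorization(s)
        for r in range(len(f))
    ]

    def omega_key(u: str) -> str:
        # Comparing prefixes of length 2n of the infinite repetitions u^ω, v^ω
        # is enough, since |u| + |v| <= 2n for any two rotations.
        reps = -(-2 * n // len(u))  # ceiling division
        return (u * reps)[: 2 * n]

    rotations.sort(key=omega_key)
    return "".join(u[-1] for u in rotations)


def count_runs(t: str) -> int:
    """Number of maximal equal-letter runs in t."""
    return sum(1 for _ in groupby(t))
```

Under this sketch, `bbwt("banana")` gives `"annbaa"` (4 runs), while the reverse string `"ananab"` gives `"bnnaaa"` (3 runs), a small instance of r_B(s) and r_B(s^rev) differing.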
2024
Sector INFO-01/A - Informatica (Computer Science)
Biagi, E., Cenzato, D., Lipták, Z., Romana, G. (2024). On the Number of Equal-Letter Runs of the Bijective Burrows-Wheeler Transform. THEORETICAL COMPUTER SCIENCE [10.1016/j.tcs.2024.115004].
Files in this record:
article.pdf
Access: archive managers only
Description: full text available at the following link: https://www.sciencedirect.com/science/article/pii/S0304397524006212?via=ihub
Type: Pre-print
Size: 2.87 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/665262