The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the “myriad virtues” of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class includes also the Alternating BWT, another invertible string transforms recently introduced in connection with a generalization of Lyndon words

Giancarlo R., Manzini G., Rosone G., Sciortino M. (2019). A new class of searchable and provably highly compressible string transformations. In N. Pisanti, S.P. Pissis (a cura di), Leibniz International Proceedings in Informatics, LIPIcs (pp. 1-12). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2019.12].

A new class of searchable and provably highly compressible string transformations

Giancarlo R.
;
Sciortino M.
2019-01-01

Abstract

The Burrows-Wheeler Transform is a string transformation that plays a fundamental role for the design of self-indexing compressed data structures. Over the years, researchers have successfully extended this transformation outside the domains of strings. However, efforts to find non-trivial alternatives of the original, now 25 years old, Burrows-Wheeler string transformation have met limited success. In this paper we bring new lymph to this area by introducing a whole new family of transformations that have all the “myriad virtues” of the BWT: they can be computed and inverted in linear time, they produce provably highly compressible strings, and they support linear time pattern search directly on the transformed string. This new family is a special case of a more general class of transformations based on context adaptive alphabet orderings, a concept introduced here. This more general class includes also the Alternating BWT, another invertible string transforms recently introduced in connection with a generalization of Lyndon words
2019
978-395977103-0
Giancarlo R., Manzini G., Rosone G., Sciortino M. (2019). A new class of searchable and provably highly compressible string transformations. In N. Pisanti, S.P. Pissis (a cura di), Leibniz International Proceedings in Informatics, LIPIcs (pp. 1-12). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing [10.4230/LIPIcs.CPM.2019.12].
File in questo prodotto:
File Dimensione Formato  
LIPIcs-CPM-2019-12.pdf

accesso aperto

Tipologia: Versione Editoriale
Dimensione 437.23 kB
Formato Adobe PDF
437.23 kB Adobe PDF Visualizza/Apri
CPM_2019___postprint.pdf

accesso aperto

Tipologia: Post-print
Dimensione 447.33 kB
Formato Adobe PDF
447.33 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/361353
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? ND
social impact