The extended Burrows Wheeler Transform (eBWT ) was introduced by Mantaci et al. [TCS 2007] to extend the definition of the BWT to a collection of strings. In our prior work [SPIRE 2021], we give a linear-time algorithm for the eBWT that preserves the fundamental property of the original definition (i.e., the independence from the input order). The algorithm combines a modification of the Suffix Array Induced Sorting (SAIS) algorithm [IEEE Trans Comput 2011] with Prefix Free Parsing [AMB 2019; JCB 2020]. In this paper, we show how this construction algorithm leads to r-indexing the eBWT, i.e., run-length encoded eBWT and SA samples of Gagie et al. [SODA 2018] can be constructed efficiently from the components of the PFP. Moreover, we show that finding maximal exact matches (MEMs) between a query string and the r-index of the eBWT can be efficiently supported.

Boucher C., Cenzato D., Liptak Z., Rossi M., Sciortino M. (2021). r-Indexing the eBWT. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 3-12). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-86692-1_1].

r-Indexing the eBWT

Sciortino M.
2021-01-01

Abstract

The extended Burrows Wheeler Transform (eBWT ) was introduced by Mantaci et al. [TCS 2007] to extend the definition of the BWT to a collection of strings. In our prior work [SPIRE 2021], we give a linear-time algorithm for the eBWT that preserves the fundamental property of the original definition (i.e., the independence from the input order). The algorithm combines a modification of the Suffix Array Induced Sorting (SAIS) algorithm [IEEE Trans Comput 2011] with Prefix Free Parsing [AMB 2019; JCB 2020]. In this paper, we show how this construction algorithm leads to r-indexing the eBWT, i.e., run-length encoded eBWT and SA samples of Gagie et al. [SODA 2018] can be constructed efficiently from the components of the PFP. Moreover, we show that finding maximal exact matches (MEMs) between a query string and the r-index of the eBWT can be efficiently supported.
2021
978-3-030-86691-4
978-3-030-86692-1
Boucher C., Cenzato D., Liptak Z., Rossi M., Sciortino M. (2021). r-Indexing the eBWT. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 3-12). Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-86692-1_1].
File in questo prodotto:
File Dimensione Formato  
Boucher2021_Chapter_R-IndexingTheEBWT.pdf

Solo gestori archvio

Tipologia: Versione Editoriale
Dimensione 321.12 kB
Formato Adobe PDF
321.12 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/526397
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 4
  • ???jsp.display-item.citation.isi??? ND
social impact