In this paper we study the clustering effect of the Burrows-Wheeler Transform (BWT) from a combinatorial viewpoint. In particular, given a word w we define the BWT-clustering ratio of w as the ratio between the number of clusters produced by BWT and the number of the clusters of w. The number of clusters of a word is measured by its Run-Length Encoding. We show that the BWT-clustering ratio ranges in ]0,ÃÂ 2]. Moreover, given a rational number râ]0,2], it is possible to find infinitely many words having BWT-clustering ratio equal to r. Finally, we show how the words can be classified according to their BWT-clustering ratio. The behavior of such a parameter is studied for very well-known families of binary words.
Mantaci, S., Restivo, A., Rosone, G., Sciortino, M. (2017). Burrows-Wheeler transform and Run-Length Enconding. In Combinatorics on Words, 11th International Conference, WORDS 2017, Montréal, QC, Canada, September 11–15, 2017, Proceedings (pp. 228-239). Springer Verlag [10.1007/978-3-319-66396-8_21].
Burrows-Wheeler transform and Run-Length Enconding
Mantaci, Sabrina
;Sciortino, Marinella
2017-01-01
Abstract
In this paper we study the clustering effect of the Burrows-Wheeler Transform (BWT) from a combinatorial viewpoint. In particular, given a word w we define the BWT-clustering ratio of w as the ratio between the number of clusters produced by BWT and the number of the clusters of w. The number of clusters of a word is measured by its Run-Length Encoding. We show that the BWT-clustering ratio ranges in ]0,ÃÂ 2]. Moreover, given a rational number râ]0,2], it is possible to find infinitely many words having BWT-clustering ratio equal to r. Finally, we show how the words can be classified according to their BWT-clustering ratio. The behavior of such a parameter is studied for very well-known families of binary words.File | Dimensione | Formato | |
---|---|---|---|
LNCS WORDS 2017.pdf
Solo gestori archvio
Dimensione
474.56 kB
Formato
Adobe PDF
|
474.56 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.