Institutional Research Archive of the Università degli Studi di Palermo

MANTACI, S., RESTIVO, A., ROSONE, G., SCIORTINO, M. (2008). A New Combinatorial Approach to Sequence Comparison. THEORY OF COMPUTING SYSTEMS, 42(3), 411-429 [10.1007/s00224-007-9078-6].

A New Combinatorial Approach to Sequence Comparison

MANTACI, Sabrina; RESTIVO, Antonio; ROSONE, Giovanna; SCIORTINO, Marinella

2008-01-01

Abstract

In this paper we introduce a new alignment-free method for comparing sequences that is combinatorial by nature and uses neither a compressor nor any information-theoretic notion. The method is based on an extension of the Burrows-Wheeler Transform, a transformation widely used in data compression. The extended transformation takes as input a multiset of sequences and produces as output a string obtained by a suitable rearrangement of the characters of all the input sequences. Using this transformation, we give a general method for comparing sequences that takes into account how much the characters coming from the different input sequences are mixed in the output string. The method is tested on a real data set for the whole mitochondrial genome phylogeny problem. The main goal of the paper, however, is to introduce a new and general methodology for the automatic categorization of sequences.
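The abstract gives no implementation details, so the following is only a minimal sketch of the idea it describes, under stated assumptions. It assumes the extended transform sorts every cyclic rotation of every input sequence by comparing their infinite periodic repetitions and then reads off the last character of each rotation. The names `extended_bwt` and `mixing_score` are hypothetical, and the score shown (total run length minus one, over runs of characters from the same source) is an illustrative way to quantify mixing, not necessarily the measure defined in the paper.

```python
from functools import cmp_to_key
from itertools import groupby


def _omega_cmp(a, b):
    """Compare two rotations by their infinite periodic repetitions."""
    (u, _), (v, _) = a, b
    # A prefix of length len(u) + len(v) is enough to separate u^ω from v^ω
    # whenever the two infinite words differ.
    n = len(u) + len(v)
    pu = (u * (n // len(u) + 1))[:n]
    pv = (v * (n // len(v) + 1))[:n]
    return (pu > pv) - (pu < pv)


def extended_bwt(seqs):
    """Return (output string, source index of each output character)."""
    # Every cyclic rotation of every sequence, tagged with its source index.
    rotations = [
        (s[i:] + s[:i], idx)
        for idx, s in enumerate(seqs)
        for i in range(len(s))
    ]
    rotations.sort(key=cmp_to_key(_omega_cmp))
    out = "".join(rot[-1] for rot, _ in rotations)
    origin = [idx for _, idx in rotations]
    return out, origin


def mixing_score(origin):
    """Illustrative score (an assumption): sum of (run length - 1) over
    maximal runs of characters coming from the same input sequence."""
    return sum(len(list(run)) - 1 for _, run in groupby(origin))


if __name__ == "__main__":
    out, origin = extended_bwt(["ab", "aba"])
    print(out, origin, mixing_score(origin))
```

With this illustrative score, a perfectly alternating output gives 0, and long single-source runs push the value up, so lower values correspond to more thoroughly mixed sequences.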

Date	2008
Journal title	THEORY OF COMPUTING SYSTEMS
DOI	https://dx.doi.org/10.1007/s00224-007-9078-6
Publication type	1.01 Journal article

Files in this item:

File	Access	Size	Format
MRRS_TofCS_2008.pdf	Archive managers only	438.63 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/13319
