Archivio istituzionale della ricerca dell'Università degli Studi di Palermo

The Longest Common Factor (LCF) of a set of strings is a well studied problem having a wide range of applications in Bioinformatics: from microarrays to DNA sequences analysis. This problem has been solved by Hui (2000) who uses a famous constant-time solution to the Lowest Common Ancestor (LCA) problem in trees coupled with use of suffix trees. A data structure for the LCA problem, although linear in space and construction time, introduces a multiplicative constant in both space and time that reduces the range of applications in many biological applications. In this article we present a new method for solving the LCF problem using the suffix tree structure with an auxiliary array that take space O(n). Our algorithm works in time O(nlog a), where n is the total input size and a is the multiplicative constant introduced by the alphabet. a is the size of the alphabet. We also consider a different version of our algorithm that applies to DAWGs. In this case, we prove that the algorithm works in both time and space proportional to data DAWG’s size.

CROCHEMORE M, GABRIELE A, MIGNOSI F, PESARESI M (2008). On the Longest Common Factor Problem. In IFIP International Federation for Information Processing (pp.143-155). Springer Boston [10.1007/978-0-387-09680-3_10].

On the Longest Common Factor Problem

CROCHEMORE M;GABRIELE, Alessandra;MIGNOSI, Filippo;PESARESI M.

2008-01-01

Abstract

The Longest Common Factor (LCF) of a set of strings is a well studied problem having a wide range of applications in Bioinformatics: from microarrays to DNA sequences analysis. This problem has been solved by Hui (2000) who uses a famous constant-time solution to the Lowest Common Ancestor (LCA) problem in trees coupled with use of suffix trees. A data structure for the LCA problem, although linear in space and construction time, introduces a multiplicative constant in both space and time that reduces the range of applications in many biological applications. In this article we present a new method for solving the LCF problem using the suffix tree structure with an auxiliary array that take space O(n). Our algorithm works in time O(nlog a), where n is the total input size and a is the multiplicative constant introduced by the alphabet. a is the size of the alphabet. We also consider a different version of our algorithm that applies to DAWGs. In this case, we prove that the algorithm works in both time and space proportional to data DAWG’s size.

Scheda breve

Scheda completa

Scheda completa (DC)

	Nome del convegno 
DATO PREVISTO SU LOGINMIUR
	
				Fifth Ifip International Conference On Theoretical Computer Science – Tcs 2008
			
	Data
	
				2008
			
	Numero complessivo di pagine 
DATO PREVISTO SU LOGINMIUR
	
				13
			
	DOI dell'intervento 
DATO PREVISTO SU LOGINMIUR
	
				https://dx.doi.org/10.1007/978-0-387-09680-3_10
			
	Citazione
	
				CROCHEMORE M,  GABRIELE A,  MIGNOSI F,  PESARESI M (2008). On the Longest Common Factor Problem. In IFIP International Federation for Information Processing (pp.143-155). Springer Boston [10.1007/978-0-387-09680-3_10].
			
	Tipologia
	
				Proceedings (atti dei congressi)
			
	Tutti gli autori
	
						CROCHEMORE M; GABRIELE A; MIGNOSI F; PESARESI M
					
	Appare nelle tipologie:
	
				Proceedings

File in questo prodotto:

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/39788

Citazioni

ND

0

0

social impact