Institutional research archive of the Università degli Studi di Palermo

The Three Steps of Clustering in the Post-Genomic Era: A Synopsis

Giancarlo, R.; Lo Bosco, G.; Pinello, L.; Utro, F.

2011-01-01

Abstract

Clustering is one of the best-known activities in scientific investigation and an object of research in many disciplines, ranging from Statistics to Computer Science. Following Handl et al., it can be summarized as a three-step process: (a) choice of a distance function; (b) choice of a clustering algorithm; (c) choice of a validation method. Although such a purist approach to clustering is rarely seen in many areas of science, genomic data require that level of attention if inferences drawn from cluster analysis are to be relevant to biomedical research. Unfortunately, the high dimensionality of the data and their noisy nature make cluster analysis of genomic data particularly difficult. This paper highlights new findings that seem to address a few relevant problems in each of the three steps, with regard both to the intrinsic predictive power of methods and algorithms and to their time performance. Including the latter aspect in the evaluation process is quite novel, since it is rarely considered in genomic data analysis.
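The three-step process named in the abstract can be sketched on toy data. This is only an illustration under stated assumptions, not the method evaluated in the paper: the Euclidean distance, average-linkage agglomerative clustering, and the mean silhouette width are stand-ins chosen for brevity, and the function names and the six two-dimensional points are hypothetical.

```python
import math
from itertools import combinations

# Step (a): distance function. Euclidean distance is an assumption made here.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Step (b): clustering algorithm. Average-linkage agglomerative clustering
# is an assumption; it merges the two closest clusters until k remain.
def average_linkage(points, k, dist):
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for (i, ci), (j, cj) in combinations(enumerate(clusters), 2):
            d = sum(dist(points[p], points[q]) for p in ci for q in cj)
            d /= len(ci) * len(cj)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is still valid
    return clusters

# Step (c): validation method. Mean silhouette width (an assumption),
# an internal index in [-1, 1]; requires at least two clusters.
def silhouette(points, clusters, dist):
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # common convention for singletons
                continue
            a = sum(dist(points[p], points[q]) for q in cluster if q != p)
            a /= len(cluster) - 1
            b = min(
                sum(dist(points[p], points[q]) for q in other) / len(other)
                for cj, other in enumerate(clusters) if cj != ci
            )
            m = max(a, b)
            scores.append(0.0 if m == 0 else (b - a) / m)
    return sum(scores) / len(scores)

# Hypothetical toy data: two well-separated groups of three points each.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]

# Run steps (b) and (c) for several candidate numbers of clusters.
results = {}
for k in (2, 3):
    found = average_linkage(data, k, euclidean)
    results[k] = (found, silhouette(data, found, euclidean))

# Pick the number of clusters with the highest validation score.
best_k = max(results, key=lambda k: results[k][1])
print(best_k, results[best_k][0])
```

Swapping any of the three functions (for example, a correlation-based distance in step (a)) leaves the rest of the pipeline unchanged, which is the sense in which the three choices are separate steps.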

Date: 2011
ISBN of the monograph: 978-3-642-21945-0
DOI of the contribution: https://dx.doi.org/10.1007/978-3-642-21946-7_2
Publisher URL (open access where possible): https://link.springer.com/chapter/10.1007/978-3-642-21946-7_2
Citation: Giancarlo, R., Lo Bosco, G., Pinello, L., Utro, F. (2011). The Three Steps of Clustering in the Post-Genomic Era: A Synopsis. In R. Rizzo, P. Lisboa (Eds.), Computational Intelligence Methods for Bioinformatics and Biostatistics, 7th International Meeting, CIBB 2010, Palermo, Italy, September 16-18, 2010, Revised Selected Papers (pp. 13-30). Heidelberger Platz 3, D-14197 Berlin, Germany: Springer-Verlag Berlin [10.1007/978-3-642-21946-7_2].
Appears in categories: 2.07 Conference proceedings contribution published in a volume

Files in this item:

File	Access	Type	Size	Format
Giancarlo et al. - 2011 -The Three Steps of Clustering in the Post-Genomic Era A Synopsis.pdf	Archive administrators only	Publisher's version	409.03 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/618378
