Institutional Research Archive of the Università degli Studi di Palermo

Giancarlo, R., Lo Bosco, G., Utro, F. (2015). Bayesian versus data driven model selection for microarray data. NATURAL COMPUTING, 14(3), 393-402 [10.1007/s11047-014-9446-5].

Bayesian versus data driven model selection for microarray data

GIANCARLO, Raffaele; LO BOSCO, Giosuè; Utro F.

2015-01-01

Abstract

Clustering is one of the best-known activities in scientific investigation and the object of research in many disciplines, ranging from Statistics to Computer Science. In this beautiful area, one of the most difficult challenges is a particular instance of the model selection problem, i.e., the identification of the correct number of clusters in a dataset. In what follows, for ease of reference, we still refer to that instance as model selection. It is an important part of any statistical analysis. The techniques used for solving it are mainly either Bayesian or data-driven, and both are based on internal knowledge. That is, they use information obtained by processing the input data. Although both techniques have been evaluated in the realm of microarray data analysis, their merits relative to each other have not been assessed. Here we fill this gap in the literature by comparing three Bayesian methods with several state-of-the-art data-driven model selection methods. Our results show that, although Bayesian methods yield good results in some cases, they cannot compete with the data-driven methods in their ability to predict the correct number of clusters in a dataset.
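
As a rough illustration of what data-driven selection of the number of clusters can look like, the Python sketch below scores several candidate values of k using only the input data and keeps the best one. Everything in it is an assumption made for illustration: the abstract does not name the methods that were compared, so k-means, the silhouette index, the function names (kmeans_1d, silhouette, choose_k) and the toy data are stand-ins, not the authors' procedure. The Bayesian side of the comparison is not sketched here.

```python
from statistics import mean


def kmeans_1d(points, k, iters=100):
    """Lloyd's k-means on 1-D data with a deterministic, quantile-based start."""
    pts = sorted(points)
    n = len(pts)
    centers = [pts[int((i + 0.5) * n / k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        # Assign each point to its nearest center.
        new_labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in pts]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each center to the mean of its members (unchanged if empty).
        for j in range(k):
            members = [p for p, lab in zip(pts, labels) if lab == j]
            if members:
                centers[j] = mean(members)
    return pts, labels


def silhouette(pts, labels):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(pts, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(pts, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(abs(p - q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(mean([abs(p - q) for q in other])
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return mean(scores)


def choose_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(*kmeans_1d(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores


# Toy data with three well-separated groups (made up for illustration).
data = [1.0, 1.1, 1.3, 4.0, 4.2, 4.3, 9.0, 9.1, 9.4]
best_k, scores = choose_k(data, range(2, 5))
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

On this toy input the sketch selects k = 3, matching the three groups built into the data. The silhouette is undefined for a single cluster, so the candidates start at k = 2.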

Date: 2015
Scientific disciplinary sector of the contribution: Settore INF/01 - Informatica (Computer Science)
Journal title: NATURAL COMPUTING
DOI of the contribution: https://dx.doi.org/10.1007/s11047-014-9446-5
Alternative URL to the publisher's: http://link.springer.com/article/10.1007/s11047-014-9446-5
Appears in types: 1.01 Journal article

Files in this item:

File	Access	Type	Size	Format
Bayesian_versus_data_driven_model_selection_for_microarray_data.pdf	Archive managers only	Publisher's version	556.92 kB	Adobe PDF
post_print_Bayesian_versus_data_driven_model_selection_for_microarray_data .pdf	Open access	Post-print	529.79 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/96557
