(2020). Ensemble methods for ranking data with and without position weights.

Ensemble methods for ranking data with and without position weights

BUSCEMI, Simona
2020-02-17

Abstract

The main goal of this Thesis is to build suitable ensemble methods for ranking data with weights assigned to the items' positions, for rankings both with and without ties. The Thesis begins by defining a new rank correlation coefficient that takes into account the importance of the items' positions. Inspired by the rank correlation coefficient τ_x, proposed by Emond and Mason (2002) for unweighted rankings, and by the weighted Kemeny distance proposed by García-Lapresta and Pérez-Román (2010), this work proposes τ_x^w, a new rank correlation coefficient corresponding to the weighted Kemeny distance. The new coefficient is analyzed analytically and empirically and forms the core of the consensus ranking process. Simulations and applications to real cases are presented. In a second step, in order to detect which predictors better explain a phenomenon, the Thesis proposes decision trees for ranking data with and without weights, discussing and comparing the results. A simulation study shows the impact of different weight structures on the ability of decision trees to describe the data. In the third part, ensemble methods for ranking data, specifically Bagging and Boosting, are introduced. Finally, the Thesis includes a review on a different topic. The review compares a large number of linear mixed model (LMM) selection procedures available in the literature. It addresses a pressing issue in the framework of LMMs: how to identify the best approach to adopt in a specific case. The review outlines most of the approaches found in the literature. It was my first academic training in conducting research.
Keywords: linear mixed models; bagging; boosting; ranking data; ensemble methods; weighted Kemeny distance
Files in this record:
File: ThesisBuscemiSimona.pdf
Access: open access
Description: Doctoral thesis, Simona Buscemi
Type: Pre-print
Size: 1.44 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/395373