Label Ranking (LR) is an emerging non-standard supervised classification problem with practical applications in different research fields. The Label Ranking task aims at building preference models that learn to order a finite set of labels based on a set of predictor features. One of the most successful approaches to the LR problem uses decision tree ensemble models, such as bagging, random forest, and boosting. However, these approaches, which rely on the classical unweighted rank correlation measures, are not sensitive to label importance. Yet in many settings, failing to predict the ranking position of a highly relevant label should be considered more serious than failing to predict that of a negligible one. Moreover, an efficient classifier should be able to take into account the similarity between the elements to be ranked. The main contribution of this paper is to formulate, for the first time, a more flexible label ranking ensemble model which encodes the similarity structure and a measure of the individual label importance. Specifically, the proposed method consists of three item-weighted versions of the AdaBoost boosting algorithm for label ranking. The predictive performance of the proposal is investigated both through simulations and through applications to three real datasets.
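To illustrate what sensitivity to label importance can mean in practice, the following is a minimal sketch of an item-weighted Kendall-type distance between two rankings. It is not taken from the paper: the function name `item_weighted_kendall`, the choice of penalizing each discordant pair by the product of the two label weights, and the example weights are all assumptions made for illustration, and the sketch does not cover the similarity structure mentioned above.

```python
from itertools import combinations


def item_weighted_kendall(rank_a, rank_b, weights):
    """Sum of w[i] * w[j] over label pairs that the two rankings order oppositely.

    rank_a[i] and rank_b[i] are the positions assigned to label i (1 = top).
    The product-of-weights penalty is an illustrative assumption, not
    necessarily the definition used by the authors.
    """
    if not (len(rank_a) == len(rank_b) == len(weights)):
        raise ValueError("rankings and weights must have the same length")
    total = 0.0
    for i, j in combinations(range(len(rank_a)), 2):
        # A pair is discordant when the two rankings disagree on its order.
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0:
            total += weights[i] * weights[j]
    return total


# Hypothetical example: four labels, the first being the most important.
true_rank = [1, 2, 3, 4]
swap_top = [2, 1, 3, 4]      # error on the two most important labels
swap_bottom = [1, 2, 4, 3]   # error on the two least important labels
weights = [4.0, 3.0, 2.0, 1.0]

d_top = item_weighted_kendall(true_rank, swap_top, weights)
d_bottom = item_weighted_kendall(true_rank, swap_bottom, weights)
print(d_top, d_bottom)  # 12.0 2.0
```

With unit weights both mistakes would cost the same (one discordant pair each), which is the insensitivity the abstract attributes to unweighted measures; under the assumed weights, the error on the important labels is penalized more heavily.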
Albano, A., Sciandra, M., Plaia, A. (2022). A weighted distance-based approach with boosted decision trees for label ranking. EXPERT SYSTEMS WITH APPLICATIONS, 213 [10.1016/j.eswa.2022.119000].
A weighted distance-based approach with boosted decision trees for label ranking
Albano, Alessandro; Sciandra, Mariangela; Plaia, Antonella
2022-01-01