T100: A modern classic ensemble to profile irony and stereotype spreaders

Siino M.; Tinnirello I.; La Cascia M.
2022-09-01

Abstract

In this work we propose a novel ensemble model that combines deep learning and non-deep learning classifiers. Our team developed the model to participate in the Profiling Irony and Stereotype Spreaders (ISSs) task hosted at PAN@CLEF2022. The ensemble, named T100, includes a Logistic Regressor (LR) that classifies an author as an ISS or not (nISS) based on the predictions of a first stage of classifiers. These first-stage classifiers (the voters) are a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), a Decision Tree (DT) and a Naive Bayes (NB) classifier, all of which are able to reach state-of-the-art results on several text classification tasks. The voters are trained on the provided dataset and then generate predictions on the training set; the LR is then trained on these predictions. In the simulation phase, the LR takes the voters' predictions on the unlabelled test set and provides its final prediction for each sample. To develop and test our model we used 5-fold cross-validation on the labelled training set. Over the five validation splits, the proposed model achieves a maximum accuracy of 0.9342 and an average accuracy of 0.9158. As announced by the task organizers, the trained model presented here reaches an accuracy of 0.9444 on the unlabelled test set provided for the task.
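
The two-stage flow described in the abstract (voters first, then an LR trained on their predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are hand-made, the four voters are hypothetical threshold rules standing in for the trained CNN, SVM, DT and NB, and the logistic regression is a small gradient-descent version written only for illustration. All names (TRAIN, VOTERS, collect_votes, fit_logistic_regression, predict) are assumptions.

```python
import math

# Toy, hand-made data: each author is reduced to two hypothetical numeric features.
# Label 1 stands for ISS and 0 for nISS; none of these values come from the paper.
TRAIN = [
    ([0.9, 0.8], 1), ([0.8, 0.6], 1), ([0.7, 0.9], 1), ([0.6, 0.2], 1),
    ([0.4, 0.7], 0), ([0.3, 0.1], 0), ([0.2, 0.4], 0), ([0.1, 0.9], 0),
]


def make_threshold_voter(index, threshold):
    """Hypothetical stand-in for a trained voter (CNN, SVM, DT or NB)."""
    return lambda x: 1 if x[index] > threshold else 0


# Four toy voters of varying quality, one per first-stage classifier.
VOTERS = [
    make_threshold_voter(0, 0.5),
    make_threshold_voter(0, 0.35),
    make_threshold_voter(1, 0.5),
    make_threshold_voter(1, 0.75),
]


def collect_votes(samples):
    """First stage: one row of 0/1 votes per sample, one column per voter."""
    return [[vote(x) for vote in VOTERS] for x in samples]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(rows, labels, learning_rate=0.5, epochs=500):
    """Second stage: fit weights and bias by batch gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(rows, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, row)) + bias) - label
            for j, v in enumerate(row):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, votes):
    """Final decision of the meta-classifier for one row of voter predictions."""
    score = sigmoid(sum(w * v for w, v in zip(weights, votes)) + bias)
    return 1 if score >= 0.5 else 0


# The voters predict on the training set, then the LR is trained on those votes.
train_votes = collect_votes([x for x, _ in TRAIN])
train_labels = [y for _, y in TRAIN]
weights, bias = fit_logistic_regression(train_votes, train_labels)

# Test phase: the voters predict on unseen samples and the LR makes the final call.
unseen = [[0.95, 0.95], [0.05, 0.05]]
final_predictions = [predict(weights, bias, row) for row in collect_votes(unseen)]
print(final_predictions)
```

Training the meta-classifier on the voters' outputs lets it learn how much to trust each voter instead of weighting them equally, as plain majority voting would.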
Sep-2022
Sector ING-INF/05 - Information Processing Systems
Sector ING-INF/03 - Telecommunications
Siino M., Tinnirello I., La Cascia M. (2022). T100: A modern classic ensemble to profile irony and stereotype spreaders. In Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum (pp. 2666-2674). CEUR-WS.
Files in this item:
File: paper-221.pdf
Access: open access
Type: Publisher's version
Size: 1.15 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/567982