This paper presents an ensemble method for text classification of medical record data in the presence of multiple rare classes. Specifically, the study aims to classify clinical notes into multiple disease categories, including rare diseases. The ensemble method combines the predictions of multiple machine learning models to predict the patient's diagnosis more accurately. We trained three models, one each with a Support Vector Machine, a Random Forest, and Naive Bayes, and combined their predictions through the ensemble. The results show that the ensemble improves classification performance compared to the individual models. We evaluated the approach on a dataset of 50,000 clinical notes with multiple rare classes.
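The abstract does not state how the three models' predictions are combined. As a minimal sketch, assuming simple plurality (majority) voting over predicted labels, the combination step might look like the following. The function name `majority_vote`, the `tie_break` parameter, and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], tie_break: int = 0) -> List[str]:
    """Combine per-model label predictions by plurality vote.

    predictions[m][i] is the label that model m assigns to clinical note i.
    When no label has a strict plurality, the label from the model at index
    `tie_break` is used (an assumption made for this sketch).
    """
    n_notes = len(predictions[0])
    if any(len(p) != n_notes for p in predictions):
        raise ValueError("all models must predict the same number of notes")

    combined = []
    for i in range(n_notes):
        votes = [p[i] for p in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        # A single winner takes the vote; otherwise fall back to one model.
        combined.append(winners[0] if len(winners) == 1 else votes[tie_break])
    return combined


# Hypothetical label predictions from the three models for three notes.
svm_pred = ["diabetes", "asthma", "fabry"]
rf_pred = ["diabetes", "copd", "fabry"]
nb_pred = ["diabetes", "asthma", "asthma"]

ensemble_pred = majority_vote([svm_pred, rf_pred, nb_pred])
print(ensemble_pred)
```

With rare classes, a weighted or probability-based (soft) vote may be preferable to an unweighted one, since a model that rarely predicts a rare label can be outvoted; which scheme the authors used is not specified here.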

Alessandro Albano, Mariangela Sciandra, Antonella Plaia (2023). Ensemble method for Text Classification in medicine with multiple rare classes. In Book of abstracts and short papers 14th Scientific Meeting of the Classification and Data Analysis Group.

Ensemble method for Text Classification in medicine with multiple rare classes

Alessandro Albano; Mariangela Sciandra; Antonella Plaia
2023-01-01

2023
Sector SECS-S/01 - Statistics
9788891935632
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/611173