Institutional research archive of the Università degli Studi di Palermo

Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort

Hassoun S.; Bruckmann C.; Ciardullo S.; Perseghin G.; Di Gaudio F.; Broccolo F.

2022-01-01

Abstract

Background: The progress of digital transformation in clinical practice opens the door to shifting liver disease diagnosis from a late-stage approach to an early-stage one. Early diagnosis of liver fibrosis can prevent progression of the disease and decrease liver-related morbidity and mortality. Here we developed a machine learning (ML) algorithm based on standard parameters that can identify liver fibrosis in the general US population.

Materials and methods: Starting from a public database (National Health and Nutrition Examination Survey, NHANES) representative of the American population, with 7265 eligible subjects (control population n = 6828, with Fibroscan values E < 9.7 kPa; target population n = 437, with Fibroscan values E >= 9.7 kPa), we set up a support vector machine (SVM) algorithm able to discriminate individuals with liver fibrosis within the general US population. The algorithm setup involved the removal of missing data and a sampling optimization step to manage the data imbalance (the target population makes up only ~5% of the dataset).

Results: For feature selection, we performed an unbiased analysis starting from 33 clinical, anthropometric, and biochemical parameters, regardless of their previous application as biomarkers of liver disease. Through principal component analysis (PCA), we identified the 26 most significant features and then used them to set up a sampling method on an SVM algorithm. The best sampling technique for managing the data imbalance was found to be oversampling with SMOTE-NC. For final model validation, we used a subset of 300 individuals (150 with liver fibrosis and 150 controls), held out from the main dataset prior to sampling. Performance was evaluated over multiple independent runs.

Conclusions: We provide proof of concept of an ML clinical decision support tool for liver fibrosis diagnosis in the general US population. Although the presented ML model is at this stage only a prototype, in the future it might be implemented and potentially applied to plan broad screenings for liver fibrosis.
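The abstract notes that the 300-subject validation set was taken out of the main dataset before any resampling. The following is a minimal sketch of that ordering only, not the authors' code: it uses the cohort sizes and the 9.7 kPa cut-off stated above, but the records are random placeholders rather than NHANES data, and every name in it (`make_synthetic_cohort`, `split_holdout`, `stiffness_kpa`) is hypothetical. It does not implement SMOTE-NC, PCA, or the SVM; under this reading, oversampling would be applied afterwards to `training` alone so that no synthetic sample can leak into validation.

```python
import random

THRESHOLD_KPA = 9.7               # Fibroscan cut-off stated in the abstract
N_CONTROL, N_TARGET = 6828, 437   # cohort sizes stated in the abstract
HOLDOUT_PER_CLASS = 150           # validation subjects per class (abstract)


def make_synthetic_cohort(rng):
    """Build placeholder records; stiffness values are random, NOT NHANES data."""
    controls = [{"id": i, "stiffness_kpa": rng.uniform(2.0, 9.69)}
                for i in range(N_CONTROL)]
    targets = [{"id": N_CONTROL + i, "stiffness_kpa": rng.uniform(9.7, 75.0)}
               for i in range(N_TARGET)]
    cohort = controls + targets
    rng.shuffle(cohort)
    return cohort


def label(record):
    """Return 1 for the target class (E >= 9.7 kPa), 0 for controls."""
    return 1 if record["stiffness_kpa"] >= THRESHOLD_KPA else 0


def split_holdout(cohort, per_class, rng):
    """Set aside a balanced validation set before any resampling is applied."""
    by_class = {0: [], 1: []}
    for record in cohort:
        by_class[label(record)].append(record)
    training, validation = [], []
    for records in by_class.values():
        rng.shuffle(records)
        validation.extend(records[:per_class])
        training.extend(records[per_class:])
    return training, validation


rng = random.Random(0)  # fixed seed so the split is reproducible
cohort = make_synthetic_cohort(rng)
training, validation = split_holdout(cohort, HOLDOUT_PER_CLASS, rng)

train_targets = sum(label(r) for r in training)
train_controls = len(training) - train_targets
print(f"validation: {len(validation)} subjects")
print(f"training: {train_controls} controls, {train_targets} targets "
      f"({100 * train_targets / len(training):.1f}% target)")
```

With the stated cohort sizes, holding out 150 subjects per class leaves 6678 controls and 287 target subjects for training (about 4.1% target), which is the imbalance the sampling step would then have to address.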

Date: 2022
Journal title: INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS
DOI: https://dx.doi.org/10.1016/j.ijmedinf.2022.104932
Citation: Hassoun S., Bruckmann C., Ciardullo S., Perseghin G., Di Gaudio F., Broccolo F. (2022). Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 170 [10.1016/j.ijmedinf.2022.104932].
Publication type: 1.01 Journal article

Files in this item:

File	Size	Format
1-s2.0-S1386505622002465.pdf (publisher's version; archive managers only)	1.53 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/584194
