In the healthcare domain, comparing unstructured free-text clinical reports makes it possible to search for similar and/or relevant clinical cases. In data mining and text analysis tasks, texts are usually compared with cosine similarity, computed as the standard document-vector cosine similarity between the two vectors representing the report pair under analysis. This paper proposes a novel system based on text pre-processing techniques and modelled medical knowledge, using an improved radiological ontology. Medical terms organized in a hierarchical tree make it possible to assess semantic similarity relationships between the concepts in unstructured reports. The proposed retrieval system was tested on a dataset of 126 unstructured mammographic reports written in Italian, randomly extracted from the reports available in the Radiological Information System of the University of Palermo Policlinico Hospital. The ontology comprises 731 concepts and was developed and enhanced in collaboration with expert breast imaging radiologists. The proposed system computes the cosine similarity over semantic vectors, adding the "is-a" and "equivalent-to" relationships to the enhanced ontology. Compared with a classical syntactic method, it shows a marked improvement, with a Sensitivity rise of +45.27%.
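The contrast between syntactic and semantic cosine similarity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny `IS_A` and `EQUIVALENT_TO` dictionaries, the English terms, and the choice to give every ancestor concept a weight of 1 are assumptions made purely for illustration, and the paper's 731-concept Italian ontology is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy ontology (assumption): child -> parent "is-a" links
# and synonym -> canonical concept "equivalent-to" links.
IS_A = {
    "microcalcification": "calcification",
    "calcification": "finding",
    "nodule": "finding",
}
EQUIVALENT_TO = {"lump": "nodule"}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def syntactic_vector(tokens):
    """Plain bag-of-words vector: one dimension per surface term."""
    return Counter(tokens)


def semantic_vector(tokens):
    """Bag-of-concepts vector: map synonyms to a canonical concept,
    then also count every ancestor reached through "is-a" links."""
    vec = Counter()
    for tok in tokens:
        concept = EQUIVALENT_TO.get(tok, tok)
        vec[concept] += 1
        parent = IS_A.get(concept)
        while parent is not None:  # walk up the hierarchy (assumed acyclic)
            vec[parent] += 1       # assumed ancestor weight of 1
            parent = IS_A.get(parent)
    return vec


report_a = ["lump", "left", "breast"]
report_b = ["nodule", "left", "breast"]

syntactic_score = cosine(syntactic_vector(report_a), syntactic_vector(report_b))
semantic_score = cosine(semantic_vector(report_a), semantic_vector(report_b))
print(f"syntactic: {syntactic_score:.3f}  semantic: {semantic_score:.3f}")
```

In this toy case the two reports differ only by a synonym, so the semantic score is higher than the syntactic one. How ancestor concepts should be weighted in practice is a design choice that this sketch does not settle.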
Comelli, A., Agnello, L., Vitabile, S. (2015). An Ontology-Based Retrieval System for Mammographic Reports. In Proceedings of The Twentieth IEEE Symposium on Computers and Communications (ISCC 2015). IEEE [10.1109/ISCC.2015.7405644].
An Ontology-Based Retrieval System for Mammographic Reports
COMELLI, Albert;AGNELLO, Luca;VITABILE, Salvatore
2015-01-01