RESEARCH PRODUCT
An ontology-based retrieval system for mammographic reports
Salvatore Vitabile, Albert Comelli, Luca Agnello
subject
Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni (Information Processing Systems); Information retrieval; business.industry; Computer science; Ontology-based data integration; Cosine similarity; Ontology (information science); Semantics; Domain (software engineering); Tree (data structure); Text mining; Mammography Reports Information Retrieval Ontology; Semantic similarity; Ontology; Upper ontology; business
description
In the healthcare domain, it can be useful to compare unstructured free-text clinical reports in order to search for similar or relevant clinical cases. In data mining and text analysis tasks, texts are usually compared with the cosine similarity, computed as the standard document-vector cosine between the two vectors representing the pair of reports under analysis. This paper proposes a novel system based on text pre-processing techniques and modelled medical knowledge, using an improved radiological ontology. Medical terms organized in a hierarchical tree can be used to assess semantic similarity relationships between concepts in unstructured reports. The proposed retrieval system has been tested on a dataset of 126 unstructured mammographic reports written in Italian, randomly extracted from the reports available in the Radiological Information System of the University of Palermo Policlinico Hospital. The ontology is composed of 731 concepts and has been developed and enhanced in collaboration with expert breast imaging radiologists. The proposed system computes the cosine similarity over semantic vectors, with the "is-a" and "equivalent-to" relationships added to the enhanced ontology. Compared against a classical syntactic method, it shows a substantial improvement, with a sensitivity gain of +45.27%.
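To illustrate the difference between the syntactic and the semantic comparison described above, here is a minimal Python sketch. It is not the authors' implementation: the tiny ontology, the English stand-in terms (the actual reports are in Italian), the function names, and the `decay` weighting given to "is-a" ancestors are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical toy ontology; the ontology in the paper has 731 concepts.
# "is-a": child concept -> parent concept (a hierarchical tree).
IS_A = {
    "microcalcification": "calcification",
    "calcification": "finding",
    "nodule": "mass",
    "mass": "finding",
}
# "equivalent-to": synonym -> canonical concept.
EQUIVALENT_TO = {"lump": "nodule"}


def canonical(term):
    """Map a term to its canonical concept via the equivalent-to relation."""
    return EQUIVALENT_TO.get(term, term)


def ancestors(concept):
    """Return the chain of is-a ancestors, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def syntactic_vector(tokens):
    """Plain term-frequency vector (the classical syntactic baseline)."""
    return Counter(tokens)


def semantic_vector(tokens, decay=0.5):
    """Term-frequency vector enriched with ontology knowledge.

    Each token is replaced by its canonical concept, and every is-a
    ancestor receives a weight that shrinks by `decay` per level
    (an assumed weighting scheme, chosen only for this sketch).
    """
    vec = Counter()
    for tok in tokens:
        concept = canonical(tok)
        vec[concept] += 1.0
        weight = decay
        for anc in ancestors(concept):
            vec[anc] += weight
            weight *= decay
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two pre-processed reports that differ only by a synonym.
report_a = ["lump", "left", "breast"]
report_b = ["nodule", "left", "breast"]

syn = cosine(syntactic_vector(report_a), syntactic_vector(report_b))
sem = cosine(semantic_vector(report_a), semantic_vector(report_b))
print(f"syntactic: {syn:.3f}  semantic: {sem:.3f}")
```

In this sketch the syntactic vectors share only two of three terms, while the semantic vectors coincide because "lump" is mapped to "nodule" through the equivalent-to relation. The is-a expansion has a similar effect for related but non-identical concepts: two reports mentioning only "microcalcification" and "nodule" respectively have zero syntactic similarity, but a small positive semantic similarity through the shared ancestor "finding".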
year | journal | country | edition | language |
---|---|---|---|---|
2015-07-01 | 2015 IEEE Symposium on Computers and Communication (ISCC) | | | |