
RESEARCH PRODUCT

An A* Based Semantic Tokenizer for Increasing the Performance of Semantic Applications

Arianna Pipitone, Maria Carmela Campisi, Roberto Pirrone

subject

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Information retrieval; Computer science; business.industry; Semantic search; Ontology (information science); computer.software_genre; Semantic tokenizer ontology A* tree search UIMA; Set (abstract data type); Semantic similarity; Search algorithm; Semantic computing; Semantic analytics; Artificial intelligence; business; computer; Word (computer architecture); Natural language processing

description

Semantic Applications (SAs) make use of ontologies, and their performance can depend on the syntactic labels of the modeled entities. Although several approaches have been devised to formalize ontologies, no formal approach exists for naming their constituents, whose labels often look like long concatenations of words without any explicit separation. We present a novel semantic tokenizer that finds the sub-words of such labels by applying the A* search algorithm; the A* functions rely on a set of linguistic criteria and on a meta-cognitive perspective on the activity of reading.
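To make the idea of splitting a concatenated label with A* concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not specify the linguistic criteria or the cost and heuristic functions, so everything below is an assumption made for illustration. The toy `LEXICON`, the `UNKNOWN_COST` penalty, and the `heuristic` function (a lower bound on the number of tokens still needed) are all hypothetical. Each state is a position in the label, each step consumes either a lexicon word or a single unknown character, and the search returns the cheapest segmentation.

```python
import heapq
from math import ceil

# Hypothetical toy lexicon; a real system would use a full dictionary.
LEXICON = {"has", "part", "of", "is", "a", "author", "name", "first"}
MAX_LEN = max(len(word) for word in LEXICON)
UNKNOWN_COST = 3  # assumed penalty for a character no lexicon word covers


def heuristic(remaining):
    """Lower bound on the remaining cost: every step costs at least 1
    and consumes at most MAX_LEN characters."""
    return ceil(remaining / MAX_LEN)


def tokenize(label):
    """Return the cheapest split of `label` into lexicon words
    (lowercased), falling back to single characters when needed."""
    text = label.lower()
    n = len(text)
    # Heap entries are (f = g + h, g, position, tokens so far).
    frontier = [(heuristic(n), 0, 0, ())]
    best_g = {0: 0}
    while frontier:
        _, g, pos, tokens = heapq.heappop(frontier)
        if pos == n:
            return list(tokens)
        if g > best_g.get(pos, float("inf")):
            continue  # stale entry: a cheaper path to pos was found
        for end in range(pos + 1, min(n, pos + MAX_LEN) + 1):
            piece = text[pos:end]
            if piece in LEXICON:
                step = 1
            elif end == pos + 1:
                step = UNKNOWN_COST  # single unknown character
            else:
                continue
            new_g = g + step
            if new_g < best_g.get(end, float("inf")):
                best_g[end] = new_g
                f = new_g + heuristic(n - end)
                heapq.heappush(frontier, (f, new_g, end, tokens + (piece,)))
    return []
```

With this toy lexicon, `tokenize("hasPartOf")` yields `["has", "part", "of"]`. Because every step costs at least 1 and covers at most `MAX_LEN` characters, the heuristic never overestimates the remaining cost, so the first complete segmentation popped from the queue is optimal under the assumed cost model.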

https://doi.org/10.1109/icsc.2013.75