
RESEARCH PRODUCT

Semantic sense extraction from Wikipedia pages

Roberto Pirrone, Arianna Pipitone, Giuseppe Russo

subject

Semantic sense extraction; Knowledge acquisition; Wikipedia; Settore ING-INF/05 - Sistemi di Elaborazione delle Informazioni (Information Processing Systems); Information retrieval; Relation (database); Computer science; Ontology (information science); Intelligent tutoring system; World Wide Web; Knowledge-based systems; Knowledge base; Encyclopedia; Content management

description

This paper presents a technique for extracting structured information from unstructured Wikipedia content related to a particular topic and arranging it semantically inside an ontology. The general framework is the design of an artificial agent that can deliberate on when to extend its domain knowledge. In particular, this cognitive agent acts as the dialogue manager in an Intelligent Tutoring System (ITS) previously presented by the authors. Our approach is based on defining patterns that extract and identify novel concepts and relations to be added to the knowledge base. We propose a method that exploits the structure of the wiki page: we define different strategies for obtaining new concepts and relations according to the different parts of the page, and we also analyze the text of each section. Structure analysis allows the system to extract concepts and their general relations, while text analysis is used to determine the type of each relation to be incorporated into the domain ontology.
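The two-stage idea described above (structure analysis to find candidate concepts, then text analysis to type each relation) can be illustrated with a minimal sketch. It is not the authors' implementation: the simplified wikitext syntax, the `TYPED_PATTERNS` list, the relation names `isA`, `partOf`, and `relatedTo`, and all function names are assumptions introduced here for illustration only.

```python
import re
from typing import List, NamedTuple, Tuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Structure cues in simplified wikitext: "== Heading ==" and "[[Target|label]]".
HEADING = re.compile(r"^==+\s*(.*?)\s*==+\s*$")
LINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")

# Hypothetical lexical patterns used to type a relation; checked in order.
TYPED_PATTERNS = [
    ("partOf", re.compile(r"\bis (?:a |an )?part of\b", re.I)),
    ("isA", re.compile(r"\bis (?:a|an)\b", re.I)),
]


def split_sections(wikitext: str) -> List[Tuple[str, str]]:
    """Structure analysis: group the page text under its section headings."""
    sections = [("Introduction", [])]
    for raw in wikitext.splitlines():
        line = raw.strip()
        match = HEADING.match(line)
        if match:
            sections.append((match.group(1), []))
        elif line:
            sections[-1][1].append(line)
    return [(title, " ".join(lines)) for title, lines in sections]


def relation_type(sentence: str) -> str:
    """Text analysis: pick a relation type, or fall back to a generic one."""
    for name, pattern in TYPED_PATTERNS:
        if pattern.search(sentence):
            return name
    return "relatedTo"


def extract_relations(topic: str, wikitext: str) -> List[Relation]:
    """Link the page topic to every concept linked from the page."""
    relations = []
    for _title, text in split_sections(wikitext):
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            predicate = relation_type(sentence)
            for link in LINK.finditer(sentence):
                relations.append(Relation(topic, predicate, link.group(1).strip()))
    return relations
```

In this sketch, every linked page becomes a candidate concept related to the page topic, and the sentence surrounding the link decides whether the generic relation is refined to a typed one. A real system would need far richer patterns and would treat different parts of the page with different strategies, as the description states.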

10.1109/hsi.2010.5514514
http://hdl.handle.net/10447/76934