This paper presents a technique for extracting structured information from unstructured Wikipedia content related to a particular topic and arranging it semantically within an ontology. The general framework is the design of an artificial agent that can deliberate when expanding its domain knowledge. In particular, this cognitive agent acts as the dialogue manager in an Intelligent Tutoring System (ITS) previously presented by the authors. Our approach is based on the definition of patterns that extract and identify novel concepts and relations to be added to the knowledge base. We propose a method that exploits the structure of the wiki page, defining different strategies to obtain new concepts and relations according to the different parts of the page. The text in each section of a page is also processed. Structure analysis allows the system to extract concepts and their general relations, while text analysis is used to determine the type of each relation to be incorporated in the domain ontology.
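The two-stage idea in the abstract (structure analysis to find candidate concepts and generic relations, then text analysis to type each relation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the authors' patterns, so the regular expressions, the `Relation` tuple, the predicate names (`isA`, `partOf`, `relatedTo`), and the `"lead"` section label are all hypothetical.

```python
import re
from typing import List, NamedTuple


class Relation(NamedTuple):
    """Hypothetical triple plus the page section it was found in."""
    subject: str
    predicate: str
    obj: str
    section: str


# Structure patterns: wikitext section headings and internal links.
HEADING = re.compile(r"^==+\s*(.*?)\s*==+\s*$")
LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

# Text patterns (illustrative only): lexical cues that suggest a relation type.
# More specific cues come first so that "is part of" is not read as "is a".
TYPE_CUES = [
    (re.compile(r"\b(?:is|are) part of\b", re.I), "partOf"),
    (re.compile(r"\b(?:is|are) an?\b", re.I), "isA"),
]


def classify(context: str) -> str:
    """Text analysis: pick a relation type from the words preceding a link."""
    for pattern, predicate in TYPE_CUES:
        if pattern.search(context):
            return predicate
    return "relatedTo"  # generic relation when no cue is found


def extract_relations(title: str, wikitext: str) -> List[Relation]:
    """Structure analysis: every internal link yields a candidate concept
    related to the page topic; the enclosing section is recorded as well."""
    relations = []
    section = "lead"  # assumed label for text before the first heading
    for line in wikitext.splitlines():
        heading = HEADING.match(line)
        if heading:
            section = heading.group(1)
            continue
        prev_end = 0
        for link in LINK.finditer(line):
            concept = link.group(1).strip()
            # Only the text between the previous link and this one is used
            # as context for typing the relation.
            context = line[prev_end:link.start()]
            prev_end = link.end()
            relations.append(Relation(title, classify(context), concept, section))
    return relations


SAMPLE = """A neural network is a [[computational model]] inspired by biology.
== Components ==
The network is part of a larger [[learning system]].
== See also ==
* [[Machine learning]]
"""

if __name__ == "__main__":
    for relation in extract_relations("Neural network", SAMPLE):
        print(relation)
```

In this sketch the page structure alone is enough to propose that a linked concept is related to the page topic, while the surrounding text only refines the relation's type. A real system would need far richer patterns than these two cues.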
Pirrone, R., Pipitone, A., Russo, G. (2010). Semantic sense extraction from Wikipedia pages. In 3rd International Conference on Human System Interaction (pp. 543-547). Los Alamitos, CA: IEEE Computer Society [10.1109/HSI.2010.5514514].
Semantic sense extraction from Wikipedia pages
PIRRONE, Roberto; PIPITONE, Arianna; RUSSO, Giuseppe
2010-01-01