Search results for "WordNet"
showing 10 items of 13 documents
Sense Equivalence in plWordNet to Princeton WordNet Mapping
2019
Though interest in the use of wordnets for lexicography is (gradually) growing, no research has been conducted so far on equivalence between lexical units (or senses) in inter-linked wordnets. In this paper, we present and validate a procedure of sense-linking between plWordNet and Princeton WordNet. The proposed procedure employs a continuum of three equivalence types: strong, regular and weak, distinguished by a custom-designed set of formal, semantic and translational features. To validate the procedure, three independent samples of 120 sense pairs were manually analysed with respect to the features. The results show that synsets from the two wordnets linked by interlingual syno…
Word sense disambiguation combining conceptual distance, frequency and gloss
2004
Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…
Sub-symbolic Encoding of Words
2003
A new methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the WordNet lexical database and an ad hoc modified Sammon algorithm to associate a vector with each word in a semantic n-space. All words have been grouped according to the WordNet lexicographers’ files classification criteria: these groups have been called lexical sets. The word vector is composed of two parts: the first takes into account the word's membership in one of these lexical sets; the second is related to the meaning of the word and is responsible for distinguishing it from the other words of the same lexical set. The application of the proposed technique over all…
Wordnet and semidiscrete decomposition for sub-symbolic representation of words
2009
A methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the standard, semantically highly-structured WordNet lexical database and the SemiDiscrete matrix Decomposition to obtain a vector representation with low memory requirements in a semantic n-space. The application of the proposed algorithm over all the WordNet words would lead to a useful tool for the sub-symbolic processing of texts.
A Novel Approach to Improve the Accuracy of Web Retrieval
2010
General-purpose search engines take a very simple view of text documents: they treat them as bags of words. As a result, the semantics of documents is lost after indexing. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We use WordNet and the WordNet SenseRelate All Words software as the main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in WordNet are organized in tree hierarchies. Word meanings are represented by numbers that reference the nodes of the semantic tree. The meaning of each word in the sentence is calculated when the sentence is analyzed. The goal is to put each nou…
Mapping wordnets from the perspective of inter-lingual equivalence
2017
This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and the plWordNet (pl. Słowosieć) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesaurus. The development of wordnets for different languages has given rise to many wordnet linking projects (e.g. EuroWordNet, Vossen, 2002). Regardless of the linking method used, these projects require defining rules for establishing equivalence links between wordnet building bloc…
Humorist Bot: Bringing Computational Humour in a Chat-Bot System
2008
A conversational agent capable of having a “sense of humour” is presented. The agent can both generate humorous sentences and recognize humorous expressions introduced by the user during the dialogue. Humorist Bot makes use of well-founded techniques of computational humor and has been implemented using the ALICE framework embedded into a Yahoo! Messenger client. It also includes an avatar that changes its facial expression according to the humorous content of the dialogue.
Sub-Symbolic Knowledge Representation for Evocative Chat-Bots
2008
A sub-symbolic knowledge representation oriented to the enhancement of chat-bot interaction is proposed. The result of the technique is the introduction of a semantic sub-symbolic layer to a traditional ontology-based knowledge representation. This layer is obtained by mapping the ontology concepts into a semantic space built through the Latent Semantic Analysis (LSA) technique, and it is embedded into a conversational agent. This choice leads to a chat-bot with “evocative” capabilities whose knowledge representation framework is composed of two areas: the rational and the evocative one. As a standard ontology we have chosen the well-founded WordNet lexical dictionary, while as chat-bot the ALICE a…
Graph-based exploration and clustering analysis of semantic spaces
2019
The goal of this study is to demonstrate how network science and graph theory tools and concepts can be effectively used for exploring and comparing semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on word2vec representation of words, which is “learnt” from large text corpora (Google news, Amazon reviews), and “human built” word networks derived from the well-known lexical databases: WordNet and Moby Thesaurus. We compare “global” (e.g., degrees, distances, clustering coefficients) and “local” (e.g., most central nodes and community-type dense clusters) characteristics of considered networks. Our observations suggest that …
Development of software tools for linking Tēzaurs.lv with the Linguistic Linked Open Data cloud
2019
The software tools developed in this work allow linguists to conveniently link the Latvian lexical database Tēzaurs.lv with the WordNet lexical database in the Linguistic Linked Open Data (LLOD) cloud, adding Latvian language support to all resources already linked in the LLOD cloud. The product described in the work consists of 2 software tools: an application for creating the alignment and a WordNet browser, which is integrated into it for convenience.