Search results for "wordnet"
Showing 10 of 13 documents
A Novel Approach to Improve the Accuracy of Web Retrieval
2010
General-purpose search engines take a very simple view of text documents: they treat them as bags of words. As a result, the semantics of documents is lost after indexing. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We use WordNet and the WordNet SenseRelate All Words software as the main tools to preserve the semantics of the sentences in documents and user queries. Nouns and verbs in WordNet are organized in tree hierarchies. Word meanings are represented by numbers that reference nodes in the semantic tree. The meaning of each word in a sentence is determined when the sentence is analyzed. The goal is to put each nou…
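As a rough, hedged illustration of the hierarchy and sense-numbering idea described in this abstract (not the paper's own code, which relies on WordNet SenseRelate All Words), the following Python sketch uses NLTK's WordNet interface; the choice of NLTK and the example word "bank" are assumptions.

    # Sketch: list the noun senses of a word and walk one hypernym path with NLTK.
    # Illustrative only; the paper uses WordNet SenseRelate All Words, not NLTK.
    import nltk
    nltk.download("wordnet", quiet=True)
    from nltk.corpus import wordnet as wn

    word = "bank"  # example word, not taken from the paper
    for syn in wn.synsets(word, pos=wn.NOUN):
        # each sense is identified by a synset name and a numeric offset
        print(syn.name(), syn.offset(), "-", syn.definition())

    # one path from the root of the noun hierarchy down to the first sense
    first = wn.synsets(word, pos=wn.NOUN)[0]
    print(" -> ".join(s.name() for s in first.hypernym_paths()[0]))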
Mapping wordnets from the perspective of inter-lingual equivalence
2017
This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and plWordNet (pl. Słowosieć) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesaurus. The development of wordnets for different languages has given rise to many wordnet-linking projects (e.g. EuroWordNet; Vossen, 2002). Regardless of the linking method used, these projects require defining rules for establishing equivalence links between wordnet building bloc…
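For readers who want to explore inter-lingual links from Python, the hedged sketch below uses NLTK's Open Multilingual Wordnet data, which attaches lemmas in other languages to Princeton WordNet synsets; this is only a convenience interface, not the plWordNet-to-PWN mapping discussed in the paper, and the availability of Polish ("pol") data in the installed corpora is an assumption.

    # Sketch: cross-lingual lookup via NLTK's Open Multilingual Wordnet (OMW).
    # Not the plWordNet-PWN equivalence links from the paper; just an illustration.
    import nltk
    nltk.download("wordnet", quiet=True)
    nltk.download("omw-1.4", quiet=True)  # multilingual lemma data (assumed to include "pol")
    from nltk.corpus import wordnet as wn

    if "pol" in wn.langs():
        dog = wn.synset("dog.n.01")
        print(dog.lemma_names("pol"))          # Polish lemmas attached to the English synset
        print(wn.synsets("pies", lang="pol"))  # and the reverse direction
    else:
        print("Polish data not available in this OMW installation")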
Towards Equivalence Links between Senses in PlWordNet and Princeton WordNet
2017
The paper focuses on the issue of creating equivalence links in the domain of bilingual computational lexicography. The existing interlingual links between plWordNet and Princeton WordNet synsets (sets of synonymous lexical units, i.e. lemma and sense pairs) are re-analysed from the perspective of equivalence types as defined in traditional lexicography and translation. Special attention is paid to cognitive and translational equivalents. A proposal for mapping lexical units is presented. Three types of links are defined: super-strong equivalence, strong equivalence and weak implied equivalence. The strong equivalences have a common set of formal, semantic and usage features, with some o…
VEBO: Validation of E-R diagrams through ontologies and WordNet
2012
In the semantic web vision, ontologies are building blocks that provide applications with a high-level description of the operating environment in support of interoperability and semantic capabilities. The importance of ontologies in this respect is clearly stated in many works. Another crucial way to increase the semantic richness of the Web is to enrich the expressivity of database-related data. Nowadays, databases are the primary source of information for dynamic web sites. The linguistic data used to build the database structure could be relevant for extracting meaningful information. In most cases, however, this type of information is not used for information retrieval. The work present…
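A hypothetical sketch of the general idea of checking schema terms against WordNet (this is not the VEBO system itself): it flags entity and attribute names that have no WordNet sense, one plausible lexical check when validating an E-R diagram; the element names below are invented for illustration.

    # Hypothetical sketch: flag E-R element names with no WordNet sense.
    # Not the VEBO tool; it only illustrates using WordNet as a lexical check.
    from nltk.corpus import wordnet as wn

    er_elements = ["customer", "order", "invoyce", "ship_date"]  # invented example names
    for name in er_elements:
        senses = wn.synsets(name.lower())
        status = "ok" if senses else "no WordNet sense found - review the term"
        print(f"{name}: {status}")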
Sub-symbolic Encoding of Words
2003
A new methodology for the sub-symbolic semantic encoding of words is presented. The methodology uses the WordNet lexical database and an ad hoc modified Sammon algorithm to associate a vector with each word in a semantic n-space. All words are grouped according to the WordNet lexicographers' files classification criteria; these groups are called lexical sets. The word vector is composed of two parts: the first accounts for the word's membership in one of these lexical sets; the second relates to the word's meaning and is responsible for distinguishing the word from the other words in the same lexical set. The application of the proposed technique over all…
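The lexicographer-file grouping mentioned in this abstract can be inspected directly; the sketch below is only an illustration of those "lexical sets" via NLTK's lexname() and does not implement the paper's modified Sammon mapping; the example words are assumptions.

    # Sketch: group senses by WordNet lexicographer file ("lexical set" in the paper's terms).
    from collections import defaultdict
    from nltk.corpus import wordnet as wn

    groups = defaultdict(list)
    for word in ["cat", "run", "idea"]:  # example words, not from the paper
        for syn in wn.synsets(word):
            groups[syn.lexname()].append(syn.name())  # e.g. 'noun.animal', 'verb.motion'

    for lexset, members in sorted(groups.items()):
        print(lexset, "->", members)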
Word sense disambiguation combining conceptual distance, frequency and gloss
2004
Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity that relies on the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…
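To make the conceptual-distance intuition concrete without reproducing the paper's parameterised conceptual density formula, the hedged sketch below scores each candidate noun sense by its WordNet path similarity to the senses of surrounding context nouns and picks the best one; this simple stand-in is an assumption, not the Agirre-Rigau measure or its generalisation.

    # Sketch: pick the noun sense closest (by WordNet path similarity) to the context.
    # A simplified stand-in for conceptual density, not the formula from the paper.
    from nltk.corpus import wordnet as wn

    def disambiguate(target, context_nouns):
        best_sense, best_score = None, -1.0
        for sense in wn.synsets(target, pos=wn.NOUN):
            score = 0.0
            for ctx in context_nouns:
                sims = [sense.path_similarity(c) or 0.0
                        for c in wn.synsets(ctx, pos=wn.NOUN)]
                score += max(sims, default=0.0)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense

    print(disambiguate("bank", ["money", "loan", "deposit"]))  # example context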
Sub-Symbolic Knowledge Representation for Evocative Chat-Bots
2008
A sub-symbolic knowledge representation oriented to enhancing chat-bot interaction is proposed. The technique introduces a semantic sub-symbolic layer on top of a traditional ontology-based knowledge representation. This layer is obtained by mapping the ontology concepts into a semantic space built through the Latent Semantic Analysis (LSA) technique, and it is embedded into a conversational agent. This choice leads to a chat-bot with “evocative” capabilities whose knowledge representation framework is composed of two areas: the rational one and the evocative one. As a standard ontology we have chosen the well-founded WordNet lexical dictionary, while as chat-bot the ALICE a…
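As a rough sketch of the LSA step described here (not the paper's chat-bot framework), the code below builds a tiny semantic space with TF-IDF plus truncated SVD and projects two concept-style glosses into it for comparison; the toy corpus, glosses and dimensionality are invented.

    # Sketch: a tiny LSA semantic space with scikit-learn (toy corpus, illustrative only).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    corpus = [  # invented documents
        "dogs are loyal domestic animals",
        "cats are independent domestic animals",
        "stocks and bonds are financial instruments",
        "banks offer loans and savings accounts",
    ]
    tfidf = TfidfVectorizer()
    X = tfidf.fit_transform(corpus)
    lsa = TruncatedSVD(n_components=2, random_state=0)  # low-rank semantic space
    space = lsa.fit_transform(X)

    # project two ontology-style concept glosses into the space and compare them to the documents
    glosses = tfidf.transform(["domestic animal kept as a pet",
                               "institution that handles money"])
    vecs = lsa.transform(glosses)
    print(cosine_similarity(vecs, space))  # one row of similarities per gloss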
Graph-based exploration and clustering analysis of semantic spaces
2019
The goal of this study is to demonstrate how network science and graph theory tools and concepts can be effectively used for exploring and comparing semantic spaces of word embeddings and lexical databases. Specifically, we construct semantic networks based on the word2vec representation of words, which is “learnt” from large text corpora (Google News, Amazon reviews), and “human-built” word networks derived from the well-known lexical databases WordNet and Moby Thesaurus. We compare “global” (e.g., degrees, distances, clustering coefficients) and “local” (e.g., most central nodes and community-type dense clusters) characteristics of the considered networks. Our observations suggest that …
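A minimal sketch of the WordNet side of such a comparison (not the study's code): it builds an undirected graph from noun hypernym relations with networkx and reports a few of the "global" characteristics mentioned above; restricting the graph to a small subtree is an assumption made to keep the example fast.

    # Sketch: a small WordNet-derived semantic network and a few global statistics.
    # Illustrative only; the study compares full WordNet/Moby/word2vec networks.
    import networkx as nx
    from nltk.corpus import wordnet as wn

    G = nx.Graph()
    root = wn.synset("mammal.n.01")  # small subtree, chosen only for speed
    for syn in root.closure(lambda s: s.hyponyms()):
        for hyper in syn.hypernyms():
            G.add_edge(syn.name(), hyper.name())

    print("nodes:", G.number_of_nodes(), "edges:", G.number_of_edges())
    print("average clustering:", nx.average_clustering(G))
    top = sorted(G.degree, key=lambda kv: kv[1], reverse=True)[:5]
    print("highest-degree nodes:", top)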
Multi-word Lexical Units Recognition in WordNet
2022
WordNet is a state-of-the-art lexical resource used in many Natural Language Processing tasks, including multi-word expression (MWE) recognition. However, not all MWEs recorded in WordNet can indisputably be called lexicalised. Some of them are semantically compositional and show no signs of idiosyncrasy. This state of affairs affects all evaluation measures that use the list of all WordNet MWEs as a gold standard. We propose a method for distinguishing between lexicalised and non-lexicalised word combinations in WordNet, taking into account lexicality features such as semantic compositionality, MWE length and a translational criterion. Both a rule-based approach and a ridge logistic regre…
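To show what the MWE inventory in question looks like (this is not the paper's classifier or feature set), the sketch below lists a few multi-word lemmas recorded in WordNet via NLTK; as a pointer only, a ridge logistic regression corresponds to an L2-penalised logistic regression, e.g. scikit-learn's LogisticRegression with its default "l2" penalty.

    # Sketch: enumerate a few multi-word lemmas recorded in WordNet.
    # The paper's lexicality features and classifier are not reproduced here.
    from itertools import islice
    from nltk.corpus import wordnet as wn

    def multiword_lemmas():
        for syn in wn.all_synsets():
            for lemma in syn.lemma_names():
                if "_" in lemma:  # multi-word entries use underscores
                    yield lemma, syn.name()

    for lemma, synset in islice(multiword_lemmas(), 10):
        print(lemma, "->", synset)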
Sense Equivalence in plWordNet to Princeton WordNet Mapping
2019
Though interest in the use of wordnets for lexicography is (gradually) growing, no research has so far been conducted on equivalence between lexical units (or senses) in inter-linked wordnets. In this paper, we present and validate a procedure for sense-linking between plWordNet and Princeton WordNet. The proposed procedure employs a continuum of three equivalence types: strong, regular and weak, distinguished by a custom-designed set of formal, semantic and translational features. To validate the procedure, three independent samples of 120 sense pairs were manually analysed with respect to the features. The results show that synsets from the two wordnets linked by interlingual syno…
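A hypothetical sketch of how a continuum of link types could be derived from binary features, loosely following the strong/regular/weak distinction described above; the feature names and rules below are invented for illustration and are not the paper's custom-designed feature set.

    # Hypothetical sketch: assign an equivalence type from a few binary features.
    # Feature names and rules are invented; the paper defines its own feature set.
    def equivalence_type(same_pos, same_register, gloss_overlap, mutual_translation):
        if all([same_pos, same_register, gloss_overlap, mutual_translation]):
            return "strong"
        if same_pos and mutual_translation:
            return "regular"
        return "weak"

    print(equivalence_type(True, True, True, True))    # -> strong
    print(equivalence_type(True, False, False, True))  # -> regular
    print(equivalence_type(False, False, True, False)) # -> weak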