Search results for "SemEval"
showing 8 items of 8 documents
Word sense disambiguation combining conceptual distance, frequency and gloss
2004
Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the use of the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…
A practical solution to the problem of automatic word sense induction
2004
Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word's senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation…
RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy
2016
Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the error patterns frequency observed in the scored AMR graphs. This first extension results in 4% gain over the state-of-art CAMR baseline parser by adding to it a manually crafted wrapper fixing the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields further 0.4% gain when applied to outputs of two nondeterministic…
RIGOTRIO at SemEval-2017 Task 9: Combining Machine Learning and Grammar Engineering for AMR Parsing and Generation
2017
By addressing both text-to-AMR parsing and AMR-to-text generation, SemEval-2017 Task 9 established AMR as a powerful semantic interlingua. We strengthen the interlingual aspect of AMR by applying the multilingual Grammatical Framework (GF) for AMR-to-text generation. Our current rule-based GF approach completely covered only 12.3% of the test AMRs, therefore we combined it with state-of-the-art JAMR Generator to see if the combination increases or decreases the overall performance. The combined system achieved the automatic BLEU score of 18.82 and the human Trueskill score of 107.2, to be compared to the plain JAMR Generator results. As for AMR parsing, we added NER extensions to our SemEva…
Extraction of Medical Terms for Word Sense Disambiguation within Multilingual Framework
2016
All languages belonging to the same language family share a number of common characteristics, called language pair phenomena, which can be quite useful when processing them for multilingual purposes such as translation across cognate languages, building dictionaries, thesauri, and transcript collections, or multilingual text retrieval of digital documents. In addition, it is estimated that more than 30% of English vocabulary has been inherited from Latin, which has dominated medical terminology in particular. We use this fact by exploring word sense disambiguation (WSD) in a multilingual environment. Specifically in the medical domain, language pair phenomena can be limite…
Riga: from FrameNet to Semantic Frames with C6.0 Rules
2015
For the purposes of SemEval-2015 Task-18 on the semantic dependency parsing we combined the best-performing closed track approach from the SemEval-2014 competition with state-of-the-art techniques for FrameNet semantic parsing. In the closed track our system ranked third for the semantic graph accuracy and first for exact labeled match of complete semantic graphs. These results can be attributed to the high accuracy of the C6.0 rule-based sense labeler adapted from the FrameNet parser. To handle large SemEval training data the C6.0 algorithm was extended to provide multi-class classification and to use fast greedy search without significant accuracy loss compared to exhaustive search. A met…
A word prediction methodology for automatic sentence completion
2015
Word prediction generally relies on n-grams occurrence statistics, which may have huge data storage requirements and does not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…
Detection of Patronizing and Condescending Texts (SemEval 2022 Task)
2022
This work examines the problem of detecting condescending texts, one of the tasks set by SemEval 2022. Texts in English are examined using an existing, already annotated dataset. The aim of the work is to survey possible solutions and, from those selected, to develop a program using various language technology models that can read the given dataset and return a prediction of whether a text is condescending or not, as well as to summarize information about the various layers of the system architecture and how they work. The main models examined, implemented, and tested in the work are BERT, RoBERTA and distilBERT, as well as a Naive Bayes model, which serves as a baseline for comparison. Finally, results are obtained with each…