Search results for "SemEval"

showing 8 items of 8 documents

Word sense disambiguation combining conceptual distance, frequency and gloss

2004

Word sense disambiguation (WSD) is the process of assigning a meaning to a word based on the context in which it occurs. The absence of sense-tagged training data is a real problem for the word sense disambiguation task. We present a method for the resolution of lexical ambiguity which relies on the wide-coverage noun taxonomy of WordNet and the notion of conceptual distance among concepts, captured by a conceptual density formula developed for this purpose. The formula we propose is a generalised form of the Agirre-Rigau conceptual density measure, in which many (parameterised) refinements were introduced and an exhaustive evaluation of all meaningful combinations was performed.…

Computer science, business.industry, Brown Corpus, WordNet, computer.software_genre, Hand coding, SemEval, Taxonomy (general), Noun, Artificial intelligence, Computational linguistics, business, computer, Natural language processing, Natural language

International Conference on Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003

researchProduct

A practical solution to the problem of automatic word sense induction

2004

Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word's senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation…

Computer science, business.industry, Word-sense induction, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Context (language use), Artificial intelligence, Cluster analysis, computer.software_genre, business, computer, Word (computer architecture), Natural language processing, SemEval

Proceedings of the ACL 2004 on Interactive poster and demonstration sessions

researchProduct

RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy

2016

Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the frequency of error patterns observed in the scored AMR graphs. This first extension results in a 4% gain over the state-of-the-art CAMR baseline parser by adding to it a manually crafted wrapper that fixes the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields a further 0.4% gain when applied to outputs of two nondeterministic…

FOS: Computer and information sciences, Parsing, Computer Science - Computation and Language, Computer science, business.industry, 02 engineering and technology, Extension (predicate logic), computer.software_genre, SemEval, Set (abstract data type), Nondeterministic algorithm, 020204 information systems, Test set, Classifier (linguistics), 0202 electrical engineering electronic engineering information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, business, computer, Computation and Language (cs.CL), Natural language processing, Sentence

researchProduct

RIGOTRIO at SemEval-2017 Task 9: Combining Machine Learning and Grammar Engineering for AMR Parsing and Generation

2017

By addressing both text-to-AMR parsing and AMR-to-text generation, SemEval-2017 Task 9 established AMR as a powerful semantic interlingua. We strengthen the interlingual aspect of AMR by applying the multilingual Grammatical Framework (GF) for AMR-to-text generation. Our current rule-based GF approach completely covered only 12.3% of the test AMRs; we therefore combined it with the state-of-the-art JAMR Generator to see whether the combination increases or decreases the overall performance. The combined system achieved an automatic BLEU score of 18.82 and a human Trueskill score of 107.2, to be compared to the plain JAMR Generator results. As for AMR parsing, we added NER extensions to our SemEva…

Interlingua, Generator (computer programming), Parsing, business.industry, Computer science, Speech recognition, Grammatical Framework, 02 engineering and technology, computer.software_genre, SemEval, language.human_language, Task (project management), 020204 information systems, 0202 electrical engineering electronic engineering information engineering, language, 020201 artificial intelligence & image processing, Grammar engineering, Artificial intelligence, business, computer, Natural language processing, BLEU

researchProduct

Extraction of Medical Terms for Word Sense Disambiguation within Multilingual Framework

2016

All languages belonging to the same language family share a number of common characteristics, called language pair phenomena, which can be quite useful when processing them for multilingual purposes such as translation across cognate languages, building dictionaries, thesauri, and transcript collections, or multilingual text retrieval of digital documents. In addition, it is estimated that more than 30% of English vocabulary has been inherited from Latin, which has dominated medical terminology in particular. We use this fact by exploring word sense disambiguation (WSD) in a multilingual environment. Specifically in the medical domain, language pair phenomena can be limite…

Medical terminology, business.industry, Computer science, similarity metrics, Context (language use), 02 engineering and technology, computer.software_genre, SemEval, Terminology, computational linguistics, multilingual information retrieval, word sense disambiguation, 020204 information systems, Similarity (psychology), 0202 electrical engineering electronic engineering information engineering, medical informatics, 020201 artificial intelligence & image processing, Cognate, Artificial intelligence, information extraction, Language family, business, computer, Natural language processing, Word (computer architecture)

researchProduct

Riga: from FrameNet to Semantic Frames with C6.0 Rules

2015

For SemEval-2015 Task 18 on semantic dependency parsing, we combined the best-performing closed track approach from the SemEval-2014 competition with state-of-the-art techniques for FrameNet semantic parsing. In the closed track, our system ranked third for semantic graph accuracy and first for exact labeled match of complete semantic graphs. These results can be attributed to the high accuracy of the C6.0 rule-based sense labeler adapted from the FrameNet parser. To handle large SemEval training data, the C6.0 algorithm was extended to provide multi-class classification and to use fast greedy search without significant accuracy loss compared to exhaustive search. A met…

Parsing, Computer science, business.industry, Artificial intelligence, FrameNet, computer.software_genre, business, computer, Natural language processing, SemEval, Graph

Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

researchProduct

A word prediction methodology for automatic sentence completion

2015

Word prediction generally relies on n-gram occurrence statistics, which may have huge data storage requirements and do not take into account the general meaning of the text. We propose an alternative methodology, based on Latent Semantic Analysis, to address these issues. An asymmetric Word-Word frequency matrix is employed to achieve higher scalability with large training datasets than the classic Word-Document approach. We propose a function for scoring candidate terms for the missing word in a sentence. We show how this function approximates the probability of occurrence of a given candidate word. Experimental results show that the proposed approach outperforms non neural network lang…

business.industry, Latent semantic analysis, Computer science, Sentence completion, Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), Statistical semantics, Machine learning, computer.software_genre, Semantics, SemEval, Sentence completion tests, word space model, LSA, Scalability, language model, latent semantic analysis, Artificial intelligence, business, computer, Computer Science::Formal Languages and Automata Theory, Natural language processing, Sentence, Word (computer architecture), word prediction

Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)

researchProduct

Detection of Patronizing and Condescending Texts (SemEval 2022 Task)

2022

This work examines the problem of detecting condescending texts, which is one of the tasks posed by SemEval 2022. It considers English-language texts from an existing, already annotated dataset. The aim of the work is to survey possible solutions and, from those selected, to develop a program that uses various language technology models to read the given dataset and return a prediction of whether a text is condescending or not, and also to summarize information about the various layers of the system architecture and how they operate. The main models examined, implemented and tested in the work are BERT, RoBERTa and DistilBERT, as well as a Naive Bayes model, which serves as a baseline for comparison. Finally, results are obtained with each…

condescending texts, Computer science, transformers, machine learning, BERT, SemEval

researchProduct