Search results for "Natural Language Processing"

showing 10 items of 413 documents

Polysemy in Controlled Natural Language Texts

2015

Computational semantics and logic-based controlled natural languages (CNL) do not systematically address the word sense disambiguation problem of content words, i.e., they tend to interpret only some functional words that are crucial for the construction of discourse representation structures. We show that micro-ontologies and multi-word units allow integration of the rich and polysemous multi-domain background knowledge into CNL, thus providing interpretation for the content words. The proposed approach is demonstrated by extending the Attempto Controlled English (ACE) with polysemous and procedural constructs, resulting in a more natural CNL named PAO covering narrative multi-domain texts.

FOS: Computer and information sciences; Computer Science - Computation and Language; Interpretation (logic); Computer science; business.industry; Representation (arts); Content word; computer.software_genre; language.human_language; Controlled natural language; Computational semantics; language; Attempto Controlled English; Artificial intelligence; Polysemy; business; Computation and Language (cs.CL); computer; Natural language; Natural language processing

researchProduct

Towards the evaluation of automatic simultaneous speech translation from a communicative perspective

2021

In recent years, automatic speech-to-speech and speech-to-text translation has gained momentum thanks to advances in artificial intelligence, especially in the domains of speech recognition and machine translation. The quality of such applications is commonly tested with automatic metrics, such as BLEU, primarily with the goal of assessing improvements of releases or in the context of evaluation campaigns. However, little is known about how the output of such systems is perceived by end users or how it compares to human performance in similar communicative tasks. In this paper, we present the results of an experiment aimed at evaluating the quality of a real-time speech translation engine…

FOS: Computer and information sciences; Computer Science - Computation and Language; Machine translation; End user; Computer science; business.industry; media_common.quotation_subject; Sample (statistics); Context (language use); Intelligibility (communication); computer.software_genre; Speech translation; Quality (business); Artificial intelligence; business; Computation and Language (cs.CL); computer; Interpreter; Natural language processing; media_common; Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

researchProduct

Facilitating terminology translation with target lemma annotations

2021

Most of the recent work on terminology integration in machine translation has assumed that terminology translations are given already inflected in forms that are suitable for the target language sentence. In the day-to-day work of professional translators, however, this is seldom the case, as translators work with bilingual glossaries where terms are given in their dictionary forms; finding the right target language form is part of the translation process. We argue that the requirement for a priori specified target language forms is unrealistic and impedes the practical applicability of previous work. In this work, we propose to train machine translation systems using a source-side data augmentatio…

FOS: Computer and information sciences; Lemma (mathematics); Computer Science - Computation and Language; Machine translation; Process (engineering); Computer science; business.industry; Latvian; Term (logic); Translation (geometry); computer.software_genre; language.human_language; Terminology; language; Artificial intelligence; business; Computation and Language (cs.CL); computer; Natural language processing; Sentence

researchProduct

Effectiveness of Data-Driven Induction of Semantic Spaces and Traditional Classifiers for Sarcasm Detection

2019

Irony and sarcasm are two complex linguistic phenomena that are widely used in everyday language, especially on social media, but they represent two serious issues for automated text understanding. Many labeled corpora have been extracted from several sources to accomplish this task, and it seems that sarcasm is conveyed in different ways for different domains. Nonetheless, very little work has been done on comparing different methods across the available corpora. Furthermore, each author usually collects and uses their own datasets to evaluate their own method. In this paper, we show that sarcasm detection can be tackled by applying classical machine learning algorithms to input te…

FOS: Computer and information sciences; Linguistics and Language; Computer Science - Machine Learning; Computer science; media_common.quotation_subject; Semantic space; Machine Learning (stat.ML); 02 engineering and technology; computer.software_genre; Language and Linguistics; Task (project management); Data-driven; Machine Learning (cs.LG); Artificial Intelligence; Statistics - Machine Learning; 020204 information systems; Everyday language; 0202 electrical engineering electronic engineering information engineering; Social media; natural language processing; media_common; Computer Science - Computation and Language; Sarcasm; Settore INF/01 - Informatica; business.industry; irony detection; Irony; machine learning; semantic spaces; 020201 artificial intelligence & image processing; Artificial intelligence; business; Irony detection; semantic space; computer; Computation and Language (cs.CL); Software; Natural language processing; sarcasm detection

researchProduct

RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy

2016

Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the frequency of error patterns observed in the scored AMR graphs. This first extension results in a 4% gain over the state-of-the-art CAMR baseline parser by adding to it a manually crafted wrapper fixing the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields a further 0.4% gain when applied to outputs of two nondeterministic…

FOS: Computer and information sciences; Parsing; Computer Science - Computation and Language; Computer science; business.industry; 02 engineering and technology; Extension (predicate logic); computer.software_genre; SemEval; Set (abstract data type); Nondeterministic algorithm; 020204 information systems; Test set; Classifier (linguistics); 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Computation and Language (cs.CL); Natural language processing; Sentence

researchProduct

A Freely Available Morphological Analyzer, Disambiguator and Context Sensitive Lemmatizer for German

1998

In this paper we present Morphy, an integrated tool for German morphology, part-of-speech tagging and context-sensitive lemmatization. Its large lexicon of more than 320,000 word forms plus its ability to process German compound nouns guarantee a wide morphological coverage. Syntactic ambiguities can be resolved with a standard statistical part-of-speech tagger. By using the output of the tagger, the lemmatizer can determine the correct root even for ambiguous word forms. The complete package is freely available and can be downloaded from the World Wide Web.

FOS: Computer and information sciences; Spectrum analyzer; Root (linguistics); Morphology (linguistics); Computer Science - Computation and Language; Computer science; business.industry; Lemmatisation; Context (language use); computer.software_genre; Lexicon; Syntax; language.human_language; German; H.3.4; Noun; language; Artificial intelligence; business; computer; Computation and Language (cs.CL); Natural language processing; Word (computer architecture)

researchProduct

Semantic Computing of Moods Based on Tags in Social Media of Music

2014

Social tags inherent in online music services such as Last.fm provide a rich source of information on musical moods. The abundance of social tags makes this data highly beneficial for developing techniques to manage and retrieve mood information, and enables study of the relationships between music content and mood representations with data substantially larger than that available for conventional emotion research. However, no systematic assessment has been done on the accuracy of social tags and derived semantic models at capturing mood information in music. We propose a novel technique called Affective Circumplex Transformation (ACT) for representing the moods of music tracks in an interp…

FOS: Computer and information sciences; Vocabulary; Computer science; Music information retrieval; media_common.quotation_subject; Semantic analysis (machine learning); Moods; computer.software_genre; Affect (psychology); Semantics; Computer Science - Information Retrieval; Semantic computing; Music information retrieval; Affective computing; media_common; Social and Information Networks (cs.SI); ta113; Probabilistic latent semantic analysis; Social tags; business.industry; Computer Science - Social and Information Networks; Multimedia (cs.MM); Semantic analysis; Computer Science Applications; Mood; Computational Theory and Mathematics; Web mining; ta6131; Vector space model; Artificial intelligence; Genres; business; computer; Computer Science - Multimedia; Information Retrieval (cs.IR); Music; Natural language processing; Prediction.; Information Systems; IEEE Transactions on Knowledge and Data Engineering

researchProduct

Measuring Semantic Coherence of a Conversation

2018

Conversational systems have become increasingly popular as a way for humans to interact with computers. To be able to provide intelligent responses, conversational systems must correctly model the structure and semantics of a conversation. We introduce the task of measuring semantic (in)coherence in a conversation with respect to background knowledge, which relies on the identification of semantic relations between concepts introduced during a conversation. We propose and evaluate graph-based and machine learning-based approaches for measuring semantic coherence using knowledge graphs, their vector space embeddings and word embedding models, as sources of background knowledge. We demonstrat…

FOS: Computer and information sciences; Word embedding; Computer science; Computer Science - Artificial Intelligence; media_common.quotation_subject; ihmisen ja tietokoneen vuorovaikutus; 02 engineering and technology; computer.software_genre; keskustelu; 020204 information systems; 0202 electrical engineering electronic engineering information engineering; Conversation; conversational systems; media_common; Computer Science - Computation and Language; business.industry; koneoppiminen; Artificial Intelligence (cs.AI); Knowledge graph; semantiikka; Graph (abstract data type); 020201 artificial intelligence & image processing; Artificial intelligence; business; semantic coherence; computer; Computation and Language (cs.CL); Natural language processing

researchProduct

EHeBby: An evocative humorist chat-bot

2008

A conversational agent capable of having a "sense of humor" is presented. The agent can both generate humorous sentences and recognize humoristic expressions introduced by the user during the dialogue. EHeBby is an entertainment-oriented conversational agent implemented using the ALICE framework embedded into a Yahoo! Messenger client. It is characterized by two areas: a rational, rule-based area and an evocative area. The first one is based on well-founded techniques of computational humor and a standard AIML KB. The second one is based on a conceptual space, automatically induced by a corpus of funny documents, where KB items and user sentences are mapped. This area emulates an associativ…

Facial expression; Computer Networks and Communications; Computer science; business.industry; Computational humor; TK5101-6720; AIML; computer.software_genre; conversational agent computational humor conceptual space; Computer Science Applications; Entertainment; Human–computer interaction; Telecommunication; Artificial intelligence; Dialog system; Alice (programming language); business; computer; Associative property; Natural language processing; computer.programming_language; Avatar

researchProduct

Effective feature descriptor-based new framework for off-line text-independent writer identification

2018

Feature engineering is a key factor of machine learning applications. It is a fundamental process in writer identification of handwriting, which has been an active and challenging field of research for many years. We propose a conceptually computationally efficient, yet simple and fast local descriptor referred to as Block Wise Local Binary Count (BW-LBC) for offline text-independent writer identification of handwritten documents. The proposed BW-LBC operator, which characterizes the writing style of each writer, is applied to a set of connected components extracted and cropped from scanned handwriting samples (documents or sets of words/text lines) where each labeled component is seen as a texture im…

Feature engineering; 0209 industrial biotechnology; Computer science; business.industry; Feature vector; Feature extraction; 02 engineering and technology; computer.software_genre; Writing style; Identification (information); 020901 industrial engineering & automation; Handwriting; Classifier (linguistics); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Arabic script; Natural language processing; 2018 International Conference on Intelligent Systems and Computer Vision (ISCV)

researchProduct