Search results for "Machine Translation"
showing 10 items of 64 documents
La sufixació apreciativa del català: creacions lèxiques i implicacions morfològiques [Evaluative suffixation in Catalan: lexical creations and morphological implications]
2015
Abstract: The status of evaluative derivation within morphology is exceptional. Because grammars and dictionaries generally contain limited information about evaluatives, and because these forms are habitually used in familiar and informal contexts, speakers' creativity is not constrained by prescriptive norms. Imaginative solutions therefore emerge which, despite the existing variation, follow clear grammatical constraints. The aim of this work is to contribute new data in this field, drawn from forms collected in traditional corpora and on the internet, with a twofold purpose: on the one hand, to examine in greater depth the morphological particularities of the…
Alternatīvas neironu mašīntulkošanas arhitektūras [Alternative neural machine translation architectures]
2020
Aim of the study: To investigate alternative machine translation architectures, applying new approaches, in order to build an English-Latvian translator and to combine an image with a sentence matrix in a single model. Most significant results: Several machine translators were trained in the course of the work. One is based on the «transformer» architecture; this model was trained entirely from scratch. The other machine translator is based on pretrained models. The work compared various approaches and selected the best pretrained model, XLM-R, on which a translator was then built. The translations produced by the first translator are sufficiently accurate and grammatically correct. However, this translator is adapted to politically or legally oriented text, because it was trained…
Comparing Translation and Post-editing: An Annotation Schema for Activity Units
2016
The current chapter introduces an annotation schema of TPR data that categorises post-editing behaviour into five different classes and compares general-language and domain-specific English-to-German translation and post-editing with respect to production times, key-logging (text production activity and text elimination activity) and eye-tracking data (total reading times on source text and on target text). The results support the hypothesis that post-editing is faster than translation from scratch for both domain-specific and non-domain-specific text types. When key-logging and eye-tracking data are taken into consideration, domain-specific texts require more effort when translating from s…
La linguistique des grammaires françaises publiées en Espagne dans la première moitié du XIXe siècle [The linguistics of French grammars published in Spain in the first half of the 19th century]
2005
Abstract: In this article, we examine a corpus of 13 grammars for teaching French to Spaniards, published in the first half of the 19th century. In a cross-sectional analysis, we take into account (1) the sources cited by the authors; (2) the nature of the definition of grammar and the number of parts of speech; (3) the definition of the noun (with the presence or absence of the canonical declension schema or of specific classes of this element); (4) the definition of the verb (with the presence or absence of the canonical categories); and (5) syntax. Our aim is to determine the explicit and implicit linguistics of these school grammars, at a time when…
Why Translation Is Difficult
2017
The paper develops a definition of translation literality that is based on the syntactic and semantic similarity of the source and the target texts. We provide theoretical and empirical evidence that absolute literal translations are easy to produce. Based on a multilingual corpus of alternative translations we investigate the effects of cross-lingual syntactic and semantic distance on translation production times and find that non-literality makes from-scratch translation and post-editing difficult. We show that statistical machine translation systems encounter even more difficulties with non-literality.
Particle Swarm Optimization as a New Measure of Machine Translation Efficiency
2018
The present work proposes a new approach to measuring the efficiency of evolutionary algorithm-based Machine Translation. We implement some attributes of evolutionary algorithms, using cosine similarity as the objective function of a Particle Swarm Optimization (PSO) algorithm. We then evaluate an English text set for translation precision into Spanish as a simulated benchmark, and explore the backward process. Our results show that the PSO algorithm can be used to translate sentences in multiple languages with a single identifier; in other words, the technology presented is language-pair independent. Specifically, we indicate that our cosine similarity objective function improves the veloci…
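Illustrative note: the abstract names cosine similarity as the objective function but does not say how sentences are turned into vectors. The sketch below assumes plain lowercase word-count vectors, which is an assumption of this example and not necessarily the authors' representation; the function name `cosine_similarity` is likewise hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two sentences.

    Assumption: each sentence is represented as a lowercase
    word-count vector (not necessarily what the paper uses).
    """
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())
    # Dot product over the words the two sentences share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty sentence has a zero vector; define its similarity as 0.
    return dot / norm if norm else 0.0
```

In a PSO setting, a function of this kind would be the score that each candidate is evaluated against; the swarm update itself is not shown here.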
Data Augmentation for Pipeline-Based Speech Translation
2020
Pipeline-based speech translation methods may suffer from errors found in speech recognition system output. Therefore, it is crucial that machine translation systems are trained to be robust against such noise. In this paper, we propose two methods for parallel data augmentation for pipeline-based speech translation system development. The first method utilises a speech processing workflow to introduce errors and the second method generates commonly found suffix errors using a rule-based method. We show that the methods in combination allow significantly improving speech translation quality by 1.87 BLEU points over a baseline system.
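Illustrative note: the second method in this abstract, rule-based generation of suffix errors, can be sketched roughly as follows. The suffix rules in `SUFFIX_RULES` are invented for illustration and are not taken from the paper, and the helper names `corrupt_suffix` and `augment` are hypothetical.

```python
# Hypothetical suffix confusions (illustrative only, not the paper's rules):
# each key is a suffix and each value is what a recogniser might output instead.
SUFFIX_RULES = {"ing": "in", "ed": "", "s": ""}


def corrupt_suffix(word, rules=SUFFIX_RULES):
    """Apply the first matching suffix rule, trying longer suffixes first."""
    for suffix in sorted(rules, key=len, reverse=True):
        # Require a stem of at least two characters so short words survive.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: len(word) - len(suffix)] + rules[suffix]
    return word


def augment(pairs, rules=SUFFIX_RULES):
    """Return the original (source, target) pairs plus noisy-source copies.

    The target side is left untouched, so the translation model sees
    an erroneous source paired with a clean reference.
    """
    augmented = []
    for source, target in pairs:
        augmented.append((source, target))
        noisy = " ".join(corrupt_suffix(w, rules) for w in source.split())
        if noisy != source:
            augmented.append((noisy, target))
    return augmented
```

Keeping the clean pair alongside the corrupted one is a design choice of this sketch; the abstract does not say how the augmented and original data are mixed.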
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including: Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.
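Illustrative note: the abstract does not define the proposed semantic measure, so the sketch below shows only the standard word error rate (WER) that the title presumably builds on: word-level edit distance divided by the reference length. The semantic extension is not reproduced here, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Plain WER treats every substitution as equally wrong, whether a near-synonym or an unrelated word; a resemblance-oriented measure would presumably weight these differently.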
Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation
2020
Neural machine translation systems typically are trained on curated corpora and break when faced with non-standard orthography or punctuation. Resilience to spelling mistakes and typos, however, is crucial as machine translation systems are used to translate texts of informal origins, such as chat conversations, social media posts and web pages. We propose a simple generative noise model to generate adversarial examples of ten different types. We use these to augment machine translation systems’ training data and show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data, while baseline systems’ performance drops by…
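Illustrative note: the abstract mentions a generative noise model with ten noise types but does not list them in the visible text. The sketch below assumes three simple types (adjacent-character swap, character deletion, punctuation stripping) purely for illustration; the function names, the `rate` parameter, and the choice of noise types are all hypothetical.

```python
import random
import string


def swap_adjacent(word, rng):
    """Swap two neighbouring characters (a typical typo)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def drop_char(word, rng):
    """Delete one character, leaving single-character words alone."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def strip_punctuation(word, rng):
    """Remove ASCII punctuation (rng is unused but keeps the signature uniform)."""
    return word.translate(str.maketrans("", "", string.punctuation))


# Assumed noise types for this sketch; the paper's ten types are not listed here.
NOISE_TYPES = (swap_adjacent, drop_char, strip_punctuation)


def add_noise(sentence, rate=0.3, seed=0):
    """Corrupt each whitespace-separated token with probability `rate`."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    noisy = []
    for word in sentence.split():
        if rng.random() < rate:
            word = rng.choice(NOISE_TYPES)(word, rng)
        if word:  # a token made only of punctuation may become empty
            noisy.append(word)
    return " ".join(noisy)
```

Applying `add_noise` to the source side of a parallel corpus while keeping the target side unchanged would yield noisy training pairs of the kind the abstract describes.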
Source-Target Mapping Model of Streaming Data Flow for Machine Translation
2017
Streaming information flow allows linguistic similarities between language pairs to be identified in real time, as it relies on pattern recognition of grammar rules, semantics and pronunciation, especially when analyzing so-called international terms, the syntax of the language family, and tense transitivity between the languages. Overall, it provides backbone translation knowledge for building an automatic translation system that facilitates processing any of various abstract entities which combine to specify underlying phonological, morphological, semantic and syntactic properties of linguistic forms and that act as the targets of linguistic rules and operations in a source language foll…