Search results for " language processing"
showing 10 items of 415 documents
Constructions-and-frames analysis of translations
2013
Translation can generally be seen as a task in which the meaning of the original should be preserved as far as possible. This paper formulates the preservation of meaning in terms of the primacy of the frame hypothesis: ideally, the frame of the original is matched by the frame of the translation. I investigate one factor overriding this principle in translations between English and German through the examination of two grammatical constructions, one in English, one in German, which are not commonly available in the other language. Picking a construction comparable in function in the target language leads to frame shifts. In addition to highlighting the interplay between construction and fram…
Optimal reciprocals in German Sign Language
2003
Unlike most spoken languages, German Sign Language (DGS) does not have a single means of reciprocal marking. Rather, different strategies are used, which crucially depend on phonological (one-handed sign vs. two-handed sign) and morphosyntactic (plain verb vs. agreement verb) properties of the underlying verb. Moreover, with plain verbs DGS shows dialectal variation. Altogether there are four different ways of realizing reciprocal marking in DGS. In this paper, we compare a rule-based analysis for the reciprocal data (based on Brentari’s 1998 feature hierarchy) to an optimality-theoretic analysis. We argue that an OT-account allows for a more straightforward explanation of the facts. In par…
Phrase Frames as an Exploratory Tool for Studying English-to-Polish Translation Patterns: A Descriptive Corpus-Based Study
2020
Designed as a proof-of-concept, this descriptive corpus-based study focuses on the concept of phrase frame, defined as a contiguous sequence of n words identical except for one (Fletcher 2002). Although phrase frames have already been used as a means of exploring pattern variability across and within different text types or registers written in English, they have rarely, if ever, been employed so far as a unit of analysis in descriptive research on translation. In this study, we use the English‒Polish parallel corpus Paralela (Pęzik 2016) to identify and describe Polish translation patterns that emerge from two functionally-defined English phrase frames (it is * clear that, it is * difficult to). …
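The phrase-frame definition quoted in the abstract above (n contiguous words, identical except for one) can be illustrated with a minimal sketch. The function name `phrase_frames`, the sample sentences, and the choice to let the variable slot fall at any position are assumptions made for illustration; this is not the procedure used in the study.

```python
from collections import defaultdict


def phrase_frames(sentences, n=5, min_fillers=2):
    """Collect n-word frames with one variable slot (marked '*').

    Hypothetical helper: every n-gram is generalised once per position,
    and a frame is kept only if at least `min_fillers` distinct words
    were observed in its slot.
    """
    frames = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n + 1):
            ngram = tokens[i:i + n]
            for slot in range(n):
                frame = tuple(ngram[:slot] + ["*"] + ngram[slot + 1:])
                frames[frame].add(ngram[slot])
    return {f: fillers for f, fillers in frames.items()
            if len(fillers) >= min_fillers}


# Made-up example sentences, not taken from the Paralela corpus.
sample = [
    "it is very clear that we agree",
    "it is quite clear that they left",
    "it is not clear that he knows",
]
result = phrase_frames(sample)
for frame, fillers in result.items():
    print(" ".join(frame), "->", sorted(fillers))
```

On these sample sentences the frame `it is * clear that` is recovered with the fillers `not`, `quite`, and `very`.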
MLU and IPSyn measuring absolute complexity
2009
This article compares the results of Mean Length of Utterance (MLU) and Index of Productive Syntax (IPSyn) with the structural complexity of spontaneous utterances produced by 30-month-old Finnish children in a semi-structured playing situation. The comparison was carried out in order to determine the aspects of structural complexity which can be detected with MLU and IPSyn. This research adopts the frameworks of absolute complexity together with a multidimensional view of utterance structure and, furthermore, applies it through Utterance Analysis (UA). The results of the comparison between the metrics and changes in structural complexity discovered by UA reveal that MLU and IPSyn do functi…
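As a rough illustration of the simpler of the two metrics named above, Mean Length of Utterance can be sketched as an average over utterance lengths. This version counts whitespace-separated words, which is an assumption: MLU is often computed over morphemes, and the abstract does not say which unit the study uses. The utterances are invented examples.

```python
def mean_length_of_utterance(utterances):
    """Average number of words per non-empty utterance.

    Assumption: length is measured in whitespace-separated words,
    not morphemes.
    """
    lengths = [len(u.split()) for u in utterances if u.strip()]
    if not lengths:
        raise ValueError("at least one non-empty utterance is required")
    return sum(lengths) / len(lengths)


# Invented child utterances, for illustration only.
utterances = ["more juice", "doggy go home", "no"]
mlu = mean_length_of_utterance(utterances)
print(mlu)
```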
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including: Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.
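The abstract above does not describe how the proposed measure is computed. For orientation only, the following sketch shows the standard (non-semantic) word error rate that the title alludes to: word-level edit distance divided by reference length. It is not the paper's measure, and the function name is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Plain WER: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(round(wer, 3))
```

Because every substitution costs the same regardless of meaning, this baseline treats a near-synonym and an unrelated word identically.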
Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation
2020
Neural machine translation systems typically are trained on curated corpora and break when faced with non-standard orthography or punctuation. Resilience to spelling mistakes and typos, however, is crucial as machine translation systems are used to translate texts of informal origins, such as chat conversations, social media posts and web pages. We propose a simple generative noise model to generate adversarial examples of ten different types. We use these to augment machine translation systems’ training data and show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data, while baseline systems’ performance drops by…
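The augmentation idea described above (perturb the source side, keep the target, and add the noisy pair to the training data) can be sketched as follows. The three noise functions, the `rate` parameter, and all names are hypothetical; the abstract mentions ten noise types but does not list them, so these are stand-ins rather than the paper's model.

```python
import random
import string


def swap_adjacent_chars(text, rng):
    """Simulate a typo by swapping one random pair of adjacent characters."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def drop_punctuation(text, rng):
    """Remove all ASCII punctuation (rng is unused, kept for a uniform API)."""
    return "".join(c for c in text if c not in string.punctuation)


def lowercase_all(text, rng):
    """Discard capitalisation (rng is unused, kept for a uniform API)."""
    return text.lower()


# Illustrative noise types only; not the ten types used in the paper.
NOISE_FUNCTIONS = [swap_adjacent_chars, drop_punctuation, lowercase_all]


def augment(pairs, rate=0.5, seed=0):
    """Return the clean pairs plus noisy-source copies of a random subset."""
    rng = random.Random(seed)
    augmented = list(pairs)
    for source, target in pairs:
        if rng.random() < rate:
            noise = rng.choice(NOISE_FUNCTIONS)
            augmented.append((noise(source, rng), target))
    return augmented


pairs = [("Hello, how are you?", "Hallo, wie geht es dir?"),
         ("See you tomorrow!", "Bis morgen!")]
noisy = augment(pairs, rate=1.0, seed=42)
for source, target in noisy:
    print(source, "|", target)
```

Only the source sentence is perturbed, so the system is still trained toward clean target text.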
Source-Target Mapping Model of Streaming Data Flow for Machine Translation
2017
Streaming information flow allows linguistic similarities between language pairs to be identified in real time, as it relies on pattern recognition of grammar rules, semantics and pronunciation, especially when analyzing so-called international terms, the syntax of the language family, and tense transitivity between the languages. Overall, it provides backbone translation knowledge for building an automatic translation system that facilitates processing any of the various abstract entities which combine to specify underlying phonological, morphological, semantic and syntactic properties of linguistic forms and that act as the targets of linguistic rules and operations in a source language foll…
Translingual text mining for identification of language pair phenomena
2016
Translingual Text Mining (TTM) is an innovative technology of natural language processing for building multilingual parallel corpora, processing machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets and for a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges of this computational technology. In particular, the focus will be on the identification of language pair phenomena and their applications to building a holistic language model, which is a novel tool for processing machine translation, supporting professional translations, evaluation of tran…
Outline for a Relevance Theoretical Model of Machine Translation Post-editing
2018
Translation process research (TPR) has advanced in recent years to a state which allows us to study “in great detail what source and target text units are being processed, at a given point in time, to investigate what steps are involved in this process, what segments are read and aligned and how this whole process is monitored” (Alves 2015, p. 32). We have sophisticated statistical methods and powerful tools to produce a better and more detailed understanding of the underlying cognitive processes that are involved in translation. Following Jakobsen (2011), who suspects that we may soon be in a situation which allows us to develop a computational model of human translation, Alve…
Monolingual and cross-lingual intent detection without training data in target languages
2021
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages. This research is based on the assumptions that (1) training data can be obtained by machine-translating it from another language