Search results for "Natural language processing"
showing 10 items of 413 documents
Optimal reciprocals in German Sign Language
2003
Unlike most spoken languages, German Sign Language (DGS) does not have a single means of reciprocal marking. Rather, different strategies are used, which crucially depend on phonological (one-handed sign vs. two-handed sign) and morphosyntactic (plain verb vs. agreement verb) properties of the underlying verb. Moreover, with plain verbs DGS shows dialectal variation. Altogether there are four different ways of realizing reciprocal marking in DGS. In this paper, we compare a rule-based analysis for the reciprocal data (based on Brentari’s 1998 feature hierarchy) to an optimality-theoretic analysis. We argue that an OT-account allows for a more straightforward explanation of the facts. In par…
Phrase Frames as an Exploratory Tool for Studying English-to-Polish Translation Patterns: A Descriptive Corpus-Based Study
2020
Designed as a proof-of-concept, this descriptive corpus-based study focuses on the concept of phrase frame, defined as a contiguous sequence of n words identical except for one (Fletcher 2002). Although phrase frames were already used as a means of exploring pattern variability across and within different text types or registers written in English, they have been rarely, if ever, employed so far as a unit of analysis in descriptive research on translation. In this study, we use the English‒Polish parallel corpus Paralela (Pęzik 2016) to identify and describe Polish translation patterns that emerge from two functionally-defined English phrase frames (it is * clear that, it is * difficult to). …
MLU and IPSyn measuring absolute complexity
2009
This article compares the results of Mean Length of Utterance (MLU) and Index of Productive Syntax (IPSyn) with the structural complexity of spontaneous utterances produced by 30-month-old Finnish children in a semi-structured playing situation. The comparison was carried out in order to determine the aspects of structural complexity which can be detected with MLU and IPSyn. This research adopts the frameworks of absolute complexity together with a multidimensional view of utterance structure and, furthermore, applies it through Utterance Analysis (UA). The results of the comparison between the metrics and changes in structural complexity discovered by UA reveal that MLU and IPSyn do functi…
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including: Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.
Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation
2020
Neural machine translation systems typically are trained on curated corpora and break when faced with non-standard orthography or punctuation. Resilience to spelling mistakes and typos, however, is crucial as machine translation systems are used to translate texts of informal origins, such as chat conversations, social media posts and web pages. We propose a simple generative noise model to generate adversarial examples of ten different types. We use these to augment machine translation systems’ training data and show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data, while baseline systems’ performance drops by…
Source-Target Mapping Model of Streaming Data Flow for Machine Translation
2017
Streaming information flow allows identification of linguistic similarities between language pairs in real time, as it relies on pattern recognition of grammar rules, semantics and pronunciation, especially when analyzing so-called international terms, syntax of the language family as well as tenses transitivity between the languages. Overall, it provides backbone translation knowledge for building an automatic translation system that facilitates processing any of various abstract entities which combine to specify underlying phonological, morphological, semantic and syntactic properties of linguistic forms and that act as the targets of linguistic rules and operations in a source language foll…
Translingual text mining for identification of language pair phenomena
2016
Translingual Text Mining (TTM) is an innovative technology of natural language processing for building multilingual parallel corpora, processing machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets and for a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges of this computational technology. In particular, the focus will be on identification of language pair phenomena and their applications to building a holistic language model, which is a novel tool for processing machine translation, supporting professional translations, evaluation of tran…
Outline for a Relevance Theoretical Model of Machine Translation Post-editing
2018
Translation process research (TPR) has advanced in recent years to a state which allows us to study “in great detail what source and target text units are being processed, at a given point in time, to investigate what steps are involved in this process, what segments are read and aligned and how this whole process is monitored” (Alves 2015, p. 32). We have sophisticated statistical methods and powerful tools to produce a better and more detailed understanding of the underlying cognitive processes that are involved in translation. Following Jakobsen (2011), who suspects that we may soon be in a situation which allows us to develop a computational model of human translation, Alve…
Monolingual and cross-lingual intent detection without training data in target languages
2021
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages. This research is based on assumptions that (1) training data can be obtained by machine-translating it from another language
Rhythmic and textural musical sequences differently influence syntax and semantic processing in children.
2020
Effects of music on language processing have been reported separately for syntax and for semantics. Previous studies have shown that regular musical rhythms can facilitate syntax processing and that semantic features of musical excerpts can influence semantic processing of words. It remains unclear whether musical parameters, such as rhythm and sound texture, may specifically influence different components of linguistic processing. In the current study, two types of musical sequences (one focusing on rhythm and the other focusing on sound texture) were presented to children who were requested to perform a syntax or a semantic task thereafter. The results revealed tha…