Search results for "machine translation"
showing 10 items of 64 documents
Does the mastery of center-embedded linguistic structures distinguish humans from nonhuman primates?
2005
In a recent Science article, Fitch and Hauser (2004; hereafter, F&H) claimed to have demonstrated that cotton-top tamarins fail to learn an artificial language produced by a phrase structure grammar (Chomsky, 1957) generating center-embedded sentences, whereas adult humans easily learn such a language. We report an experiment replicating the results of F&H in humans but also showing that subjects learned the language without exploiting in any way the center-embedded structure. When the procedure was modified to make the processing of this structure mandatory, the subjects no longer showed evidence of learning. We propose a simple interpretation for the difference in performance observed in F…
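The contrast at issue can be made concrete: the finite-state grammar generates (AB)^n strings, while the phrase-structure grammar generates A^nB^n strings, whose correct parsing requires center embedding. A toy generator, assuming made-up syllable inventories rather than the original stimuli:

```python
import random

# Toy syllable inventories for the two syllable classes; the original study used
# syllables distinguished by speaker voice, but these particular tokens are invented.
A = ["ba", "di", "yo", "tu"]
B = ["pa", "li", "mo", "nu"]

def finite_state_string(n):
    """(AB)^n: A-class and B-class syllables simply alternate."""
    return [random.choice(cls) for _ in range(n) for cls in (A, B)]

def center_embedded_string(n):
    """A^n B^n: n A-class syllables followed by n B-class syllables."""
    return [random.choice(A) for _ in range(n)] + [random.choice(B) for _ in range(n)]

print(finite_state_string(3))     # e.g. ['di', 'pa', 'ba', 'mo', 'tu', 'li']
print(center_embedded_string(3))  # e.g. ['ba', 'yo', 'di', 'nu', 'pa', 'mo']
```

As the abstract argues, a learner can tell such strings apart from (AB)^n strings by surface cues alone, without pairing A's and B's in the nested fashion the grammar implies.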
Structural Knowledge Extraction from Mobility Data
2016
Knowledge extraction has traditionally represented one of the most interesting challenges in AI; in recent years, however, the availability of large collections of data has increased the awareness that “measuring” does not seamlessly translate into “understanding”, and that more data does not entail more knowledge. We propose here a formulation of knowledge extraction in terms of Grammatical Inference (GI), an inductive process able to select the best grammar consistent with the samples. The aim is to let models emerge from data themselves, while inference is turned into a search problem in the space of consistent grammars, induced by samples, given proper generalization operators. We will …
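The search-over-grammars view can be illustrated with one classical GI building block: a prefix-tree acceptor built from positive samples, which generalization operators (e.g. state merging, omitted here) then turn into a more compact grammar. A minimal sketch over hypothetical mobility-like traces, not the paper's own formulation:

```python
# Build a prefix-tree acceptor: one state per observed prefix, accepting states
# for complete samples. This is the usual starting point before generalization.
def prefix_tree_acceptor(samples):
    states = {(): {"accepting": False, "edges": {}}}
    for word in samples:
        prefix = ()
        for symbol in word:
            nxt = prefix + (symbol,)
            states[prefix]["edges"].setdefault(symbol, nxt)
            states.setdefault(nxt, {"accepting": False, "edges": {}})
            prefix = nxt
        states[prefix]["accepting"] = True
    return states

# Hypothetical traces over abstract moves.
pta = prefix_tree_acceptor([("N", "N", "E"), ("N", "E"), ("N", "E", "E")])
print(len(pta), "states")  # 6 states
```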
Semi-automatic Quasi-morphological Word Segmentation for Neural Machine Translation
2018
This paper proposes the Prefix-Root-Postfix-Encoding (PRPE) algorithm, which performs close-to-morphological segmentation of words as part of text pre-processing in machine translation. PRPE is a cross-language algorithm requiring only minor tweaking to adapt it for any particular language, a property which makes it potentially useful for morphologically rich languages with no morphological analysers available. As a key part of the proposed algorithm we introduce the ‘Root alignment’ principle to extract potential sub-words from a corpus, as well as a special technique for constructing words from potential sub-words. We conducted experiments with two different neural machine translation sys…
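To illustrate the kind of output a prefix/root/postfix split produces, here is a toy segmenter; the affix lists and the greedy longest-match rule are assumptions for illustration, not the PRPE algorithm or its 'Root alignment' principle:

```python
# Toy close-to-morphological segmentation: strip one known prefix and one known
# suffix if a plausibly long root remains. Affix inventories are invented.
PREFIXES = ["un", "re", "dis"]
SUFFIXES = ["ing", "ed", "able", "s"]

def segment(word):
    prefix = next((p for p in sorted(PREFIXES, key=len, reverse=True)
                   if word.startswith(p) and len(word) - len(p) >= 3), "")
    rest = word[len(prefix):]
    suffix = next((s for s in sorted(SUFFIXES, key=len, reverse=True)
                   if rest.endswith(s) and len(rest) - len(s) >= 3), "")
    root = rest[:len(rest) - len(suffix)] if suffix else rest
    return [piece for piece in (prefix, root, suffix) if piece]

print(segment("unreadable"))    # ['un', 'read', 'able']
print(segment("translations"))  # ['translation', 's']
```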
4th International Workshop on Language Engineering (ATEM 2007)
2008
Following the great success of previous editions, ATEM 2007 is the 4th edition of the ATEM workshop series. The first two editions were held with WCRE in 2003 and 2004, while the 3rd one was held with MoDELS 2006. ATEM has always focused on the engineering of language descriptions. In order to cover as many aspects of language descriptions as are important for the greater success and adoption of model-driven engineering, ATEM has evolved, and so has its scope: the first edition was about metamodels and schemas, the second about metamodels, schemas and grammars, and the third edition about metamodels, schemas, grammars and ontologies.
Multi-pass execution of functional logic programs
1994
An operational semantics for functional logic programs is presented. In such programs functional terms provide for reduction of expressions, provided that they are ground. The semantics is based on multi-pass evaluation techniques originally developed for attribute grammars. Program execution is divided into two phases: (1) construction of an incomplete proof tree, and (2) its decoration into a complete proof tree. The construction phase applies a modified SLD-resolution scheme, and the decoration phase a partial (multi-pass) traversal over the tree. The phase partition is generated by static analysis in which data dependencies are extracted for the functional elements of the program. The method g…
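A minimal sketch of the two-phase idea, with made-up names and goals: phase 1 builds a proof tree in which a functional term is left unevaluated, and phase 2 decorates the tree once the variable it depends on is bound. This is an illustration of the phase split only, not the paper's actual semantics:

```python
class Node:
    def __init__(self, goal, children=(), thunk=None):
        self.goal = goal
        self.children = list(children)
        self.thunk = thunk      # pending functional term, if any
        self.value = None       # filled in by the decoration phase

def construct_incomplete_tree():
    # Phase 1: SLD-style construction; the functional leaf stays a thunk.
    return Node("p(X, Y)", children=[
        Node("X = 3"),
        Node("Y = double(X)", thunk=lambda env: 2 * env["X"]),
    ])

def decorate(node, env):
    # Phase 2: a traversal that evaluates the pending functional terms.
    for child in node.children:
        decorate(child, env)
    if node.thunk is not None:
        node.value = node.thunk(env)

tree = construct_incomplete_tree()
decorate(tree, {"X": 3})
print(tree.children[1].value)   # 6
```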
Alle Wege Führen Zum Text
2016
The intention of this chapter is to review the development of linguistic research during the twentieth century in Europe and the United States, in order to show that the genesis of text linguistics as a comprehensive theoretical framework was necessary, considering the events from a post-eventum perspective. Firstly, structural linguistics and its main exponents are presented; secondly, generative linguistics is discussed; thirdly, the genesis and development of text linguistics are presented. Concerning structural linguistics, the key issues investigated by four linguists (namely, de Saussure, Benveniste, Hjelmslev and Bloomfield) are summarized. Concerning generative linguis…
Revisiting corpus creation and analysis tools for translation tasks
2016
Many translation scholars have proposed the use of corpora to allow professional translators to produce high-quality texts which read like originals. Yet the diffusion of this methodology has been modest, one reason being that software for corpus analysis has been developed with the linguist in mind, which means that it is generally complex and cumbersome, offering many advanced features but lacking the level of usability and the specific features that meet translators’ needs. To overcome this shortcoming, we have developed TranslatorBank, a free corpus creation and analysis tool designed for translation tasks. TranslatorBank supports the creation of specialized monolingual …
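As an illustration of the kind of query such a tool answers, here is a minimal keyword-in-context (KWIC) lookup over a toy corpus; the corpus contents and window size are made up and this is not TranslatorBank's implementation:

```python
corpus = [
    "the translator reviewed the source text carefully",
    "corpora help the translator choose natural collocations",
]

def kwic(corpus, keyword, window=3):
    """Yield each occurrence of keyword with `window` tokens of context."""
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                yield f"{left} [{tok}] {right}"

for line in kwic(corpus, "translator"):
    print(line)
```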
Building Construction Sets by Tiling Grammar Simplification
2016
This paper poses the problem of fabricating physical construction sets from example geometry: A construction set provides a small number of different types of building blocks from which the example model as well as many similar variants can be reassembled. This process is formalized by tiling grammars. Our core contribution is an approach for simplifying tiling grammars such that we obtain physically manufacturable building blocks of controllable granularity while retaining variability, i.e., the ability to construct many different, related shapes. Simplification is performed by sequences of two types of elementary operations: non-local joint edge collapses in the tile graphs reduce the gra…
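The effect of simplification on the block inventory can be illustrated abstractly: merging two tile types reduces the number of distinct building blocks at the cost of variability. A toy sketch of that trade-off only, not the paper's joint edge-collapse operation on tile graphs:

```python
from collections import Counter

# Tile instances of an example model, by (invented) type label.
tiles = ["A", "B", "A", "C", "B", "A"]

def merge_types(tiles, type_a, type_b, merged):
    """Replace two tile types with a single merged block type."""
    return [merged if t in (type_a, type_b) else t for t in tiles]

print(Counter(tiles))                               # 3 block types: A, B, C
print(Counter(merge_types(tiles, "A", "B", "AB")))  # 2 block types: AB, C
```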
Learning the structure of HMM's through grammatical inference techniques
2002
A technique is described in which all the components of a hidden Markov model are learnt from training speech data. The structure or topology of the model (i.e. the number of states and the actual transitions) is obtained by means of an error-correcting grammatical inference algorithm (ECGI). This structure is then reduced by using an appropriate state pruning criterion. The statistical parameters that are associated with the obtained topology are estimated from the same training data by means of the standard Baum-Welch algorithm. Experimental results showing the applicability of this technique to speech recognition are presented.
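The shape of the pipeline can be sketched roughly: derive a transition structure from training sequences, prune weakly supported parts, and then re-estimate parameters with Baum-Welch (not shown). The counting and the pruning criterion below are assumptions for illustration, not ECGI:

```python
from collections import defaultdict

def induce_structure(sequences, min_count=2):
    """Count observed transitions and keep only well-supported ones."""
    transitions = defaultdict(int)
    for seq in sequences:
        for a, b in zip(("<s>",) + tuple(seq), tuple(seq) + ("</s>",)):
            transitions[(a, b)] += 1
    # Assumed pruning criterion: drop transitions seen fewer than min_count times.
    return {edge: n for edge, n in transitions.items() if n >= min_count}

data = [("a", "b", "c"), ("a", "b", "b", "c"), ("a", "c")]
print(induce_structure(data))
# {('<s>', 'a'): 3, ('a', 'b'): 2, ('b', 'c'): 2, ('c', '</s>'): 3}
```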