
AUTHOR

Francisco Casacuberta

SisHiTra: A Hybrid Machine Translation System from Spanish to Catalan

In the current European scenario, characterized by the coexistence of communities writing and speaking a great variety of languages, machine translation has become a technology of capital importance. In areas of Spain and of other countries, the co-official status of several languages means that public information must be produced in several versions. Machine translation between all the languages of the Iberian Peninsula, and from them into English, will allow for a better integration of Iberian linguistic communities with one another and within Europe. The purpose of this paper is to present a machine translation system from Spanish to Catalan that deals with text input. In our approach, both deductive (linguistic) and…

research product

A nonstationary model for the analysis of transient speech signals

In this correspondence, a model is presented for the analysis of transient speech signals. The model is based on a sum of the impulse responses corresponding to a number of poles with time-dependent parameters. The aim of this analysis is to obtain discriminative features of the different transient elements of speech.

research product

Speech-input multi-target machine translation

In order to simultaneously translate speech into multiple languages, an extension of stochastic finite-state transducers is proposed. In this approach, the speech translation model consists of a single network in which acoustic models (in the input) and the multilingual model (in the output) are embedded. The multi-target model has been evaluated in a practical situation, and the results have been compared with those obtained using several mono-target models. Experimental results show that the multi-target model requires less memory. In addition, a single decoding pass is enough to get the speech translated into multiple languages.
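The idea of one network producing several outputs in a single pass can be illustrated with a toy example. The sketch below is not the system described in the abstract: it is a deterministic, text-input simplification in which each transition carries one output string per target language. The transition table, the example sentence, the target languages, and the probabilities are all hypothetical, and the acoustic models and the search over competing paths that a real stochastic transducer needs are omitted.

```python
# Toy multi-target transducer (hypothetical data, for illustration only).
# state -> {input word: (next state, probability, (output for target 1, output for target 2))}
TRANSITIONS = {
    0: {"una": (1, 1.0, ("a", "una"))},
    # An empty output delays emission until word order can be resolved.
    1: {"habitación": (2, 0.5, ("", "camera"))},
    2: {"doble": (3, 0.5, ("double room", "doppia"))},
}
FINAL_STATES = {3}


def translate(words, transitions=TRANSITIONS, finals=FINAL_STATES, start=0, n_targets=2):
    """Follow one path through the network and collect all target outputs at once.

    Returns (list of target sentences, path probability), or None if the
    input is not accepted by the transducer.
    """
    state, prob = start, 1.0
    outputs = [[] for _ in range(n_targets)]
    for word in words:
        arcs = transitions.get(state, {})
        if word not in arcs:
            return None  # no transition for this input word
        state, p, outs = arcs[word]
        prob *= p
        for buffer, out in zip(outputs, outs):
            if out:  # skip empty (delayed) outputs
                buffer.append(out)
    if state not in finals:
        return None
    return [" ".join(buffer) for buffer in outputs], prob


sentences, probability = translate("una habitación doble".split())
print(sentences, probability)
```

A single traversal fills both output buffers, which mirrors the abstract's point that one decoding yields every target language, whereas separate mono-target models would each need their own network and their own pass.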

research product

A General Fuzzy-Parsing Scheme for Speech Recognition

In this paper, a Speech Recognition Methodology is proposed which is based on the general assumption of ‘fuzziness’ of both speech data and knowledge sources. Besides this general principle, there are other fundamental assumptions which also form the basis of the proposed methodology: ‘Modularity’ in the knowledge organization, ‘Homogeneity’ in the representation of data and knowledge, ‘Passiveness’ of the ‘understanding flow’ (no backtracking or feedback), and ‘Parallelism’ in the recognition activity.

research product

On the use of a metric-space search algorithm (AESA) for fast DTW-based recognition of isolated words

The approximating and eliminating search algorithm (AESA) was recently introduced for finding nearest neighbors in metric spaces. Although the AESA was originally developed for reducing the time complexity of dynamic time-warping isolated word recognition (DTW-IWR), only rather limited experiments had previously been carried out to check its performance in this task. A set of experiments aimed at filling this gap is reported. The main results show that the important features observed in previous simulation experiments also hold for real speech samples. With single-speaker dictionaries of up to 200 words, and for most of the different speech parameterizations, local metrics, a…
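The approximating and eliminating idea can be sketched in a few lines. The code below is a minimal illustration, not the implementation evaluated in the paper: it assumes a precomputed matrix of prototype-to-prototype distances and a dissimilarity that satisfies the triangle inequality, and it uses a simple "smallest lower bound first" selection rule. The function name `aesa_search` and its parameters are hypothetical.

```python
def aesa_search(query, prototypes, dist, pairwise):
    """Nearest-neighbor search in the spirit of AESA (illustrative sketch).

    pairwise[i][j] must hold dist(prototypes[i], prototypes[j]), computed
    in advance. Returns (best index, best distance, number of distance
    computations actually performed against the query).
    """
    alive = set(range(len(prototypes)))
    lower = [0.0] * len(prototypes)  # lower bounds on dist(query, prototype)
    best_idx, best_d = None, float("inf")
    computed = 0
    while alive:
        # Approximating step: pick the most promising remaining prototype.
        s = min(alive, key=lambda i: (lower[i], i))
        alive.remove(s)
        d = dist(query, prototypes[s])
        computed += 1
        if d < best_d:
            best_idx, best_d = s, d
        # Eliminating step: the triangle inequality gives
        # dist(query, p_i) >= |dist(query, p_s) - dist(p_s, p_i)|.
        for i in list(alive):
            lower[i] = max(lower[i], abs(d - pairwise[s][i]))
            if lower[i] >= best_d:
                alive.remove(i)  # cannot beat the current best
    return best_idx, best_d, computed


# Usage with a toy metric (absolute difference on the real line).
points = [0.0, 2.0, 5.0, 9.0, 14.0]
metric = lambda a, b: abs(a - b)
matrix = [[metric(a, b) for b in points] for a in points]
print(aesa_search(6.0, points, metric, matrix))
```

The elimination step is only guaranteed to be safe when the dissimilarity is a true metric. With DTW, which is the setting of the abstract, the bound is heuristic unless the measure used happens to satisfy the triangle inequality on the data.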

research product

An integrated architecture for speech-input multi-target machine translation

The aim of this work is to show the ability of finite-state transducers to simultaneously translate speech into multiple languages. Our proposal deals with an extension of stochastic finite-state transducers that can produce more than one output at the same time. Devices of this kind offer great versatility for integration with other finite-state devices, such as acoustic models, in order to produce a speech translation system. This proposal has been evaluated in a practical situation, and its results have been compared with those obtained using a standard mono-target speech transducer.

research product

On the metric properties of dynamic time warping

Recently, some new and promising methods have been proposed to reduce the number of Dynamic Time Warping (DTW) computations in Isolated Word Recognition. For these methods to be properly applicable, it seems to be an important prerequisite that the DTW-based Dissimilarity Measure used satisfies the Triangle Inequality (TI).
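Whether a given DTW-based measure satisfies the triangle inequality on a set of samples can be checked empirically. The sketch below assumes a textbook DTW with an absolute-difference local cost and no slope constraints or normalization, which is not necessarily the measure studied in the paper. The helper `triangle_violations` is a hypothetical name introduced here.

```python
from itertools import permutations


def dtw(a, b):
    """Unconstrained DTW dissimilarity between two numeric sequences,
    using |x - y| as the local cost (an assumption of this sketch)."""
    n, m = len(a), len(b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def triangle_violations(seqs):
    """Return index triples (i, j, k) with dtw(i, k) > dtw(i, j) + dtw(j, k)."""
    d = [[dtw(x, y) for y in seqs] for x in seqs]
    return [
        (i, j, k)
        for i, j, k in permutations(range(len(seqs)), 3)
        if d[i][k] > d[i][j] + d[j][k] + 1e-12
    ]


# A small hand-made example on which this plain DTW violates the inequality:
# dtw([0], [1, 1, 1]) = 3, but dtw([0], [0, 1]) + dtw([0, 1], [1, 1, 1]) = 2.
samples = [[0], [0, 1], [1, 1, 1]]
print(triangle_violations(samples))  # → [(0, 1, 2), (2, 1, 0)]
```

On real speech data the interesting quantity is how often, and by how much, such violations occur, since that determines how reliable triangle-inequality-based pruning is in practice.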

research product

Learning the structure of HMM's through grammatical inference techniques

A technique is described in which all the components of a hidden Markov model are learnt from training speech data. The structure or topology of the model (i.e. the number of states and the actual transitions) is obtained by means of an error-correcting grammatical inference algorithm (ECGI). This structure is then reduced by using an appropriate state pruning criterion. The statistical parameters that are associated with the obtained topology are estimated from the same training data by means of the standard Baum-Welch algorithm. Experimental results showing the applicability of this technique to speech recognition are presented.

research product