Search results for "Word error rate"
showing 10 items of 26 documents
Analyzing Learned Representations of a Deep ASR Performance Prediction Model
2018
This paper addresses a relatively new task: prediction of ASR performance on unseen broadcast programs. In a previous paper, we presented an ASR performance prediction system using CNNs that encode both text (ASR transcript) and speech, in order to predict word error rate. This work is dedicated to the analysis of speech signal embeddings and text embeddings learnt by the CNN while training our prediction model. We try to better understand which information is captured by the deep model and its relation with different conditioning factors. It is shown that hidden layers convey a clear signal about speech style, accent and broadcast type. We then try to leverage these 3 types of information …
Unbiased Estimators and Multilevel Monte Carlo
2018
Multilevel Monte Carlo (MLMC) and unbiased estimators recently proposed by McLeish (Monte Carlo Methods Appl., 2011) and Rhee and Glynn (Oper. Res., 2015) are closely related. This connection is elaborated by presenting a new general class of unbiased estimators, which admits previous debiasing schemes as special cases. New lower variance estimators are proposed, which are stratified versions of earlier unbiased schemes. Under general conditions, essentially when MLMC admits the canonical square root Monte Carlo error rate, the proposed new schemes are shown to be asymptotically as efficient as MLMC, both in terms of variance and cost. The experiments demonstrate that the variance reduction…
Multimodal biometric recognition systems using deep learning based on the finger vein and finger knuckle print fusion
2020
Recognition systems using multimodal biometrics attract attention because they improve recognition efficiency and provide a higher security level compared to unimodal biometric systems. In this study, the authors present a secure multimodal biometrics recognition system based on the deep learning method that uses convolutional neural networks (CNNs). The authors propose two multimodal architectures using the finger knuckle print (FKP) and the finger vein (FV) biometrics with different levels of fusion: feature-level fusion and score-level fusion. Feature extraction for FKP and FV is performed using transfer learning CNN architectures: AlexNet, VGG16, and ResNet50. The key step aims to …
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including: Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.
Multi-system machine translation using online APIs for English-Latvian
2015
This paper describes a hybrid machine translation (HMT) system that employs several online MT system application program interfaces (APIs) forming a MultiSystem Machine Translation (MSMT) approach. The goal is to improve the automated translation of English-Latvian texts over each of the individual MT APIs. The best hypothesis translation is selected by calculating the perplexity of each hypothesis. Experimental results show a slight improvement in BLEU score and WER (word error rate).
Spatial noise-aware temperature retrieval from infrared sounder data
2020
In this paper we present a combined strategy for the retrieval of atmospheric profiles from infrared sounders. The approach considers the spatial information and a noise-dependent dimensionality reduction approach. The extracted features are fed into a canonical linear regression. We compare Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) for dimensionality reduction, and study the compactness and information content of the extracted features. Assessment of the results is done on a big dataset covering many spatial and temporal situations. PCA is widely used for these purposes but our analysis shows that one can gain significant improvements of the error rates when using…
Tests for Differentiation in Gene Expression Using a Data-Driven Order or Weights for Hypotheses
2005
In the analysis of gene expression by microarrays there are usually few subjects, but high-dimensional data. By means of techniques, such as the theory of spherical tests or with suitable permutation tests, it is possible to sort the endpoints or to give weights to them according to specific criteria determined by the data while controlling the multiple type I error rate. The procedures developed so far are based on a sequential analysis of weighted p-values (corresponding to the endpoints), including the most extreme situation of weighting leading to a complete order of p-values. When the data for the endpoints have approximately equal variances, these procedures show good power properties…
Validation procedures in radiological diagnostic models. Neural network and logistic regression
1999
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validatio…
Overdispersion tests in count-data analysis.
2008
Count data are commonly assumed to have a Poisson distribution, especially when there is no diagnostic procedure for checking this assumption. However, count data rarely fit the restrictive assumptions of the Poisson distribution. Violation of these assumptions commonly results in overdispersion, which invalidates the Poisson distribution. Undetected overdispersion may lead to seriously misleading inferences, so its detection is essential. In this study, different overdispersion diagnostic tests are evaluated through two simulation studies. In Exp. 1, the nominal error rate is compared under different sample sizes and Λ conditions. Analysis shows a remarkable performance of the χ…
SisHiTra : A Hybrid Machine Translation System from Spanish to Catalan
2004
In the current European scenario, characterized by the coexistence of communities writing and speaking a great variety of languages, machine translation has become a technology of capital importance. In areas of Spain and of other countries, coofficiality of several languages implies producing several versions of public information. Machine translation between all the languages of the Iberian Peninsula and from them into English will allow for a better integration of Iberian linguistic communities among them and inside Europe. The purpose of this paper is to show a machine translation system from Spanish to Catalan that deals with text input. In our approach, both deductive (linguistic) and…