Search results for "speech recognition"
showing 10 items of 357 documents
Are syllables phonological units in visual word recognition?
2004
A number of studies have shown that syllables play an important role in visual word recognition in Spanish. We report three lexical decision experiments with a masked priming technique that examined whether syllabic effects are phonological or orthographic in nature. In all cases, primes were nonwords. In Experiment 1, latencies to CV words were faster when primes and targets shared the first syllable (ju.nas-JU.NIO) than when they shared the initial letters but not the first syllable (jun.tu-JU.NIO). In Experiment 2, this syllabic overlap could be phonological+orthographical (vi.rel-VI.RUS) or just phonological (bi.rel-VI.RUS). A syllable priming effect was found for CV words in both the…
A case study on feature sensitivity for audio event classification using support vector machines
2016
Automatic recognition of multiple acoustic events is an interesting problem in machine listening that generalizes the classical speech/non-speech or speech/music classification problem. Typical audio streams contain a diversity of sound events that carry important and useful information on the acoustic environment and context. Classification is usually performed by means of hidden Markov models (HMMs) or support vector machines (SVMs) considering traditional sets of features based on Mel-frequency cepstral coefficients (MFCCs) and their temporal derivatives, as well as the energy from auditory-inspired filterbanks. However, while these features are routinely used by many systems, it is not …
Data Augmentation for Pipeline-Based Speech Translation
2020
Pipeline-based speech translation methods may suffer from errors found in speech recognition system output. Therefore, it is crucial that machine translation systems are trained to be robust against such noise. In this paper, we propose two methods for parallel data augmentation for pipeline-based speech translation system development. The first method utilises a speech processing workflow to introduce errors and the second method generates commonly found suffix errors using a rule-based method. We show that the methods in combination significantly improve speech translation quality by 1.87 BLEU points over a baseline system.
Semantic Word Error Rate for Sentence Similarity
2016
Sentence similarity measures have applications in several tasks, including: Machine Translation, Paraphrase Identification, Speech Recognition, Question-answering and Text Summarization. However, measures designed for these tasks are aimed at assessing equivalence rather than resemblance, partly departing from human cognition of similarity. While this is reasonable for these activities, it hinders the applicability of sentence similarity measures to other tasks. We therefore propose a new sentence similarity measure specifically designed for resemblance evaluation, in order to cover these fields better. Experimental results are discussed.
Domain-general neural correlates of dependency formation: Using complex tones to simulate language
2015
There is an ongoing debate whether the P600 event-related potential component following syntactic anomalies reflects syntactic processes per se, or if it is an instance of the P300, a domain-general ERP component associated with attention and cognitive reorientation. A direct comparison of both components is challenging because of the huge discrepancy in experimental designs and stimulus choice between language and ‘classic’ P300 experiments. In the present study, we develop a new approach to mimic the interplay of sequential position as well as categorical and relational information in natural language syntax (word category and agreement) in a non-linguistic target detection paradigm using…
Integration of acoustical information in the perception of impacted sound sources: The role of information accuracy and exploitability
2010
Sound sources are perceived by integrating information from multiple acoustical features. The factors influencing the integration of information are largely unknown. We measured how the perceptual weighting of different features varies with the accuracy of information and with a listener’s ability to exploit it. Participants judged the hardness of two objects whose interaction generates an impact sound: a hammer and a sounding object. In a first discrimination experiment, trained listeners focused on the most accurate information, although with greater difficulty when perceiving the hammer. We inferred a limited exploitability for the most accurate hammer-hardness information. In a second r…
Does Tonal Information Affect the Early Stages of Visual-Word Processing in Thai?
2014
Thai offers a unique opportunity to investigate the role of lexical tone processing during visual-word recognition, as tone is explicitly expressed in its script. In order to investigate the contribution of tone at the orthographic/phonological level during the early stages of word processing in Thai, we conducted a masked priming experiment—using both lexical decision and word naming tasks. For a given target word (e.g., ห้อง/hᴐ:ŋ2/, room), five priming conditions were created: (a) identity (e.g., ห้อง/hᴐ:ŋ2/), (b) same initial consonant, but with a different tone marker (e.g., ห่อง/hᴐ:ŋ1/), (c) different initial consonant, but with the same tone marker (e.g., ศ้อง/sᴐ:ŋ2/), (d) orthograph…
Does Top-Down Feedback Modulate the Encoding of Orthographic Representations During Visual-Word Recognition?
2016
In masked priming lexical decision experiments, there is a matched-case identity advantage for nonwords, but not for words (e.g., ERTAR-ERTAR < ertar-ERTAR; ALTAR-ALTAR = altar-ALTAR). This dissociation has been interpreted in terms of feedback from higher levels of processing during orthographic encoding. Here, we examined whether a matched-case identity advantage also occurs for words when top-down feedback is minimized. We employed a task that taps prelexical orthographic processes: the masked prime same-different task. For “same” trials, results showed faster response times for targets when preceded by a briefly presented matched-case identity prime than when preceded by …
Visual letter similarity effects during sentence reading: Evidence from the boundary technique
2018
The study of how the cognitive system encodes letter identities from the visual input has received much attention in models of visual word recognition but it has typically been overlooked in models of eye movement control in reading. Here we examined how visual letter similarity affects early word processing during reading using Rayner's (1975) boundary change technique in which the parafoveal preview of the target word was either identical (e.g., frito-frito [fried]) or a one-letter-different nonword (e.g., frjto-frito vs. frgto-frito). Critically, the substituted letter in the nonword was visually similar (based on letter confusability norms) or visually dissimilar. Results showed shorter…
Numerical simulation of glottal flow
2012
In cases of permanent immobility of both vocal folds, patients have difficulties with breathing but rarely with voicing. However, clinical experience shows that the shape of the larynx (voice box) seems to have a significant influence on the degree of airflow and breathing pattern. In order to find an optimal geometry of the larynx in terms of ease of breathing after the surgical change of vocal folds or false vocal cords (ventricular folds), a set of numerical simulations of glottal flow for weakly compressible Navier-Stokes equations has been performed. We compare airflow resistance and volumetric flow rate for several geometry concepts for inspiration as well as expiration. Finally, …