Search results for "speech"
showing 10 items of 1281 documents
Algorithmic Aspects of Speech Recognition: A Synopsis
2000
Speech recognition is an area with a sizable literature, but there is little discussion of the topic within the computer science algorithms community. Since many of the problems arising in speech recognition are well suited for algorithmic studies, we present them in terms familiar to algorithm designers. Such cross-fertilization can breed fresh insights from new perspectives. This material is abstracted from A. L. Buchsbaum and R. Giancarlo, Algorithmic Aspects of Speech Recognition: An Introduction, ACM Journal of Experimental Algorithmics, Vol. 2, 1997, http://www.jea.acm.org.
What's the difference? Comparing humans and machines on the Aurora 2 speech recognition task
2013
Processing Continuous Speech in Infancy
2016
The present chapter focuses on fluent speech segmentation abilities in early language development. We first review studies exploring the early use of major prosodic boundary cues which allow infants to cut full utterances into smaller-sized sequences like clauses or phrases. We then summarize studies showing that word segmentation abilities emerge around 8 months, and rely on infants’ processing of various bottom-up word boundary cues and top-down known word recognition cues. Given that most of these cues are specific to the language infants are acquiring, we emphasize how the development of these abilities varies cross-linguistically, and explore their developmental origin. In particular, …
Does training in syllable recognition improve reading speed? A computer-based trial with poor readers from second and third grade.
2013
Repeated reading of infrequent syllables has been shown to increase reading speed at the word level in a transparent orthography. This study confirms these results with a computer-based training method and extends them by comparing the training effects of short syllables and long frequent and infrequent syllables, controlling for rapid automatized naming. Our results, based on a sample of 150 poor readers of Finnish, showed clear gains in reading speed regarding all trained syllables, but a transfer effect to the word level was evident only in the case of long infrequent syllables. Rapid automatized naming was associated with initial reading speed, but not with the training effect.
Tempo Induction from Music Recordings Using Ensemble Empirical Mode Decomposition Analysis
2011
Tempo and beat are among the most important features of Western music. Owing to the perceptual nature of tempo, its automatic analysis and extraction remains a difficult task for a large variety of music genres. Western music notation represents musical events using a hierarchical metrical structure distinguishing different time scales. This hierarchy is often modeled using three levels: the tatum, the tactus, and the measure. The tatum represents the shortest durational value in music that is not just an accidental phenomenon (Bilmes 1993). The tactus period is the most perceptually prominent period, and is the period at which most humans would tap their feet in time with the music (Lerdah…
Automatic fitting of cochlear implants with evolutionary algorithms
2004
This paper presents an optimisation algorithm designed to perform in-situ automatic fitting of cochlear implants. All patients are different, which means that cochlear parametrisation is a difficult and long task, with results ranging from perfect blind speech recognition to patients who cannot make anything out of their implant and just turn it off. The proposed method combines evolutionary algorithms and medical expertise to achieve autonomous interactive fitting through a Personal Digital Assistant (PDA).
Non-speech voice for sonic interaction: a catalogue
2016
This paper surveys the uses of non-speech voice as an interaction modality within sonic applications. Three main contexts of use have been identified: sound retrieval, sound synthesis and control, and sound design. An overview of different choices and techniques regarding the style of interaction, the selection of vocal features and their mapping to sound features or controls is presented here. A comprehensive collection of examples instantiates the use of non-speech voice in actual tools for sonic interaction. It is pointed out that while voice-based techniques are already being used proficiently in sound retrieval and sound synthesis, their use in sound design is still at an exploratory p…
ERP qualification exploiting waveform, spectral and time-frequency infomax
2008
The present contribution briefly introduces an event-related potential (ERP) detector. The detector uses three kinds of ERP features: the waveform feature, the spectral feature and the time-frequency feature. According to these characteristics, two parameters are defined to reflect the timing feature of ERP. The mismatch negativity (MMN) is taken as an example to design an exact qualification detector. The experiment validates that the computer can automatically detect the raw trace to reflect the quality of the dataset, qualify the filtered trace to test whether the artifacts have been filtered out, and select the ERP-like component to reject art…
Analyse des Visuellen Klassifikationssystems Durch Detektionsexperimente (Analysis of the Visual Classification System Through Detection Experiments)
1977
Summary: Experiments on recognizing statistically distorted patterns show that the human visual system operates as a linear classifier. The spatial frequency range, within which features are extracted, is determined by the coupling in the area of sharpest vision (2°). The relevant features for classifying patterns are not produced by isotropic filtering
A Sub-Symbolic Approach to Word Modelling for Domain Specific Speech Recognition
2006
In this work, a sub-symbolic technique for automatic, data-driven language model construction is presented. Such a technique can be used to build a language-modelling module that can be easily integrated into existing speech recognition architectures, such as the well-known HTK architecture. The proposed technique takes advantage of both the traditional LSA approach and a novel application of a probability space metric known as "Hellinger's distance". Experimental trials are also presented to validate the proposed approach.
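The last abstract above names the Hellinger distance without defining it. As a minimal sketch, the standard definition for two discrete probability distributions p and q is H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which ranges from 0 (identical) to 1 (disjoint support). The abstract does not say how the authors apply the metric, so the function name and the toy word distributions below are illustrative assumptions, not the paper's method.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    p and q are equal-length sequences of non-negative numbers that each
    sum to 1. The function name is hypothetical, chosen for illustration.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the probabilities.
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    # The factor 1/2 scales the result into the interval [0, 1].
    return math.sqrt(squared / 2.0)


# Toy word-frequency distributions over the same three-word vocabulary (assumed data).
doc_a = [0.5, 0.3, 0.2]
doc_b = [0.2, 0.3, 0.5]
distance = hellinger_distance(doc_a, doc_b)
```

Because the measure works on square roots of probabilities, it is symmetric in its arguments and bounded, which makes distances between different pairs of distributions directly comparable.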