Search results for "Speech recognition"
showing 10 items of 357 documents
Semantic structures of timbre emerging from social and acoustic descriptions of music
2011
The perceptual attributes of timbre have inspired a considerable amount of multidisciplinary research, but because of the complexity of the phenomena, the approach has traditionally been confined to laboratory conditions, much to the detriment of its ecological validity. In this study, we present a purely bottom-up approach for mapping the concepts that emerge from sound qualities. A social media platform (http://www.last.fm) is used to obtain a wide sample of verbal descriptions of music (in the form of tags) that go beyond the commonly studied concept of genre, and from this the underlying semantic structure of this sample is extracted. The structure that is thereby obtained is then evaluated th…
Combining gestures and vocalizations to imitate sounds
2015
Communicating about sounds is a difficult task without a technical language, and naïve speakers often rely on different kinds of non-linguistic vocalizations and body gestures (Lemaitre et al. 2014). Previous work has independently studied how effectively people describe sounds with gestures or vocalizations (Caramiaux, 2014; Lemaitre and Rocchesso, 2014). However, speech communication studies suggest a more intimate link between the two processes (Kendon, 2004). Our study thus focused on the combination of manual gestures and non-speech vocalizations in the communication of sounds. We first collected a large database of vocal and gestural imitations of a variety of …
Comparing identification of vocal imitations and computational sketches of everyday sounds
2016
Sounds are notably difficult to describe. It is thus not surprising that human speakers often use many imitative vocalizations to communicate about sounds. In practice, vocal imitations of non-speech everyday sounds (e.g. the sound of a car passing by) are very effective: listeners identify sounds better with vocal imitations than with verbal descriptions, despite the fact that vocal imitations are often inaccurate, constrained by the human vocal apparatus. The present study investigated the semantic representations evoked by vocal imitations by experimentally quantifying how well listeners could match sounds to category labels. It compared two different types of sounds…
2016
The flexible access to information in working memory is crucial for adaptive behavior. It is assumed that this is realized by switching the focus of attention within working memory. Switching of attention is mirrored in the P3a component of the human event-related brain potential (ERP) and it has been argued that the processes reflected by the P3a are also relevant for selecting information within working memory. The aim of the present study was to further evaluate whether the P3a mirrors genuine switching of attention within working memory by applying an object switching task: Participants updated a memory list of four digits either by replacing one item with another digit or by processing…
2013
Distraction of goal-oriented performance by a sudden change in the auditory environment is an everyday life experience. Different types of changes can be distracting, including a sudden onset of a transient sound and a slight deviation of otherwise regular auditory background stimulation. With regard to deviance detection, it is assumed that slight changes in a continuous sequence of auditory stimuli are detected by a predictive coding mechanism and it has been demonstrated that this mechanism is capable of distracting ongoing task performance. In contrast, it is open whether transient detection – which does not rely on predictive coding mechanisms – can trigger behavioral distraction, too…
SINGLE-TRIAL BASED INDEPENDENT COMPONENT ANALYSIS ON MISMATCH NEGATIVITY IN CHILDREN
2010
Independent component analysis (ICA) does not follow the superposition rule. This motivates us to study a negative event-related potential, mismatch negativity (MMN), estimated by the single-trial based ICA (sICA) and averaged trace based ICA (aICA), respectively. For sICA, an optimal digital filter (ODF) was used to remove low-frequency noise. As a result, this study demonstrates that the performance of the sICA+ODF and aICA could be different. Moreover, MMN under sICA+ODF fits better with the theoretical expectation, i.e., a larger deviant elicits a larger MMN peak amplitude.
ERP correlates of transposed-letter priming effects: The role of vowels versus consonants
2008
One key issue for any computational model of visual-word recognition is the choice of an input coding scheme for assigning letter position. Recent research has shown that pseudowords created by transposing two letters are very effective at activating the lexical representation of their base words (e.g., relovution activates REVOLUTION). We report a masked priming lexical decision experiment in which the pseudoword primes were created by transposing/replacing two consonants or two vowels while event-related potentials were recorded. The results showed a modulation of the amplitude at an early window (150-250 ms) and at the N400 component for vowels but not for consonant transpositions. In ad…
Can colours be used to segment words when reading?
2015
Rayner, Fischer, and Pollatsek (1998, Vision Research) demonstrated that reading unspaced text in Indo-European languages produces a substantial reading cost in word identification (as deduced from an increased word-frequency effect on target words embedded in the unspaced vs. spaced sentences) and in eye movement guidance (as deduced from landing sites closer to the beginning of the words in unspaced sentences). However, the addition of spaces between words comes with a cost: nearby words may fall outside high-acuity central vision, thus reducing the potential benefits of parafoveal processing. In the present experiment, we introduced a salient visual cue intended to facilitate the process…
The external frame function in the control of pitch, register, and singing mode: Radiographic observations of a female singer
1999
This study investigates pitch control, register, and singing mode related movements of the laryngo-pharyngeal structures by radiographic methods. One trained female singer served as the subject. The results show that singing voice production involves complex movements in the laryngeal structures. Pitch related increase in the thyro-arytenoid distance (vocal fold length) is nonlinear, slowing down as pitch rises. Similar observations have been made earlier. At the highest pitches, a shortening of the distance can be seen, suggesting the use of alternative pitch control mechanisms. The various observations made support the existence of three registers in this trained female singing vo…
Dissociating spatial and letter-based word length effects observed in readers’ eye movement patterns
2011
In previous eye movement research on word length effects, spatial width has been confounded with the number of letters. McDonald (2006) unconfounded these factors by rendering all words in sentences in constant spatial width. In the present study, the Arial font with proportional letter spacing was used for varying the number of letters while equating for spatial width, while the Courier font with monospaced letter spacing was used to measure the contribution of spatial width to the observed word length effect. Number of letters in words affected single fixation duration on target words, whereas words’ spatial width determined fixation locations in words and the probability of skipping a wo…