Search results for "Speech recognition"
showing 10 items of 357 documents
Exploring Frequency-Dependent Brain Networks from Ongoing EEG Using Spatial ICA During Music Listening
2020
Recently, exploring brain activity based on functional networks during naturalistic stimuli, especially music and video, has become an attractive challenge because of the low signal-to-noise ratio in collected brain data. Although most efforts to explore the listening brain have relied on functional magnetic resonance imaging (fMRI) and sensor-level electro- or magnetoencephalography (EEG/MEG), little is known about how neural rhythms are involved in brain network activity under naturalistic stimuli. This study exploited cortical oscillations through analysis of ongoing EEG and musical features during free listening to music. We used a data-driven method that co…
Facial geometry and speech analysis for depression detection.
2017
Depression is one of the most prevalent mental disorders, burdening many people worldwide. A system with the potential to serve as a decision support system is proposed, based on novel features extracted from facial expression geometry and speech, by interpreting non-verbal manifestations of depression. The proposed system has been tested in both gender-independent and gender-based modes, and with different fusion methods. The algorithms were evaluated for several combinations of parameters and classification schemes, on the dataset provided by the Audio/Visual Emotion Challenge of 2013 and 2014. The proposed framework achieved a precision of 94.8% for detecting persons achieving high sc…
Effect of parametric variation of center frequency and bandwidth of Morlet wavelet transform on time-frequency analysis of event-related potentials
2017
Time-frequency (TF) analysis of event-related potentials (ERPs) using the Complex Morlet Wavelet Transform has been widely applied in cognitive neuroscience research. It has been widely suggested that the center frequency (fc) and bandwidth (σ) should be considered in defining the mother wavelet. However, how parametric variation of fc and σ in the Morlet wavelet transform influences ERP time-frequency results has not been extensively discussed in previous research. The current study, adopting the Complex Morlet Continuous Wavelet Transform (CMCWT), aims to investigate whether time-frequency results vary with different parametric settings of fc and σ. Besides, …
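To make the role of the two parameters in the abstract above concrete, the following is a minimal sketch of a complex Morlet wavelet. It assumes one common parameterization, in which the width parameter is the standard deviation of the Gaussian envelope in seconds; the paper's own definition of σ may differ, and normalization constants are omitted. The names `complex_morlet`, `spectral_sd`, and `sigma_t` are illustrative, not taken from the paper.

```python
import cmath
import math


def complex_morlet(t, fc, sigma_t):
    """Unnormalized complex Morlet wavelet at time t (seconds).

    fc      -- center frequency in Hz
    sigma_t -- ASSUMPTION: standard deviation of the Gaussian envelope in
               seconds (the paper's sigma may be parameterized differently)
    """
    envelope = math.exp(-t * t / (2.0 * sigma_t * sigma_t))
    carrier = cmath.exp(2j * math.pi * fc * t)
    return envelope * carrier


def spectral_sd(sigma_t):
    """Standard deviation (Hz) of the Gaussian spectrum of the envelope above."""
    return 1.0 / (2.0 * math.pi * sigma_t)


# A wider envelope in time gives a narrower spectrum, and vice versa.
for sigma_t in (0.05, 0.1, 0.2):
    print(sigma_t, round(spectral_sd(sigma_t), 3))
```

Under this assumption, increasing the envelope width sharpens frequency resolution at the cost of temporal resolution, which is one plausible reason that time-frequency results depend on the chosen settings.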
Filtering of Spontaneous and Low Intensity Emotions in Educational Contexts
2015
Affect detection is a challenging problem, even more so in educational contexts, where emotions are spontaneous and usually subtle. In this paper, we propose a two-stage detection approach based on an initial binary discretization followed by a specific emotion prediction stage. The binary classification method uses several distinct sources of information to detect and filter time slots that are relevant from an affective point of view. An accuracy close to 75% was obtained at detecting whether the learner felt an educationally relevant emotion in 20-second time slots. These slots can then be further analyzed by a second classifier to determine the specific user emotion.
Letter Position Coding Across Modalities: The Case of Braille Readers
2012
Background: The question of how the brain encodes letter position in written words has attracted increasing attention in recent years. A number of models have recently been proposed to accommodate the fact that transposed-letter stimuli like jugde or caniso are perceptually very close to their base words. Methodology: Here we examined how letter position coding is attained in the tactile modality via Braille reading. The idea is that Braille word recognition may provide more serial processing than the visual modality, and this may produce differences in the input coding schemes employed to encode letters in written words. To that end, we conducted a lexical decision experiment with adult Braille…
Vector representation of non-standard spellings using dynamic time warping and a denoising autoencoder
2017
The presence of non-standard spellings in Twitter causes challenges for many natural language processing tasks. Traditional approaches mainly regard the problem as a translation, spell-checking, or speech recognition problem. This paper proposes a method that represents the stochastic relationship between words and their non-standard versions in real vectors. The method uses dynamic time warping to preprocess the non-standard spellings and an autoencoder to derive the vector representation. The derived vectors encode word patterns and the Euclidean distance between the vectors represents a distance in the word space that challenges the prevailing edit distance. After training the autoencoder o…
On the use of a metric-space search algorithm (AESA) for fast DTW-based recognition of isolated words
1988
The approximating and eliminating search algorithm (AESA) presented was recently introduced for finding nearest neighbors in metric spaces. Although the AESA was originally developed for reducing the time complexity of dynamic time-warping isolated word recognition (DTW-IWR), only rather limited experiments had been previously carried out to check its performance in this task. A set of experiments aimed at filling this gap is reported. The main results show that the important features reflected in previous simulation experiments are also true for real speech samples. With single-speaker dictionaries of up to 200 words, and for most of the different speech parameterizations, local metrics, a…
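For orientation, the task described in the abstract above can be sketched as a textbook dynamic time warping (DTW) distance combined with an exhaustive nearest-neighbour search over a word dictionary. This is the brute-force baseline that a search algorithm such as AESA aims to speed up; it is not an implementation of AESA itself. The function names, the scalar feature sequences, and the absolute-difference local metric are illustrative assumptions, not details from the paper.

```python
def dtw_distance(a, b, local_cost=lambda x, y: abs(x - y)):
    """Textbook DTW distance between sequences a and b, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_cost(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def nearest_word(query, dictionary):
    """Exhaustive search: one DTW computation per dictionary entry."""
    return min(dictionary, key=lambda word: dtw_distance(query, dictionary[word]))


# Hypothetical one-dimensional "templates"; real systems use feature vectors.
templates = {"up": [1, 3, 5], "down": [5, 3, 1]}
print(nearest_word([1, 1, 3, 5, 5], templates))
```

The exhaustive search costs one DTW evaluation per template, which is the quantity a faster nearest-neighbour method would try to reduce.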
Electrophysiological evidence for change detection in speech sound patterns by anesthetized rats
2014
Human infants are able to detect changes in grammatical rules in a speech sound stream. Here, we tested whether rats have a comparable ability by using an electrophysiological measure that has been shown to reflect higher-order auditory cognition even before it becomes manifest at the behavioral level. Urethane-anesthetized rats were presented with a stream of sequences consisting of three pseudowords presented at a fast pace. Frequently presented “standard” sequences had 16 variants which all had the same structure. They were occasionally replaced by acoustically novel “deviant” sequences of two different types: structurally consistent and inconsistent sequences. Two stimulus conditions we…
Design and Implementation of Deep Learning Based Contactless Authentication System Using Hand Gestures
2021
Hand-gesture-based sign language digits have several contactless applications. Applications include communication for impaired people, such as elderly and disabled people, health-care applications, automotive user interfaces, and security and surveillance. This work presents the design and implementation of a complete end-to-end deep-learning-based edge computing system that can verify a user contactlessly using ‘…
Exploring relationships between audio features and emotion in music
2009
In this paper, we present an analysis of the associations between emotion categories and audio features automatically extracted from raw audio data. This work is based on 110 excerpts from film soundtracks evaluated by 116 listeners. This data is annotated with 5 basic emotions (fear, anger, happiness, sadness, tenderness) on a 7-point scale. Exploiting state-of-the-art Music Information Retrieval (MIR) techniques, we extract audio features of different kinds: timbral, rhythmic and tonal. Among others, we also compute estimations of dissonance, mode, onset rate and loudness. We study statistical relations between audio descriptors and emotion categories confirming results from psychological …