Search results for "speech recognition"
Showing 10 of 357 documents
Multi-subject fMRI analysis via combined independent component analysis and shift-invariant canonical polyadic decomposition
2014
Canonical polyadic decomposition (CPD) may face a local optima problem when analyzing multi-subject fMRI data with inter-subject variability. Beckmann and Smith proposed a tensor PICA approach that incorporated an independence constraint on the spatial modality by combining CPD with ICA, alleviating the problem of inter-subject spatial map (SM) variability. This study extends tensor PICA to incorporate additional inter-subject time course (TC) variability and to connect CPD and ICA in a new way. Assuming multiple subjects share common TCs but with different time delays, we accommodate subject-dependent TC delays into the CP model based on the idea of shift-invariant CP (SCP). We use ICA …
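The central modeling assumption in this abstract, common time courses shared across subjects up to a subject-specific delay, can be illustrated with a small synthetic example. The NumPy sketch below (illustrative names and shapes, not the authors' implementation) builds each subject's voxels-by-time matrix from shared spatial maps and delayed copies of shared time courses; actually fitting such a model would require the SCP/ICA estimation procedure outlined in the abstract.

    # Synthesis sketch of the shift-invariant CP (SCP) idea: shared spatial maps and
    # shared time courses, with the time courses circularly delayed per subject.
    import numpy as np

    rng = np.random.default_rng(0)
    n_voxels, n_time, n_comp, n_subjects = 500, 120, 3, 4

    A = rng.standard_normal((n_voxels, n_comp))      # shared spatial maps (SMs)
    B = rng.standard_normal((n_time, n_comp))        # shared time courses (TCs)
    C = rng.uniform(0.5, 1.5, (n_subjects, n_comp))  # subject loadings
    tau = rng.integers(0, 5, (n_subjects, n_comp))   # subject-specific TC delays (samples)

    def subject_data(k):
        # Delay each shared TC by the subject's lag before mixing (the SCP assumption).
        B_k = np.column_stack([np.roll(B[:, r], tau[k, r]) for r in range(n_comp)])
        return A @ np.diag(C[k]) @ B_k.T             # voxels x time for subject k

    X = [subject_data(k) for k in range(n_subjects)]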
Extraction of the mismatch negativity elicited by sound duration decrements: A comparison of three procedures
2009
This study focuses on a comparison of procedures for extracting brain event-related potentials (ERPs), i.e., brain responses to stimuli recorded using electroencephalography (EEG). These responses are used to study how the synchronization of brain electrical responses is associated with cognition, such as how the brain detects changes in the auditory world. One such event-related response to auditory change is called the mismatch negativity (MMN). It is typically observed by computing a difference wave between ERPs elicited by a frequently repeated sound and ERPs elicited by an infrequently occurring sound that differs from the repeated sounds. Fast and reliable extraction of the ERPs, such as the…
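The difference-wave computation described here is simple enough to state directly. A minimal NumPy sketch follows, with hypothetical epoch arrays (trials by samples); a real pipeline would first filter, epoch, baseline-correct, and reject artifacts before averaging.

    # Difference wave for the MMN: average standard and deviant epochs, then subtract.
    import numpy as np

    rng = np.random.default_rng(1)
    n_std, n_dev, n_samples = 400, 80, 300           # standard/deviant trials, samples per epoch

    standard_epochs = rng.standard_normal((n_std, n_samples))
    deviant_epochs = rng.standard_normal((n_dev, n_samples))

    erp_standard = standard_epochs.mean(axis=0)      # ERP to the frequent (standard) sound
    erp_deviant = deviant_epochs.mean(axis=0)        # ERP to the rare (deviant) sound
    mmn_wave = erp_deviant - erp_standard            # difference wave carrying the MMN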
Compensating for instantaneous signal mixing in transfer entropy analysis of neurobiological time series
2013
Transfer entropy (TE) has recently emerged as a nonlinear, model-free tool, framed in information theory, to detect directed interactions in coupled processes. Unfortunately, when applied to neurobiological time series, TE is biased by signal cross-talk due to volume conduction. To compensate for this bias, in this study we introduce a modified TE measure which accounts for possible instantaneous effects between the analyzed time series. The new measure, denoted compensated TE (cTE), is tested on simulated time series reproducing conditions typical of neuroscience applications, and on real magnetoencephalographic (MEG) multi-trial data measured during a visuo-tactile cognitive experime…
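For reference, the standard (uncompensated) transfer entropy from a source X to a target Y can be written as a conditional mutual information between the present of the target and the past of the source, given the past of the target:

    \mathrm{TE}_{X \to Y} = I\left(Y_n ;\, X_n^{-} \mid Y_n^{-}\right)
      = \sum p\left(y_n, y_n^{-}, x_n^{-}\right)\,
        \log \frac{p\left(y_n \mid y_n^{-}, x_n^{-}\right)}{p\left(y_n \mid y_n^{-}\right)}

where X_n^- and Y_n^- denote the past states of the source and target. The compensated measure (cTE) described in this abstract additionally accounts for instantaneous (zero-lag) dependencies between the series, for example by including the present source sample in the conditioning set; the exact construction is the paper's contribution and is not reproduced here.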
RIGOTRIO at SemEval-2017 Task 9: Combining Machine Learning and Grammar Engineering for AMR Parsing and Generation
2017
By addressing both text-to-AMR parsing and AMR-to-text generation, SemEval-2017 Task 9 established AMR as a powerful semantic interlingua. We strengthen the interlingual aspect of AMR by applying the multilingual Grammatical Framework (GF) for AMR-to-text generation. Our current rule-based GF approach completely covered only 12.3% of the test AMRs; therefore, we combined it with the state-of-the-art JAMR Generator to see if the combination increases or decreases the overall performance. The combined system achieved an automatic BLEU score of 18.82 and a human TrueSkill score of 107.2, to be compared to the plain JAMR Generator results. As for AMR parsing, we added NER extensions to our SemEva…
The resemblance of an autocorrelation function to a power spectrum density for a spike train of an auditory model
2013
In this work we develop an analytical approach for the calculation of the all-order interspike interval density (AOISID), show its connection with the autocorrelation function, and try to explain the discovered resemblance of the AOISID to the power spectrum of the same spike train.
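The connection mentioned in this abstract can be checked numerically: the all-order ISI density is, up to binning and edge effects, the histogram of all forward spike-time differences, which is what the positive-lag autocorrelation of the binned spike train also counts. A small NumPy sketch with made-up spike times:

    # Compare the all-order ISI histogram with the positive-lag autocorrelation counts.
    import numpy as np

    rng = np.random.default_rng(2)
    spikes = np.cumsum(rng.exponential(scale=10.0, size=2000))   # spike times in ms
    bin_ms, max_lag_ms = 1.0, 100.0

    # All-order interspike intervals: every forward difference between spike times.
    diffs = spikes[:, None] - spikes[None, :]
    all_order_isi = diffs[(diffs > 0) & (diffs < max_lag_ms)]
    aoisid, _ = np.histogram(all_order_isi, bins=int(max_lag_ms / bin_ms), range=(0, max_lag_ms))

    # Autocorrelation of the binned spike train at positive lags up to max_lag.
    train, _ = np.histogram(spikes, bins=np.arange(0, spikes[-1] + bin_ms, bin_ms))
    lags = int(max_lag_ms / bin_ms)
    autocorr = np.array([np.dot(train[:-k], train[k:]) for k in range(1, lags + 1)])
    # Up to discretization, `aoisid` and `autocorr` count the same spike pairs.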
How to validate similarity in linear transform models of event-related potentials between experimental conditions?
2014
Background: It is well known that data of event-related potentials (ERPs) conform to the linear transform model (LTM). For group-level ERP data processing using principal/independent component analysis (PCA/ICA), ERP data of different experimental conditions and different participants are often concatenated. It is theoretically assumed that different experimental conditions and different participants possess the same LTM. However, how to validate this assumption has seldom been reported in terms of signal processing methods. New method: When ICA decomposition is globally optimized for ERP data of one stimulus, we gain the ratio between two coefficients mapping a source in the brain to two…
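A minimal sketch of the setting this abstract describes, assuming hypothetical channel-by-time ERP arrays: data from two conditions are concatenated under the shared-LTM assumption and decomposed with ICA, after which the columns of the estimated mixing matrix (mapping each brain source to the electrodes) are available for the kind of coefficient comparison the paper proposes. The paper's specific ratio-based validation criterion is not reproduced here.

    # Concatenate two conditions, estimate the LTM (mixing matrix) with ICA.
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(3)
    n_channels, n_samples = 32, 600
    erp_cond1 = rng.standard_normal((n_channels, n_samples))   # channels x time, condition 1
    erp_cond2 = rng.standard_normal((n_channels, n_samples))   # channels x time, condition 2

    concatenated = np.hstack([erp_cond1, erp_cond2])           # same LTM assumed for both
    ica = FastICA(n_components=10, random_state=0)
    sources = ica.fit_transform(concatenated.T).T              # components x time
    mixing = ica.mixing_                                        # channels x components (LTM estimate)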
Head movements in Finnish Sign Language on the basis of Motion Capture data
2015
This paper reports a study of the forms and functions of head movements produced in the dimension of depth in Finnish Sign Language (FinSL). Specifically, the paper describes and analyzes the phonetic forms and prosodic, grammatical, communicative, and textual functions of nods, head thrusts, nodding, and head pulls occurring in FinSL data consisting of a continuous dialogue recorded with motion capture technology. The analysis yields a novel classification of the kinematic characteristics and functional properties of the four types of head movement. However, it also reveals that there is no perfect correspondence between form and function in the head movements investigated.
Cognitive factors in the evaluation of synthetic speech
1998
This paper illustrates the importance of various cognitive factors involved in perceiving and comprehending synthetic speech. It includes findings drawn from the relevant psychological and psycholinguistic literature together with experimental results obtained at the Fondazione Ugo Bordoni laboratory. Overall, it is shown that listening to and comprehending synthetic voices is more difficult than with a natural voice. However, and more importantly, this difficulty can and does decrease with the subjects' exposure to said synthetic voices. Furthermore, greater workload demands are associated with synthetic speech, and subjects listening to synthetic passages are required to pay more …
Modeling Listeners’ Emotional Response to Music
2012
An overview of the computational prediction of emotional responses to music is presented. Communication of emotions by music has received a great deal of attention in recent years, and a large number of empirical studies have described the role of individual features (tempo, mode, articulation, timbre) in predicting the emotions suggested or invoked by the music. However, unlike the present work, relatively few studies have attempted to model continua of expressed emotions using a variety of musical features from audio-based representations in a correlation design. The construction of the computational model is divided into four separate phases, with a different focus for evaluation. T…
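The correlation design mentioned here, predicting continuous emotion ratings from audio-derived musical features, can be sketched with a simple regression. The example below uses synthetic data and illustrative feature names (tempo, mode, articulation, brightness); it is not the paper's model, only the general setup.

    # Regress continuous emotion ratings on audio features; evaluate by predictive fit.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(4)
    n_excerpts = 110
    features = rng.standard_normal((n_excerpts, 4))       # columns: tempo, mode, articulation, brightness
    valence = features @ np.array([0.4, 0.5, 0.1, 0.2]) + 0.3 * rng.standard_normal(n_excerpts)

    model = LinearRegression()
    r2_scores = cross_val_score(model, features, valence, cv=5, scoring="r2")
    print(f"cross-validated R^2: {r2_scores.mean():.2f}")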
Breaking down the word length effect on readers’ eye movements
2015
Previous research on the effect of word length on reading confounded the number of letters (NrL) in a word with its spatial width. Consequently, the extent to which visuospatial and attentional-linguistic processes contribute to the word length effect on parafoveal and foveal vision in reading and dyslexia is unknown. Scholars recently suggested that visual crowding is an important factor for determining an individual’s reading speed in fluent and dyslexic reading. We studied whether the NrL or the spatial width of target words affects fixation duration and saccadic measures in natural reading in fluent and dysfluent readers of a transparent orthography. Participants read natural sentences …