Search results for "Speech recognition"

Showing 10 of 357 documents

Exploring Frequency-Dependent Brain Networks from Ongoing EEG Using Spatial ICA During Music Listening

2020

Exploring brain activity through functional networks during naturalistic stimuli, especially music and video, has recently become an attractive challenge because of the low signal-to-noise ratio in the collected brain data. Although most efforts to explore the listening brain have relied on functional magnetic resonance imaging (fMRI) or sensor-level electro-/magnetoencephalography (EEG/MEG), little is known about how neural rhythms are involved in brain network activity under naturalistic stimuli. This study examined cortical oscillations by jointly analyzing ongoing EEG and musical features during free listening to music. We used a data-driven method that co…

Keywords: frequency-specific networks; music information retrieval; EEG; independent components analysis; magnetoencephalography; auditory perception; superior temporal gyrus; timbre. Published in: Brain Topography.
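Spatial ICA of the kind used in studies like this one can be illustrated with a minimal sketch: scikit-learn's FastICA unmixes simulated multi-channel signals into independent components. The data, mixing matrix, and component count below are invented for illustration, not taken from the paper.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
n_samples, n_channels, n_sources = 2000, 8, 3

# Simulated non-Gaussian source signals (stand-ins for ongoing EEG rhythms)
t = np.linspace(0, 8, n_samples)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=n_samples)]

# Mix the sources into "channels" with a random mixing matrix
A = rng.randn(n_channels, n_sources)
X = S @ A.T                              # shape: (n_samples, n_channels)

ica = FastICA(n_components=n_sources, random_state=0)
S_est = ica.fit_transform(X)             # estimated independent components
print(S_est.shape)                       # (2000, 3)
```

ICA recovers the sources only up to permutation, sign, and scale, which is why EEG studies typically inspect and label the estimated components afterwards.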

Facial geometry and speech analysis for depression detection.

2017

Depression is one of the most prevalent mental disorders, burdening many people worldwide. A system with the potential to serve as a decision support tool is proposed, based on novel features extracted from facial expression geometry and speech, interpreting non-verbal manifestations of depression. The proposed system has been tested in both gender-independent and gender-based modes, and with different fusion methods. The algorithms were evaluated for several combinations of parameters and classification schemes on the dataset provided by the Audio/Visual Emotion Challenge of 2013 and 2014. The proposed framework achieved a precision of 94.8% for detecting persons achieving high sc…

Keywords: depression; decision support system; facial expression; facial geometry; speech; decision fusion; nearest neighbour; classification schemes. Published in: Annual International Conference of the IEEE Engineering in Medicine and Biology Society.

Effect of parametric variation of center frequency and bandwidth of Morlet wavelet transform on time-frequency analysis of event-related potentials

2017

Time-frequency (TF) analysis of event-related potentials (ERPs) using the complex Morlet wavelet transform has been widely applied in cognitive neuroscience research. It is generally agreed that the center frequency (fc) and bandwidth (σ) must be considered when defining the mother wavelet. However, how parametric variation of fc and σ in the Morlet wavelet transform influences ERP time-frequency results has not been extensively discussed in previous research. The current study, adopting the complex Morlet continuous wavelet transform (CMCWT), investigates whether time-frequency results vary with different parametric settings of fc and σ. Moreover, …

Keywords: complex Morlet wavelet transform; center frequency; bandwidth; event-related potentials; time-frequency representation; continuous wavelet transform.
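The fc/σ trade-off the abstract describes can be made concrete with a small sketch (an assumed textbook definition of the complex Morlet wavelet; the paper's exact parameter settings are not given here): a shorter Gaussian envelope gives better time resolution but a wider frequency band.

```python
import numpy as np

def complex_morlet(t, fc, sigma):
    """Complex sinusoid at centre frequency fc (Hz) under a Gaussian
    envelope of standard deviation sigma (seconds)."""
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * fc * t)

t = np.linspace(-1, 1, 2001)                       # 2 s window, ~1 kHz sampling

def halfmax_bins(w):
    """Number of frequency bins above half the peak magnitude --
    a crude proxy for spectral bandwidth."""
    mag = np.abs(np.fft.rfft(w.real))
    return int(np.sum(mag > 0.5 * mag.max()))

narrow_env = complex_morlet(t, fc=10, sigma=0.05)  # short envelope
wide_env   = complex_morlet(t, fc=10, sigma=0.20)  # long envelope

# Shorter envelope -> sharper in time but broader in frequency
print(halfmax_bins(narrow_env) > halfmax_bins(wide_env))   # True
```

This is exactly the uncertainty trade-off the paper probes: for a Gaussian envelope, the spectral width scales as 1/(2πσ), so choosing fc and σ jointly fixes the TF resolution of the analysis.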

Filtering of Spontaneous and Low Intensity Emotions in Educational Contexts

2015

Affect detection is a challenging problem, all the more so in educational contexts, where emotions are spontaneous and usually subtle. In this paper, we propose a two-stage detection approach based on an initial binary discretization followed by a specific emotion prediction stage. The binary classification method uses several distinct sources of information to detect and filter time slots that are relevant from an affective point of view. It detects whether the learner has felt an educationally relevant emotion in a 20-second time slot with an accuracy close to 75%. These slots can then be further analyzed by a second classifier to determine the specific user emotion.

Keywords: affective computing; binary classification; discretization; filtering; affect.
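The two-stage idea, a binary gate over time slots followed by a specific-emotion classifier, can be sketched as follows. The features, labels, and random-forest models are placeholders for illustration, not the paper's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.RandomState(0)
X = rng.randn(300, 6)                    # one feature row per 20 s time slot
emotion = rng.randint(0, 4, size=300)    # 0 = nothing relevant, 1-3 = specific emotions
relevant = (emotion > 0).astype(int)     # stage 1 target: any relevant emotion at all?

# Stage 1: binary filter that flags affectively relevant slots
gate = RandomForestClassifier(random_state=0).fit(X, relevant)

# Stage 2: trained only on relevant slots, predicts the specific emotion
mask = relevant == 1
specific = RandomForestClassifier(random_state=0).fit(X[mask], emotion[mask])

# At prediction time, stage 2 only sees slots that pass the gate
flagged = gate.predict(X) == 1
preds = specific.predict(X[flagged])
print(len(preds), "slots passed the gate")
```

Filtering first lets the second classifier specialize on the (rarer) emotional slots instead of being swamped by neutral ones, which is the rationale the abstract gives.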

Letter Position Coding Across Modalities: The Case of Braille Readers

2012

Background: The question of how the brain encodes letter position in written words has attracted increasing attention in recent years. A number of models have recently been proposed to accommodate the fact that transposed-letter stimuli like jugde or caniso are perceptually very close to their base words. Methodology: Here we examined how letter position coding is attained in the tactile modality via Braille reading. The idea is that Braille word recognition may involve more serial processing than recognition in the visual modality, and this may produce differences in the input coding schemes employed to encode letters in written words. To that end, we conducted a lexical decision experiment with adult Braille…

Keywords: Braille; reading; touch; word recognition; lexical decision task; psycholinguistics; serial memory processing. Published in: PLoS ONE.

Vector representation of non-standard spellings using dynamic time warping and a denoising autoencoder

2017

The presence of non-standard spellings on Twitter causes challenges for many natural language processing tasks. Traditional approaches mainly treat the problem as one of translation, spell checking, or speech recognition. This paper proposes a method that represents the stochastic relationship between words and their non-standard versions as real-valued vectors. The method uses dynamic time warping to preprocess the non-standard spellings and an autoencoder to derive the vector representation. The derived vectors encode word patterns, and the Euclidean distance between vectors represents a distance in word space that challenges the prevailing edit distance. After training the autoencoder o…

Keywords: dynamic time warping; autoencoder; artificial neural networks; Euclidean distance; edit distance; hidden Markov model. Published in: 2017 IEEE Congress on Evolutionary Computation (CEC).
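Dynamic time warping, used above to preprocess non-standard spellings, can be sketched in a few lines. This is textbook DTW applied to character codes purely for illustration; the paper's actual preprocessing is likely more involved.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Warping absorbs local stretching: a spelling with repeated letters
# stays at zero distance from its base word under DTW.
word   = [ord(c) for c in "tomorrow"]
nonstd = [ord(c) for c in "tomorrowww"]
print(dtw_distance(word, nonstd))   # 0.0: the repeated letters are warped away
```

Unlike edit distance, which charges one unit per inserted character, DTW lets many positions in one sequence align to a single position in the other, which is why it suits elongated spellings like "tomorrowww".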

On the use of a metric-space search algorithm (AESA) for fast DTW-based recognition of isolated words

1988

The approximating and eliminating search algorithm (AESA) was recently introduced for finding nearest neighbors in metric spaces. Although the AESA was originally developed to reduce the time complexity of dynamic time-warping isolated word recognition (DTW-IWR), only rather limited experiments had previously been carried out to check its performance on this task. A set of experiments aimed at filling this gap is reported. The main results show that the important features reflected in previous simulation experiments also hold for real speech samples. With single-speaker dictionaries of up to 200 words, and for most of the different speech parameterizations, local metrics, a…

Keywords: dynamic time warping; metric space; search algorithm; word recognition; time complexity. Published in: IEEE Transactions on Acoustics, Speech, and Signal Processing.
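The core of AESA is eliminating candidates with the triangle inequality: for any pivot p whose distance to the query is already known, |d(q,p) − d(c,p)| lower-bounds d(q,c), so a candidate whose bound already exceeds the best distance can be skipped without an expensive (e.g. DTW) distance computation. A simplified sketch of that elimination idea, not the full AESA (which also orders candidates by these bounds):

```python
def pruned_nearest(query, points, dist):
    """Nearest-neighbour search that skips candidates via the triangle
    inequality, using every already-computed distance as a pivot."""
    best, best_d = None, float("inf")
    pivots = []                      # (point, d(query, point)) pairs
    for c in points:
        # Lower-bound d(query, c) from each pivot; skip if it cannot win
        lb = max((abs(dq - dist(c, p)) for p, dq in pivots), default=0.0)
        if lb >= best_d:
            continue                 # eliminated without a query-distance call
        d = dist(query, c)
        pivots.append((c, d))
        if d < best_d:
            best, best_d = c, d
    return best, best_d

points = [0.0, 3.0, 7.0, 7.1, 20.0]
dist = lambda a, b: abs(a - b)
print(pruned_nearest(6.9, points, dist))   # nearest point is 7.0
```

The saving comes from the asymmetry AESA exploits in DTW-IWR: pivot-to-candidate distances can be precomputed offline, so only a handful of query distances need the full DTW computation.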

Electrophysiological evidence for change detection in speech sound patterns by anesthetized rats

2014

Human infants are able to detect changes in grammatical rules in a speech sound stream. Here, we tested whether rats have a comparable ability, using an electrophysiological measure that has been shown to reflect higher-order auditory cognition even before it is manifested at the behavioral level. Urethane-anesthetized rats were presented with a stream of sequences consisting of three pseudowords delivered at a fast pace. Frequently presented "standard" sequences had 16 variants, all of which shared the same structure. They were occasionally replaced by acoustically novel "deviant" sequences of two different types: structurally consistent and inconsistent sequences. Two stimulus conditions we…

Keywords: mismatch negativity; auditory cortex; local field potentials; speech; pattern perception; rat; electrophysiology.

Design and Implementation of Deep Learning Based Contactless Authentication System Using Hand Gestures

2021

Hand-gesture-based sign language digits have several contactless applications, including communication aids for impaired people, such as the elderly and disabled, health-care applications, automotive user interfaces, and security and surveillance. This work presents the design and implementation of a complete end-to-end deep-learning-based edge computing system that can verify a user contactlessly using …

Keywords: hand gesture recognition; deep learning; neural networks; contactless authentication; camera-based authentication; edge computing; security.

Exploring relationships between audio features and emotion in music

2009

In this paper, we present an analysis of the associations between emotion categories and audio features automatically extracted from raw audio data. The work is based on 110 excerpts from film soundtracks evaluated by 116 listeners. The data are annotated with 5 basic emotions (fear, anger, happiness, sadness, tenderness) on a 7-point scale. Exploiting state-of-the-art Music Information Retrieval (MIR) techniques, we extract audio features of different kinds: timbral, rhythmic, and tonal. Among others, we compute estimates of dissonance, mode, onset rate, and loudness. We study statistical relations between audio descriptors and emotion categories, confirming results from psychological …

Keywords: emotion classification; music information retrieval; loudness; mode (music); anger; happiness; sadness. Published in: Frontiers in Human Neuroscience.
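A loudness-like feature of the kind the abstract mentions can be approximated very simply. A sketch using frame-wise RMS energy on synthetic tones; the frame and hop sizes are arbitrary choices, not the authors', and real MIR toolchains use perceptually weighted loudness models rather than raw RMS.

```python
import numpy as np

def frame_rms(x, frame=1024, hop=512):
    """Frame-wise RMS energy: a simple proxy for loudness."""
    n = 1 + max(0, len(x) - frame) // hop
    return np.array([np.sqrt(np.mean(x[i * hop:i * hop + frame] ** 2))
                     for i in range(n)])

sr = 22050
t = np.arange(sr) / sr                       # 1 s of audio
quiet = 0.1 * np.sin(2 * np.pi * 440 * t)    # soft 440 Hz tone
loud  = 0.8 * np.sin(2 * np.pi * 440 * t)    # the same tone, 8x the amplitude

print(frame_rms(loud).mean() > frame_rms(quiet).mean())   # True
```

Descriptors like this, computed per frame and then summarized (mean, variance), are what get correlated against the listeners' emotion ratings in studies of this kind.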