Search results for "Cepstrum"
Showing 10 of 15 documents
Multi-band identification for enhancing bearing fault detection in variable speed conditions
2020
Rolling element bearings are crucial components in rotating machinery, and avoiding unexpected breakdowns using fault detection methods is an increasing demand in industry today. Variable speed conditions pose a challenge for vibration-based fault diagnosis due to the non-stationary impact frequency. Computed order tracking transforms the vibration signal from the time domain to the shaft-angle domain, allowing order analysis with the envelope spectrum. To enhance fault detection, the bearing resonance frequency region is isolated in the raw signal prior to order tracking. This region is not trivial to identify but may be estimated using kurtosis-based methods reported in the li…
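The envelope-spectrum step this abstract describes can be sketched as follows. This is a minimal illustration, not the paper's method: the analytic signal stands in for a Hilbert transform, the 37 Hz modulation is a synthetic stand-in for a bearing fault frequency, and the band-pass isolation of the resonance region is omitted.

```python
import numpy as np

def envelope_spectrum(x, fs):
    """Envelope spectrum via the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)          # spectral weights that build the analytic signal
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    envelope = np.abs(np.fft.ifft(X * h))
    envelope -= envelope.mean()            # drop DC so the modulation peak dominates
    spec = np.abs(np.fft.rfft(envelope)) / n
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    return freqs, spec

# a 1 kHz "resonance" carrier amplitude-modulated at 37 Hz (fault-frequency stand-in)
fs = 8000
t = np.arange(fs) / fs
x = (1.0 + 0.8 * np.sin(2 * np.pi * 37 * t)) * np.sin(2 * np.pi * 1000 * t)
freqs, spec = envelope_spectrum(x, fs)
peak = freqs[1:][np.argmax(spec[1:])]      # skip the DC bin
```

The raw spectrum of `x` has energy only near 1 kHz; the envelope spectrum recovers the 37 Hz modulation, which is why enveloping precedes order analysis in pipelines like the one above.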
Image-Evoked Affect and Its Impact on EEG-Based Biometrics
2019
Electroencephalography (EEG) signals provide a representation of the brain’s activity patterns and have been recently exploited for user identification and authentication due to their uniqueness and their robustness to interception and artificial replication. Nevertheless, such signals are commonly affected by the individual’s emotional state. In this work, we examine the use of images as stimulus for acquiring EEG signals and study whether the use of images that evoke similar emotional responses leads to higher identification accuracy compared to images that evoke different emotional responses. Results show that identification accuracy increases when the system is trained with EEG recordin…
Speech Emotion Recognition Method Using Time-Stretching in the Preprocessing Phase and Artificial Neural Network Classifiers
2020
Human emotions play a significant role in understanding human behaviour. There are multiple ways of recognizing human emotions, and one of them is through human speech. This paper presents an approach for designing a Speech Emotion Recognition (SER) system for an industrial training station. While assembling a product, the end user's emotions can be monitored and used as a parameter for adapting the training station. The proposed method uses a phase vocoder for time-stretching and an Artificial Neural Network (ANN) for the classification of five typical emotions. As input for the ANN classifier, features like Mel Frequency Cepstral Coefficients (MFCCs), short-te…
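MFCC features of the kind this and several other abstracts mention can be sketched end to end: frame, window, power spectrum, triangular mel filterbank, log, then a DCT-II (the cepstral step). This is a textbook sketch with illustrative parameter choices, not any of these papers' exact configurations.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, fs, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    # frame the signal and apply a Hann window
    frames = np.array([signal[i:i + n_fft] * np.hanning(n_fft)
                       for i in range(0, len(signal) - n_fft + 1, hop)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2 / n_fft
    # triangular filters spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, mid, hi = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fbank[i - 1, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # DCT-II over the filterbank axis yields the cepstral coefficients
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T

fs = 8000
t = np.arange(fs) / fs
coeffs = mfcc(np.sin(2 * np.pi * 440 * t), fs)   # one (n_frames, n_ceps) matrix
```

Each row of `coeffs` is one frame's cepstral vector; classifiers such as the ANN above typically consume these rows (often with deltas) rather than the raw spectrum.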
ASR performance prediction on unseen broadcast programs using convolutional neural networks
2018
In this paper, we address a relatively new task: prediction of ASR performance on unseen broadcast programs. We first propose a heterogeneous French corpus dedicated to this task. Two prediction approaches are compared: a state-of-the-art performance prediction based on regression (engineered features) and a new strategy based on convolutional neural networks (learnt features). We particularly focus on the combination of both textual (ASR transcription) and signal inputs. While the joint use of textual and signal features did not work for the regression baseline, the combination of inputs for CNNs leads to the best WER prediction performance. We also show that our CNN prediction remarkably …
Environment Sound Classification using Multiple Feature Channels and Attention based Deep Convolutional Neural Network
2020
In this paper, we propose a model for the Environment Sound Classification (ESC) task that consists of multiple feature channels given as input to a Deep Convolutional Neural Network (CNN) with an attention mechanism. The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant Q-transform (CQT) and the Chromagram. Such a combination of multiple features has not been used before for signal or audio processing. We also employ a deeper CNN (DCNN) than previous models, consisting of spatially separable convolutions working on the time and feature domains separately. Alongside, we use atten…
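The multi-channel idea above can be sketched with array shapes alone: each feature type, once resampled to a common time-frequency grid, becomes one input channel of the CNN, exactly as color channels do in an image. The grid size and random matrices below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
frames, bins = 128, 40            # illustrative time / frequency-bin resolution

# hypothetical per-feature matrices, each aligned to the same (frames, bins) grid
mfcc_feat   = rng.random((frames, bins))
gfcc_feat   = rng.random((frames, bins))
cqt_feat    = rng.random((frames, bins))
chroma_feat = rng.random((frames, bins))

# stack along a leading channel axis: (channels, frames, bins), ready for a 2-D CNN
x = np.stack([mfcc_feat, gfcc_feat, cqt_feat, chroma_feat])
```

Treating the features as channels (rather than concatenating them along one axis) lets the first convolution learn cross-feature combinations at each time-frequency position.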
Low-Power Audio Keyword Spotting using Tsetlin Machines
2021
The emergence of Artificial Intelligence (AI)-driven Keyword Spotting (KWS) technologies has revolutionized human-to-machine interaction. Yet, the challenge of end-to-end energy efficiency, memory footprint and system complexity in current Neural Network (NN)-powered AI-KWS pipelines has remained ever present. This paper evaluates KWS utilizing a learning-automata-powered machine learning algorithm called the Tsetlin Machine (TM). Through a significant reduction in parameter requirements and by choosing logic over arithmetic-based processing, the TM offers new opportunities for low-power KWS while maintaining high learning efficacy. In this paper, we explore a TM-based keyword spotting (KWS) pipe…
Audio-video people recognition system for an intelligent environment
2011
In this paper, an audio-video system for intelligent environments with the capability to recognize people is presented. Users are tracked inside the environment, and their positions and activities can be logged. Users' identities are assessed through a multimodal approach by detecting and recognizing voices and faces through the different cameras and microphones installed in the environment. This approach was chosen in order to create a flexible, cheap, yet reliable system implemented using consumer electronics. Voice features are extracted by short-time cepstrum analysis, and face features are extracted using the eigenfaces technique. The recognition task is solved using the same Su…
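The cepstrum underlying such voice features (and the "Cepstrum" query of this search) is simply the inverse FFT of the log magnitude spectrum. A minimal sketch, demonstrated here on the classic echo-detection use of the cepstrum with a synthetic broadband signal rather than on real speech:

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    return np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)).real

# broadband signal plus a delayed, attenuated copy of itself
rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
delay, gain = 100, 0.5
y = x.copy()
y[delay:] += gain * x[:-delay]

ceps = real_cepstrum(y)
# the echo appears as a cepstral peak at quefrency == delay samples
q = np.argmax(ceps[20:2048]) + 20     # skip low quefrencies near zero
```

The same log-spectrum periodicity that reveals an echo here reveals pitch in voiced speech, which is why short-time cepstrum analysis works as a compact voice feature.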
A case study on feature sensitivity for audio event classification using support vector machines
2016
Automatic recognition of multiple acoustic events is an interesting problem in machine listening that generalizes the classical speech/non-speech or speech/music classification problem. Typical audio streams contain a diversity of sound events that carry important and useful information on the acoustic environment and context. Classification is usually performed by means of hidden Markov models (HMMs) or support vector machines (SVMs) considering traditional sets of features based on Mel-frequency cepstral coefficients (MFCCs) and their temporal derivatives, as well as the energy from auditory-inspired filterbanks. However, while these features are routinely used by many systems, it is not …
Single-channel EEG-based subject identification using visual stimuli
2021
Electroencephalography (EEG) signals have been recently proposed as a biometrics modality due to some inherent advantages over traditional biometric approaches. In this work, we studied the performance of individual EEG channels for the task of subject identification in the context of EEG-based biometrics using a recently proposed benchmark dataset that contains EEG recordings acquired under various visual and non-visual stimuli using a low-cost consumer-grade EEG device. Results showed that specific EEG electrodes provide consistently higher identification accuracy regardless of the feature and stimuli types used, while features based on the Mel Frequency Cepstral Coefficients (MFCC) provi…
MFCC-based Recurrent Neural Network for automatic clinical depression recognition and assessment from speech
2022
Clinical depression, or Major Depressive Disorder (MDD), is a common and serious medical illness. In this paper, a deep Recurrent Neural Network-based framework is presented to detect depression and to predict its severity level from speech. Low-level and high-level audio features are extracted from audio recordings to predict the 24 scores of the Patient Health Questionnaire and the binary class of depression diagnosis. To overcome the problem of the small size of Speech Depression Recognition (SDR) datasets, expanding training labels and transferred features are considered. The proposed approach outperforms state-of-the-art approaches on the DAIC-WOZ database with an overall accura…