Search results for "Cepstrum"

showing 5 items of 15 documents

On the Robustness of Deep Features for Audio Event Classification in Adverse Environments

2018

Deep features, responses to complex input patterns learned within deep neural networks, have recently shown great performance in image recognition tasks, motivating their use for audio analysis tasks as well. These features provide multiple levels of abstraction, which makes it possible to select a sufficiently generalized layer to identify classes not seen during training. The generalization capability of such features is very useful due to the lack of complete labeled audio datasets. However, as opposed to classical hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs), the performance impact of having an acoustically adverse environment has not been evaluated in detail. In this p…

Reverberation; Noise measurement; Computer science; Speech recognition; Feature extraction; 02 engineering and technology; Convolutional neural network; 030507 speech-language pathology & audiology; 03 medical and health sciences; Raw audio format; Robustness (computer science); Audio analyzer; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; 0305 other medical science

2018 14th IEEE International Conference on Signal Processing (ICSP)

researchProduct

Embedded Knowledge-based Speech Detectors for Real-Time Recognition Tasks

2006

Speech recognition has become common in many application domains, from dictation systems for professional practices to vocal user interfaces for people with disabilities or hands-free system control. However, so far the performance of automatic speech recognition (ASR) systems is comparable to human speech recognition (HSR) only under very strict working conditions, and is in general much lower. Incorporating acoustic-phonetic knowledge into ASR design has proven to be a viable approach to raising ASR accuracy. Manner of articulation attributes such as vowel, stop, fricative, approximant, nasal, and silence are examples of such knowledge. Neural networks have already been used successfully as de…

Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Voice activity detection; Artificial neural network; Dictation; business.industry; Computer science; Speech recognition; Speech technology; computer.software_genre; Speech processing; Manner of articulation; Silence; Vowel; Computer Science; Telecommunications; Mel-frequency cepstrum; Artificial intelligence; speech detector; User interface; business; computer; Natural language processing

researchProduct

Emergency Detection with Environment Sound Using Deep Convolutional Neural Networks

2020

In this paper, we propose a generic emergency detection system using only the sound produced in the environment. For this task, we employ multiple audio feature extraction techniques such as mel-frequency cepstral coefficients, gammatone frequency cepstral coefficients, the constant-Q transform and the chromagram. After feature extraction, a deep convolutional neural network (CNN) is used to classify an audio signal as a potential emergency situation or not. The entire model is based on our previous work that sets the new state of the art in the environment sound classification (ESC) task (Our paper is under review in the IEEE/ACM Transactions on Audio, Speech and Language Processing and also avai…

Signal processing; Audio signal; Computer science; business.industry; Speech recognition; Deep learning; Feature extraction; computer.software_genre; Convolutional neural network; Binary classification; Mel-frequency cepstrum; Artificial intelligence; Audio signal processing; business; computer

researchProduct

Event signal characterization for disturbance interpretation in power grid

2018

This paper presents a signal processing approach to detect and characterize the physical events that occur in a power system using PMU signals. A small window is applied so that the extracted spectral features belong to a stationary signal. This is based on applying empirical mode decomposition, followed by the square root of spectral kurtosis (SRSK) for computation of statistical indices to indicate the event occurrence. Subsequently, features from these events are extracted using mel-frequency cepstral coefficients on SRSK. © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/re…

Signal processing; Wavelet; Stationary process; Computer science; business.industry; Kurtosis; Pattern recognition; Mel-frequency cepstrum; Artificial intelligence; business; Signal; Hilbert–Huang transform; Event (probability theory)

researchProduct

EEG-based biometrics: effects of template ageing

2020

This chapter discusses the effects of template ageing in EEG-based biometrics. The chapter also serves as an introduction to general biometrics and its main tasks: identification and verification. To do so, we investigate different characterisations of EEG signals and examine the difference in performance in subject identification between single-session and cross-session identification experiments. In order to do this, EEG signals are characterised with common state-of-the-art features, i.e. Mel Frequency Cepstral Coefficients (MFCC), Autoregression Coefficients, and Power Spectral Density-derived features. The samples were later classified using various classifiers, including Support Vecto…

medicine.diagnostic_test; Biometrics; Computer science; business.industry; Pattern recognition; Electroencephalography; Support vector machine; Identification (information); Autoregressive model; medicine; Mel-frequency cepstrum; Artificial intelligence; business; Single session

researchProduct