Search results for "audio signal processing"
showing 8 items of 18 documents
Improving Isolation of Blindly Separated Sources Using Time-Frequency Masking
2008
A refinement technique based on time-frequency masking is proposed to improve source isolation in blind audio source separation algorithms. The refinement technique uses an energy-normalized source-to-interference ratio in order to identify and eliminate interfering energy from the extracted sources. Some examples using this refinement method with different separation algorithms are discussed. The results show that source isolation can be significantly enhanced with negligible degradation of the separated sources.
Real-time Sound Source Localization on Graphics Processing Units
2013
Sound source localization is an important topic in microphone array signal processing applications, such as camera steering systems, human-machine interaction or surveillance systems. The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is one of the most well-known approaches for sound source localization due to its good performance in noisy and reverberant environments. The algorithm analyzes the sound power captured by a microphone array on a grid of spatial points in a given room. While localization accuracy can be improved by using a high-resolution spatial grid and a large number of microphones, performing the localization task in these circumstances requires …
A Comparative Analysis of Residual Block Alternatives for End-to-End Audio Classification
2020
Residual learning is known for being a learning framework that facilitates the training of very deep neural networks. Residual blocks or units are made up of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or shortcut connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers making up a residual block. While residual networks for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, their a…
Self-Organizing Architectures for Digital Signal Processing
2013
Modeling musical attributes to characterize ensemble recordings using rhythmic audio features
2011
In this paper, we present the results of a pre-study on music performance analysis of ensemble music. Our aim is to implement a music classification system for the description of live recordings, for instance to help musicologists and musicians analyze improvised ensemble performances. The main problem we deal with is the extraction of a suitable set of audio features from the recorded instrument tracks. Our approach is to extract rhythm-related audio features and to apply them for regression-based modeling of eight more general musical attributes. The model based on Partial Least-Squares Regression without preceding Principal Component Analysis performed best for all of the eight attribu…
Emergency Detection with Environment Sound Using Deep Convolutional Neural Networks
2020
In this paper, we propose a generic emergency detection system using only the sound produced in the environment. For this task, we employ multiple audio feature extraction techniques like the mel-frequency cepstral coefficients, gammatone frequency cepstral coefficients, constant Q-transform and chromagram. After feature extraction, a deep convolutional neural network (CNN) is used to classify an audio signal as a potential emergency situation or not. The entire model is based on our previous work that sets the new state of the art in the environment sound classification (ESC) task (Our paper is under review in the IEEE/ACM Transactions on Audio, Speech and Language Processing and also avai…
Musical sound processing in the human brain. Evidence from electric and magnetic recordings.
2001
Recently, our knowledge regarding the brain's ability to represent invariant features of musical information even during the performance of a simultaneous task (unrelated to the sounds) has accumulated rapidly. Recordings of the change-specific mismatch negativity component of event-related brain potentials have shown that temporally and spectrally complex sounds as well as their relations are automatically processed by human auditory cortex. Furthermore, recent magnetoencephalographic and positron emission tomographic investigations indicate that this processing differs between phonetic and musical sounds within and between the cerebral hemispheres. These data thus suggest that despite the…
Numerical methods for a nonlinear impact model: A comparative study with closed-form corrections
2011
A physically based impact model, already known and exploited in the field of sound synthesis, is studied using both analytical tools and numerical simulations. It is shown that the Hamiltonian of a physical system composed of a mass impacting on a wall can be expressed analytically as a function of the mass velocity during contact. Moreover, an efficient and accurate approximation for the mass outbound velocity is presented, which allows the Hamiltonian to be estimated at the end of the contact. Analytical results are then compared to numerical simulations obtained by discretizing the system with several numerical methods. It is shown that, for some regions of the parameter space, the trajectorie…