Search results for "FOS: Electrical engineering"
showing 10 items of 127 documents
Extending the Unmixing methods to Multispectral Images
2021
In the past few decades, there has been intensive research concerning the Unmixing of hyperspectral images. Some methods such as NMF, VCA, and N-FINDR have become standards since they show robustness in dealing with the unmixing of hyperspectral images. However, the research concerning the unmixing of multispectral images is relatively scarce. Thus, we extend some unmixing methods to the multispectral images. In this paper, we have created two simulated multispectral datasets from two hyperspectral datasets whose ground truths are given. Then we apply the unmixing methods (VCA, NMF, N-FINDR) to these two datasets. By comparing and analyzing the results, we have been able to demonstrate some…
Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs
2020
Sound Event Localization and Detection (SELD) is a problem related to the field of machine listening whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, deep learning techniques have become state-of-the-art solutions. The most common ones are those that implement a convolutional recurrent network (CRNN) having previously transformed the audio signal into multichannel 2D representation. The squeeze-excitation technique can be considered as a convolution enhancement that aims to learn spatial and channel feature maps independently rather than together as stand…
Tag2Risk: Harnessing social music tags for characterizing depression risk
2020
Musical preferences have been considered a mirror of the self. In this age of Big Data, online music streaming services allow us to capture ecologically valid music listening behavior and provide a rich source of information to identify several user-specific aspects. Studies have shown musical engagement to be an indirect representation of internal states including internalized symptomatology and depression. The current study aims at unearthing patterns and trends in the individuals at risk for depression as it manifests in naturally occurring music listening behavior. Mental well-being scores, musical engagement measures, and listening histories of Last.fm users (N=541) were acquired. Soci…
On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification
2019
Residual learning is a recently proposed learning framework to facilitate the training of very deep neural networks. Residual blocks or units are made of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or residual connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers that make up a residual block. While ResNet architectures for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, few w…
Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation
2020
Anomalous sound detection (ASD) is, nowadays, one of the topical subjects in machine listening discipline. Unsupervised detection is attracting a lot of interest due to its immediate applicability in many fields. For example, related to industrial processes, the early detection of malfunctions or damage in machines can mean great savings and an improvement in the efficiency of industrial processes. This problem can be solved with an unsupervised ASD solution since industrial machines will not be damaged simply by having this audio data in the training stage. This paper proposes a novel framework based on convolutional autoencoders (both unsupervised and semi-supervised) and a Gammatone-base…
CNN depth analysis with different channel inputs for Acoustic Scene Classification
2019
Acoustic scene classification (ASC) has been approached in the last years using deep learning techniques such as convolutional neural networks or recurrent neural networks. Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks. Finding the most suitable audio representation is still a research area of interest. In this paper, different log-Mel representations and combinations are analyzed. Experiments show that the best results are obtained using the harmonic and percussive components plus the difference between left and right stereo channels, (L-R). On the other hand, it i…
Acoustic Scene Classification with Squeeze-Excitation Residual Networks
2020
Acoustic scene classification (ASC) is a problem related to the field of machine listening whose objective is to classify/tag an audio clip in a predefined label describing a scene location (e. g. park, airport, etc.). Many state-of-the-art solutions to ASC incorporate data augmentation techniques and model ensembles. However, considerable improvements can also be achieved only by modifying the architecture of convolutional neural networks (CNNs). In this work we propose two novel squeeze-excitation blocks to improve the accuracy of a CNN-based ASC framework based on residual learning. The main idea of squeeze-excitation blocks is to learn spatial and channel-wise feature maps independently…
A quantum vocal theory of sound
2020
Concepts and formalism from acoustics are often used to exemplify quantum mechanics. Conversely, quantum mechanics could be used to achieve a new perspective on acoustics, as shown by Gabor studies. Here, we focus in particular on the study of human voice, considered as a probe to investigate the world of sounds. We present a theoretical framework that is based on observables of vocal production, and on some measurement apparati that can be used both for analysis and synthesis. In analogy to the description of spin states of a particle, the quantum-mechanical formalism is used to describe the relations between the fundamental states associated with phonetic labels such as phonation, turbule…
Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks
2020
The interest in deep learning methods for solving traditional signal processing tasks has been steadily growing in the last years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutio…
Comparative survey of visual object classifiers
2018
Classification of Visual Object Classes represents one of the most elaborated areas of interest in Computer Vision. It is always challenging to get one specific detector, descriptor or classifier that provides the expected object classification result. Consequently, it critical to compare the different detection, descriptor and classifier methods available and chose a single or combination of two or three to get an optimal result. In this paper, we have presented a comparative survey of different feature descriptors and classifiers. From feature descriptors, SIFT (Sparse & Dense) and HeuSIFT combination colour descriptors; From classification techniques, Support Vector Classifier, K-Nea…