
AUTHOR

Maximo Cobos

0000-0001-7318-3192

showing 51 related works from this author

Real-time Sound Source Localization on an Embedded GPU Using a Spherical Microphone Array

2015

Spherical microphone arrays are becoming increasingly important in acoustic signal processing systems for their applications in sound field analysis, beamforming, spatial audio, etc. The positioning of target and interfering sound sources is a crucial step in many of the above applications. Therefore, 3D sound source localization is a highly relevant topic in the acoustic signal processing field. However, spherical microphone arrays are usually composed of many microphones and running signal processing localization methods in real time is an important issue. Some works have already shown the potential of Graphics Processing Units (GPUs) for developing high-end real-time signal proce…

Beamforming; Signal processing; Microphone array; business.industry; Microphone; Computer science; Embedded systems; Audio processing; Acoustic source localization; Microphone arrays; Field (computer science); Sound source localization; Embedded system; General Earth and Planetary Sciences; business; Computer hardware; General Environmental Science; Procedia Computer Science
researchProduct

SART3D: A MATLAB toolbox for spatial audio and signal processing education

2019

Signal processing; General Computer Science; Computer science; Computer graphics (images); General Engineering; Matlab toolbox; Education; Computer Applications in Engineering Education
researchProduct

On the performance of multi-GPU-based expert systems for acoustic localization involving massive microphone arrays

2015

Sound source localization is an important topic in expert systems involving microphone arrays, such as automatic camera steering systems, human-machine interaction, video gaming or audio surveillance. The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is a well-known approach for sound source localization due to its robust performance in noisy and reverberant environments. This algorithm analyzes the sound power captured by an acoustic beamformer on a defined spatial grid, estimating the source location as the point that maximizes the output power. Since localization accuracy can be improved by using high-resolution spatial grids and a high number of microphones, accurate …

Signal processing; Reverberation; Computer science; Microphone; Real-time computing; General Engineering; Acoustic source localization; Sound power; computer.software_genre; Grid; Expert system; Microphone arrays; Computer Science Applications; Sound source localization; Noise; Artificial Intelligence; TEORIA DE LA SEÑAL Y COMUNICACIONES; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; Graphics Processing Units; computer; Steered Response Power
researchProduct
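
The grid search described in the abstract above (evaluate steered power at each candidate point, keep the maximum) can be sketched as follows. This is a minimal illustration and not the authors' GPU implementation: it uses plain cross-correlation instead of the PHAT weighting, integer-sample delays, and a tiny 2D grid. The constants, microphone layout, and every function name are assumptions made for this example.

```python
import math
import random
from itertools import combinations

SPEED_OF_SOUND = 340.0  # m/s (assumed)
SAMPLE_RATE = 3400.0    # Hz (assumed, so that one sample spans 0.1 m)


def delay_samples(point, mic):
    """Propagation delay from point to mic, rounded to whole samples."""
    return round(math.dist(point, mic) * SAMPLE_RATE / SPEED_OF_SOUND)


def correlation_at(a, b, lag):
    """Cross-correlation sum_n a[n + lag] * b[n] at a single lag."""
    total = 0.0
    for n in range(len(b)):
        m = n + lag
        if 0 <= m < len(a):
            total += a[m] * b[n]
    return total


def srp_map(signals, mics, grid):
    """Steered power per grid point: pairwise correlations at the steering lags."""
    power = {}
    for point in grid:
        delays = [delay_samples(point, mic) for mic in mics]
        power[point] = sum(
            correlation_at(signals[i], signals[j], delays[i] - delays[j])
            for i, j in combinations(range(len(mics)), 2)
        )
    return power


def localize(signals, mics, grid):
    """Return the grid point with the highest steered power."""
    power = srp_map(signals, mics, grid)
    return max(power, key=power.get)


# Synthetic, noise-free test case: four microphones and five candidate points.
mics = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
grid = [(1.0, 1.0), (3.0, 1.0), (1.0, 3.0), (3.0, 3.0), (2.0, 2.0)]
true_source = (3.0, 1.0)

rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Each microphone receives the source delayed by its propagation time.
signals = []
for mic in mics:
    d = delay_samples(true_source, mic)
    signals.append([0.0] * d + source + [0.0] * (50 - d))

estimate = localize(signals, mics, grid)
```

The cost grows with the number of grid points times the number of microphone pairs, which is why the abstract links finer grids and larger arrays to higher computational demand.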

A Comparative Analysis of Residual Block Alternatives for End-to-End Audio Classification

2020

Residual learning is known for being a learning framework that facilitates the training of very deep neural networks. Residual blocks or units are made up of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or shortcut connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers making up a residual block. While residual networks for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, their a…

Normalization (statistics)General Computer ScienceComputer scienceFeature extractionESC02 engineering and technologycomputer.software_genreResidualConvolutional neural networkconvolutional neural networks0202 electrical engineering electronic engineering information engineeringGeneral Materials Scienceurbansound8kAudio signal processingBlock (data storage)Contextual image classificationGeneral EngineeringAudio classification020206 networking & telecommunications113 Computer and information sciences020201 artificial intelligence & image processinglcsh:Electrical engineering. Electronics. Nuclear engineeringData mininglcsh:TK1-9971computerresidual learningIEEE Access
researchProduct
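
The identity mapping mentioned in the abstract above, where the block input is added back to the output of the stacked layers, can be sketched in a few lines. This is only an illustration of the skip connection: it uses two small fully connected layers rather than the convolutional layers studied in the paper, and all names are hypothetical.

```python
def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Fully connected layer; weights is a list of rows, one per output."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """Two stacked layers F(x) with a skip connection: y = x + F(x)."""
    hidden = relu(dense(x, w1, b1))
    fx = dense(hidden, w2, b2)
    return [xi + fi for xi, fi in zip(x, fx)]


# With all-zero weights F(x) vanishes, so the block reduces to the identity.
n = 3
zero_w = [[0.0] * n for _ in range(n)]
zero_b = [0.0] * n
x = [1.0, -2.0, 0.5]
y = residual_block(x, zero_w, zero_b, zero_w, zero_b)
```

Here the addition happens after the second layer and before any further activation. Moving that addition point, or the position of normalization and activation around it, gives the kind of design alternatives the abstract refers to.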

On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification

2019

Residual learning is a recently proposed learning framework to facilitate the training of very deep neural networks. Residual blocks or units are made of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or residual connections. However, multiple implementation alternatives arise with respect to where such skip connections are applied within the set of stacked layers that make up a residual block. While ResNet architectures for image classification using convolutional neural networks (CNNs) have been widely discussed in the literature, few w…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

Low-complexity AoA and AoD Estimation in the Transformed Spatial Domain for Millimeter Wave MIMO Channels

2021

High-accuracy angle of arrival (AoA) and angle of departure (AoD) estimation is critical for cell search, stable communications and positioning in millimeter wave (mmWave) cellular systems. Moreover, the design of low-complexity AoA/AoD estimation algorithms is also of major importance in the deployment of practical systems to enable a fast and resource-efficient computation of beamforming weights. Parametric mmWave channel estimation makes it possible to describe the channel matrix as a combination of direction-dependent signal paths, exploiting the sparse characteristics of mmWave channels. In this context, a fast Transformed Spatial Domain Channel Estimation (TSDCE) algorithm was recently proposed …

Beamforming; Computer science; Angle of arrival; Frequency domain; Computation; Context (language use); Algorithm; Sparse matrix; Parametric statistics; Communication channel; 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC)
researchProduct

Simultaneous ranging and self-positioning in unsynchronized wireless acoustic sensor networks

2016

Automatic ranging and self-positioning is a very desirable property in wireless acoustic sensor networks, where nodes have at least one microphone and one loudspeaker. However, due to environmental noise, interference, and multipath effects, audio-based ranging is a challenging task. This paper presents a fast ranging and positioning strategy that makes use of the correlation properties of pseudonoise sequences for estimating simultaneously relative time-of-arrivals from multiple acoustic nodes. To this end, a proper test signal design adapted to the acoustic node transducers is proposed. In addition, a novel self-interference reduction method and a peak matching algorithm are introduced, a…

Wireless acoustic sensor networks; Microphone; Computer science; Real-time computing; 02 engineering and technology; Interference (wave propagation); Synchronization; localization; TECNOLOGIA ELECTRONICA; 0202 electrical engineering electronic engineering information engineering; Wireless; Electrical and Electronic Engineering; wireless acoustic sensor networks; business.industry; Ranging; Node (networking); 020208 electrical & electronic engineering; pseudo-noise sequences; 020206 networking & telecommunications; Ranging; Transducer; Computer Science::Sound; Embedded system; Localization; Signal Processing; Loudspeaker; business; Pseudo-noise sequences; Multipath propagation
researchProduct

Stereo to Wave-Field Synthesis music up-mixing: An objective and subjective evaluation

2008

Sound source separation techniques are known to be very useful in many applications. High-fidelity, audio-oriented applications remain challenging, however, because existing algorithms are far from delivering such high quality. In this paper, subjective and objective evaluations are carried out for several algorithms designed to deal with stereo music mixtures. The performance of these algorithms applied to acoustic scene resynthesis in a Wave Field Synthesis system is discussed.

Engineering; Wave field synthesis; business.industry; Speech recognition; Sound source separation; media_common.quotation_subject; Field (computer science); High fidelity; Source separation; Quality (business); Computer vision; Artificial intelligence; Objective evaluation; business; Mixing (physics); media_common; 2008 3rd International Symposium on Communications, Control and Signal Processing
researchProduct

A Wireless Acoustic Array System for Binaural Loudness Evaluation in Cities

2017

Networks of acoustic sensors are being deployed in smart cities to continuously monitor noise levels. In this paper, a novel acoustic sensor device is designed for binaural loudness evaluation, in a standalone platform. The audio is acquired from an array of microphones and a binaural signal is synthesized by a direction-of-arrival algorithm and a head-related transfer function. Hardware setup and software algorithms are presented and the results are discussed. Finally, the tests conducted in an early deployment show the feasibility of using the device to carry out large temporal and spatial sampling for the evaluation of binaural loudness.

Engineeringbusiness.industryAcoustics010401 analytical chemistry020206 networking & telecommunications02 engineering and technology01 natural sciencesSignal0104 chemical sciencesTime–frequency analysisLoudnessNoiseSoftwareSampling (signal processing)0202 electrical engineering electronic engineering information engineeringWirelessElectrical and Electronic EngineeringbusinessInstrumentationBinaural recordingIEEE Sensors Journal
researchProduct

Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation

2021

Nonnegative matrix factorization (NMF) has been traditionally considered a promising approach for audio source separation. While standard NMF is only suited for single-channel mixtures, extensions to consider multi-channel data have also been proposed. Among the most popular alternatives, multichannel NMF (MNMF) and further derivations based on constrained spatial covariance models have been successfully employed to separate multi-microphone convolutive mixtures. This letter proposes an MNMF extension by considering a mixture model with Ray-Space-transformed signals, where magnitude data successfully encodes source locations as frequency-independent linear patterns. We show that the MNMF alg…

Covariance functionComputer scienceApplied Mathematics020206 networking & telecommunications02 engineering and technologyExtension (predicate logic)Mixture modelMatrix decompositionNon-negative matrix factorizationTime–frequency analysisblind source separationSignal Processing0202 electrical engineering electronic engineering information engineeringSource separationNon -negative matrix factorization (NMF)array signal processingElectrical and Electronic EngineeringAlgorithmIEEE Signal Processing Letters
researchProduct

Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs

2020

Sound Event Localization and Detection (SELD) is a problem related to the field of machine listening whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, deep learning techniques have become state-of-the-art solutions. The most common ones are those that implement a convolutional recurrent network (CRNN) having previously transformed the audio signal into multichannel 2D representation. The squeeze-excitation technique can be considered as a convolution enhancement that aims to learn spatial and channel feature maps independently rather than together as stand…

FOS: Computer and information sciences; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments

2020

The problem of training with a small set of positive samples is known as few-shot learning (FSL). It is widely known that traditional deep learning (DL) algorithms usually show very good performance when trained with large datasets. However, in many applications, it is not possible to obtain such a high number of samples. In the image domain, typical FSL applications include those related to face recognition. In the audio domain, music fraud or speaker recognition can clearly benefit from FSL methods. This paper deals with the application of FSL to the detection of specific and intentional acoustic events given by different types of sound alarms, such as doorbells or fire alarms, usin…

FOS: Computer and information sciencesComputer Science - Machine LearningSound (cs.SD)sound processingaudio datasetmachine listeningUNESCO::CIENCIAS TECNOLÓGICASComputer Science - SoundMachine Learning (cs.LG)classificationArtificial IntelligenceAudio and Speech Processing (eess.AS)Signal ProcessingFOS: Electrical engineering electronic engineering information engineeringfew-shot learningopen-set recognitionComputer Vision and Pattern RecognitionSoftwareElectrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach

2020

The generalized cross correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs also play an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA estimates is affected by multiple factors, including noise, reverberation and signal bandwidth. In this paper, a sub-band approach for time del…

Reverberationweighted SVDAcoustics and UltrasonicsCross-correlationComputer scienceNoise (signal processing)SRP-PHATMatrix representationTime delay estimationMultilaterationComputational Mathematicssub-band processingAudio and Speech Processing (eess.AS)Temporal resolutionSingular value decompositionComputer Science (miscellaneous)FOS: Electrical engineering electronic engineering information engineeringGCCElectrical and Electronic EngineeringRepresentation (mathematics)SVDAlgorithmElectrical Engineering and Systems Science - Audio and Speech Processing
researchProduct
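
The peak-picking step described in the abstract above can be sketched with the simplest member of the GCC family, the unweighted time-domain cross-correlation. This is not the frequency-sliding sub-band method of the paper, and the function names and test signal are assumptions made for illustration.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum_n a[n + lag] * b[n]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n in range(len(b)):
            m = n + lag
            if 0 <= m < len(a):
                total += a[m] * b[n]
        result[lag] = total
    return result


def estimate_delay(a, b, max_lag):
    """Delay of a relative to b, in samples, taken at the correlation peak."""
    cc = cross_correlation(a, b, max_lag)
    return max(cc, key=cc.get)


# A short reference signal and a copy delayed by three samples.
ref = [0.0, 1.0, -0.5, 0.25, 0.8, -1.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
delay = 3
delayed = [0.0] * delay + ref[:-delay]

estimated = estimate_delay(delayed, ref, max_lag=5)
```

In this noise-free case the peak sits exactly at the true delay. With noise, reverberation, or narrow-band signals the peak can broaden or shift, which is the limitation the abstract points to.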

An Efficient Implementation of Parallel Parametric HRTF Models for Binaural Sound Synthesis in Mobile Multimedia

2020

The extended use of mobile multimedia devices in applications like gaming, 3D video and audio reproduction, immersive teleconferencing, or virtual and augmented reality, is demanding efficient algorithms and methodologies. All these applications require real-time spatial audio engines with the capability of dealing with intensive signal processing operations while facing a number of constraints related to computational cost, latency and energy consumption. Most mobile multimedia devices include a Graphics Processing Unit (GPU) that is primarily used to accelerate video processing tasks, providing high computational capabilities due to its inherent parallel architecture. This paper describes…

interpolation.General Computer Scienceparallel filtersComputer scienceGPUGpuGraphics processing unitLatency (audio)Parametric model02 engineering and technologycomputer.software_genre030507 speech-language pathology & audiology03 medical and health sciencesSoftware portabilityHRTF modeling0202 electrical engineering electronic engineering information engineeringGeneral Materials ScienceMultimediaparametric modelGeneral EngineeringTeleconferenceBinaural synthesis020206 networking & telecommunicationsVideo processingEnergy consumptioninterpolationInterpolationHrtf modelingScalabilityParallel filtersElectrónicaAugmented realitylcsh:Electrical engineering. Electronics. Nuclear engineering0305 other medical sciencelcsh:TK1-9971Mobile devicecomputerIEEE Access
researchProduct

Speech Intelligibility Analysis and Approximation to Room Parameters through the Internet of Things

2021

In recent years, Wireless Acoustic Sensor Networks (WASN) have been widely applied to different acoustic fields in outdoor and indoor environments. Most of these applications are oriented to locate or identify sources and measure specific features of the environment involved. In this paper, we study the application of a WASN for room acoustic measurements. To evaluate the acoustic characteristics, a set of Raspberry Pi 3 (RPi) has been used. One is used to play different acoustic signals and four are used to record at different points in the room simultaneously. The signals are sent wirelessly to a computer connected to a server, where using MATLAB we calculate both the impulse response (IR…

Computer scienceAcoustics01 natural scienceslcsh:TechnologySet (abstract data type)lcsh:Chemistry030507 speech-language pathology & audiology03 medical and health sciencesWASNroom acousticsWirelessGeneral Materials ScienceMATLABInstrumentationlcsh:QH301-705.5Impulse responsecomputer.programming_languageFluid Flow and Transfer ProcessesMeasure (data warehouse)room parameters estimationbusiness.industrylcsh:TProcess Chemistry and Technology010401 analytical chemistryGeneral Engineeringspeech intelligibility indexRoom acousticslcsh:QC1-9990104 chemical sciencesComputer Science Applicationslcsh:Biology (General)lcsh:QD1-999Asynchronous communicationlcsh:TA1-2040impulse response0305 other medical scienceInternet of Thingsbusinesslcsh:Engineering (General). Civil engineering (General)computerlcsh:PhysicsApplied Sciences
researchProduct

Adaptive Distance-Based Pooling in Convolutional Neural Networks for Audio Event Classification

2020

In recent years, deep convolutional neural networks have become a standard for the development of state-of-the-art audio classification systems, taking the lead over traditional approaches based on feature engineering. While they are capable of achieving human performance under certain scenarios, it has been shown that their accuracy is severely degraded when the systems are tested on noisy or weakly segmented events. Although better generalization could be obtained by increasing the size of the training dataset, e.g. by applying data augmentation techniques, this also leads to longer and more complex training procedures. In this article, we propose a new type of pooling layer aimed at …

Feature engineeringAcoustics and Ultrasonicsbusiness.industryComputer scienceFeature vectorFeature extractionPoolingPattern recognitionConvolutional neural network030507 speech-language pathology & audiology03 medical and health sciencesComputational MathematicsTransformation (function)Feature (computer vision)Adaptive systemComputer Science (miscellaneous)Artificial intelligenceElectrical and Electronic Engineering0305 other medical sciencebusinessIEEE/ACM Transactions on Audio, Speech, and Language Processing
researchProduct

Wireless Acoustic Sensor Networks and Applications

2017

Article Subjectbusiness.industryComputer scienceComputer Networks and Communicationslcsh:T010401 analytical chemistryElectrical engineeringAcoustic sensor020206 networking & telecommunications02 engineering and technologyInformation Systems; Computer Networks and Communications; Electrical and Electronic Engineering01 natural scienceslcsh:Technology0104 chemical scienceslcsh:Telecommunicationlcsh:TK5101-67200202 electrical engineering electronic engineering information engineeringInformation systemWirelessElectrical and Electronic EngineeringbusinessInformation SystemsWireless Communications and Mobile Computing
researchProduct

Combining Inter-Subject Modeling with a Subject-Based Data Transformation to Improve Affect Recognition from EEG Signals

2019

Existing correlations between features extracted from Electroencephalography (EEG) signals and emotional aspects have motivated the development of a diversity of EEG-based affect detection methods. Both intra-subject and inter-subject approaches have been used in this context. Intra-subject approaches generally suffer from the small sample problem, and require the collection of exhaustive data for each new user before the detection system is usable. On the contrary, inter-subject models do not account for the personality and physiological influence of how the individual is feeling and expressing emotions. In this paper, we analyze both modeling approaches, using three public repositories. T…

Normalization (statistics)Data AnalysisSupport Vector MachineDatabases FactualComputer sciencemedia_common.quotation_subjectEmotionsData transformation (statistics)Context (language use)02 engineering and technologyvalence detectionElectroencephalographyAffect (psychology)Machine learningcomputer.software_genrelcsh:Chemical technologyBiochemistryModels BiologicalArticleAnalytical Chemistrydata transformation0202 electrical engineering electronic engineering information engineeringmedicinePersonalityHumanslcsh:TP1-1185EEGElectrical and Electronic EngineeringInstrumentationarousal detectionmedia_commonmedicine.diagnostic_testbusiness.industry020206 networking & telecommunicationsSubject (documents)ElectroencephalographySignal Processing Computer-AssistedAtomic and Molecular Physics and Opticsnormalization020201 artificial intelligence & image processingArtificial intelligencebusinessArousalcomputerSensors
researchProduct

Computation of Psycho-Acoustic Annoyance Using Deep Neural Networks

2019

Psycho-acoustic parameters have been extensively used to evaluate the discomfort or pleasure produced by the sounds in our environment. In this context, wireless acoustic sensor networks (WASNs) can be an interesting solution for monitoring subjective annoyance in certain soundscapes, since they can be used to register the evolution of such parameters in time and space. Unfortunately, the calculation of the psycho-acoustic parameters involved in common annoyance models entails a significant computational cost and makes the acquisition and transmission of these parameters at the nodes difficult. As a result, monitoring psycho-acoustic annoyance becomes an expensive and inefficient task. Thi…

Computer scienceComputationsubjective annoyanceContext (language use)Annoyance02 engineering and technologycomputer.software_genre01 natural sciencesConvolutional neural networklcsh:TechnologyReduction (complexity)lcsh:Chemistryconvolutional neural networks0202 electrical engineering electronic engineering information engineeringWirelessGeneral Materials Sciencewireless acoustic sensor networksInstrumentationlcsh:QH301-705.5Fluid Flow and Transfer Processesbusiness.industrylcsh:TProcess Chemistry and Technology010401 analytical chemistryGeneral EngineeringRegression analysislcsh:QC1-9990104 chemical sciencesComputer Science Applicationspsycho-acoustic parametersTransmission (telecommunications)lcsh:Biology (General)lcsh:QD1-999lcsh:TA1-2040020201 artificial intelligence & image processingData miningbusinesslcsh:Engineering (General). Civil engineering (General)Zwicker modelcomputerlcsh:PhysicsApplied Sciences
researchProduct

A Bayesian direction-of-arrival model for an undetermined number of sources using a two-microphone array.

2014

Sound source localization using a two-microphone array is an active area of research, with considerable potential for use with video conferencing, mobile devices, and robotics. Based on the observed time-differences of arrival between sound signals, a probability distribution of the location of the sources is considered to estimate the actual source positions. However, these algorithms assume a given number of sound sources. This paper describes an updated research account of the solution presented in Escolano et al. [J. Acoust. Soc. Am. 132(3), 1257-1260 (2012)], where nested sampling is used to explore a probability distribution of the source position using a Laplacian mixture model, whic…

Microphone arrayAcoustics and UltrasonicsComputer scienceAcousticsBayesian probabilityDirection of arrivalSampling (statistics)DOAAcoustic source localizationMicrophone arraySpeech processingMixture modelBayesianSound source localizationArts and Humanities (miscellaneous)TEORIA DE LA SEÑAL Y COMUNICACIONESProbability distributionAlgorithmNested sampling algorithmThe Journal of the Acoustical Society of America
researchProduct

On the Use of a GPU-Accelerated Mobile Device Processor for Sound Source Localization

2017

The growing interest in incorporating new features into mobile devices has increased the number of signal processing applications running on processors designed for mobile computing. A challenging signal processing field is acoustic source localization, which is attractive for applications such as automatic camera steering systems, human-machine interfaces, video gaming or audio surveillance. In this context, the emergence of systems-on-chip (SoCs) that contain a small graphics accelerator (or GPU) notably increases computational capacity while partially retaining the appealing low power consumption of embedded systems. This is the case, for example, of the Sam…

020203 distributed computingSignal processingbusiness.industryComputer scienceReal-time computingMobile computing020206 networking & telecommunicationsContext (language use)02 engineering and technologyAcoustic source localizationcomputer.software_genreField (computer science)Power (physics)0202 electrical engineering electronic engineering information engineeringGeneral Earth and Planetary SciencesbusinessAudio signal processingMobile devicecomputerComputer hardwareGeneral Environmental ScienceProcedia Computer Science
researchProduct

Performance comparison of container orchestration platforms with low cost devices in the fog, assisting Internet of Things applications

2020

In the last decade there has been increasing interest in and demand for the Internet of Things (IoT) and its applications. However, when a high level of computing and/or real-time processing is required for these applications, different problems arise due to their requirements. In this context, low-cost autonomous and distributed Small Board Computer (SBC) devices with processing and storage capabilities and wireless communications can assist these IoT networks. Usually, these SBC devices run an operating system based on Linux. In this scenario, container-based technologies and fog computing are an interesting approach and both have led to a new paradigm in how devices cooperate, improvi…

Computer Networks and CommunicationsComputer sciencebusiness.industryDistributed computing020206 networking & telecommunications02 engineering and technologyLoad balancing (computing)Virtualizationcomputer.software_genreNetwork topologyComputer Science ApplicationsHardware and ArchitecturePerformance comparisonScalability0202 electrical engineering electronic engineering information engineeringWireless020201 artificial intelligence & image processingOrchestration (computing)businessInternet of ThingscomputerJournal of Network and Computer Applications
researchProduct

Open Set Audio Classification Using Autoencoders Trained on Few Data.

2020

Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which appears when there is no availability of a large number of positive samples for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solution…

Computer scienceOpen set02 engineering and technologylcsh:Chemical technologyMachine learningcomputer.software_genreBiochemistryArticleAnalytical ChemistrySet (abstract data type)open set recognition020204 information systemsaudio classificationautoencoders0202 electrical engineering electronic engineering information engineeringFeature (machine learning)lcsh:TP1-1185few-shot learningElectrical and Electronic EngineeringRepresentation (mathematics)Instrumentationbusiness.industryopen set classificationPerceptronClass (biology)AutoencoderAtomic and Molecular Physics and OpticsEmbedding020201 artificial intelligence & image processingArtificial intelligenceTransfer of learningbusinesscomputerSensors (Basel, Switzerland)
researchProduct

Enabling Real-Time Computation of Psycho-Acoustic Parameters in Acoustic Sensors Using Convolutional Neural Networks

2020

Sensor networks have become an extremely useful tool for monitoring and analysing many aspects of our daily lives. Noise pollution levels are very important today, especially in cities where the number of inhabitants and disturbing sounds are constantly increasing. Psycho-acoustic parameters are a fundamental tool for assessing the degree of discomfort produced by different sounds and, combined with wireless acoustic sensor networks (WASNs), could enable, for example, the efficient implementation of acoustic discomfort maps within smart cities. However, the continuous monitoring of psycho-acoustic parameters to create time-dependent discomfort maps requires a high computational demand that …

Audio signalComputer scienceNoise pollutionbusiness.industryComputation010401 analytical chemistryReal-time computing01 natural sciencesConvolutional neural network0104 chemical sciencesWirelessElectrical and Electronic EngineeringbusinessInstrumentationWireless sensor networkIEEE Sensors Journal
researchProduct

Cumulative-Sum-Based Localization of Sound Events in Low-Cost Wireless Acoustic Sensor Networks

2014

Wireless acoustic sensor networks (WASNs) are known for their potential applications in multiple areas, such as audio-based surveillance, binaural hearing aids or advanced acoustic monitoring. The knowledge of the spatial position of a source of interest is usually a requirement for many of these applications. Therefore, source localization is an important problem to be addressed in WASNs. Unfortunately, most localization algorithms need costly signal processing stages that prevent them from being implemented in low-cost sensor networks, requiring additional modules for signal acquisition and processing. This paper presents a low-complexity method for acoustic event detection and localizati…

Sound localizationSignal processingAcoustics and UltrasonicsComputer sciencebusiness.industrySpeech recognitionNode (networking)Real-time computingCUSUMComputational MathematicsSoftware deploymentComputer Science (miscellaneous)WirelessElectrical and Electronic EngineeringbusinessWireless sensor networkChange detectionIEEE/ACM Transactions on Audio, Speech, and Language Processing
researchProduct
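
The title of the entry above refers to cumulative-sum (CUSUM) change detection. The truncated abstract does not give the paper's detector, so the following is only a generic one-sided CUSUM sketch for spotting the onset of a sound event in a sequence of frame energies. The parameter names and example values are assumptions.

```python
def cusum_detect(samples, baseline, drift, threshold):
    """Index of the first sample where the one-sided CUSUM statistic exceeds
    threshold, or None if it never does."""
    g = 0.0
    for i, x in enumerate(samples):
        # Accumulate deviations above baseline + drift; never go below zero.
        g = max(0.0, g + (x - baseline - drift))
        if g > threshold:
            return i
    return None


# Frame energies: background level near 1.0, then an event near 3.0.
energy = [1.0, 1.1, 0.9, 1.0, 1.0, 3.0, 3.2, 2.9, 3.1]
onset = cusum_detect(energy, baseline=1.0, drift=0.5, threshold=3.0)
```

Each sample costs one addition and two comparisons, which suits low-cost nodes. The drift term suppresses small fluctuations around the background level, and the threshold trades detection delay against false alarms.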

Acoustic Scene Classification with Squeeze-Excitation Residual Networks

2020

Acoustic scene classification (ASC) is a machine listening problem whose objective is to classify/tag an audio clip with a predefined label describing a scene location (e.g., park, airport, etc.). Many state-of-the-art solutions to ASC incorporate data augmentation techniques and model ensembles. However, considerable improvements can also be achieved solely by modifying the architecture of convolutional neural networks (CNNs). In this work we propose two novel squeeze-excitation blocks to improve the accuracy of a CNN-based ASC framework based on residual learning. The main idea of squeeze-excitation blocks is to learn spatial and channel-wise feature maps independently…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; General Computer Science; Calibration (statistics); Computer science; Residual; Convolutional neural network; Field (computer science); Computer Science - Sound; Machine Learning (cs.LG); 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); Acoustic scene classification; Feature (machine learning); FOS: Electrical engineering electronic engineering information engineering; General Materials Science; Block (data storage); Artificial neural network; business.industry; pattern recognition; General Engineering; deep learning; Pattern recognition; machine listening; squeeze-excitation; Artificial intelligence; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0305 other medical science; business; lcsh:TK1-9971; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

A Parallel Approach to HRTF Approximation and Interpolation Based on a Parametric Filter Model

2017

Spatial audio-rendering techniques using head-related transfer functions (HRTFs) are currently used in many different contexts such as immersive teleconferencing systems, gaming, or 3-D audio reproduction. Since all these applications usually involve real-time constraints, efficient processing structures for HRTF modeling and interpolation are necessary for providing real-time binaural audio solutions. This letter presents a parametric parallel model that allows us to perform HRTF filtering and interpolation efficiently from an input HRTF dataset. The resulting model, which is an adaptation from a recently proposed modeling technique, not only reduces the size of HRTF datasets signific…

Computer science; parallel filters; 02 engineering and technology; Solid modeling; binaural synthesis; Transfer function; TECNOLOGIA ELECTRONICA; 030507 speech-language pathology & audiology; 03 medical and health sciences; graphic processing unit (GPU); 0202 electrical engineering electronic engineering information engineering; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; head-related transfer function (HRTF) modeling; Computer vision; Electrical and Electronic Engineering; Adaptation (computer science); Parametric statistics; business.industry; Applied Mathematics; Teleconference; Binaural synthesis; 020206 networking & telecommunications; Filter (signal processing); interpolation; Interpolation; Graphic processing unit (GPU); Signal Processing; Head-related transfer function (HRTF) modeling; Parallel filters; Artificial intelligence; 0305 other medical science; business; Algorithm; Interpolation
researchProduct

Automatic Detection and Characterization of Acoustic Plane-Wave Reflections Using Circular Microphone Arrays

2013

The spatial characteristics of the sound field inside a room can be meaningfully described by means of microphone array processing techniques. In this context, the set of impulse responses sampled by a microphone array can be seen as an image made of acoustic plane-wave footprints. Due to the circular geometry of the microphone array, these footprints have a cosine-like shape that can be fully described as a function of the direction of arrival (DOA) of the impinging plane wave. This paper proposes a Hough-transform-based approach to plane-wave detection in microphone array multi-trace impulse responses. Experiments using a set of real microphone recordings are described, showing the potent…

Microphone array; Microphone; Computer science; Speech recognition; Acoustics; Plane wave; Direction of arrival; Impulse (physics); Hough transform; law.invention; Computer Science::Sound; law; Noise-canceling microphone; Microphone array processing
researchProduct

AI-IoT Platform for Blind Estimation of Room Acoustic Parameters Based on Deep Neural Networks

2023

Room acoustical parameters have been widely used to describe sound perception in indoor environments, such as concert halls, conference rooms, etc. Many of them have been standardized and often have a high computational demand. With the increasing presence of deep learning approaches in automatic monitoring systems, wireless acoustic sensor networks (WASNs) offer great potential to facilitate the estimation of such parameters. In this scenario, Convolutional Neural Networks (CNNs) offer significant reductions in the computational requirements for in-node parameter predictions, enabling the so-called Artificial Intelligence-Internet of Things (AI-IoT). In this paper, we describe the design a…

Internet; Computer Networks and Communications; Hardware and Architecture; Informàtica; Signal Processing; Computer Science Applications; Information Systems; IEEE Internet of Things Journal
researchProduct

Game-based learning supported by audience response tools: game proposals and preliminary assessment

2018

The so-called game-based learning strategies are based on introducing games in the classrooms to improve aspects such as student performance, concentration and effort. Currently, they provide a very useful resource to increase the motivation of university students, generating a better atmosphere among peers and between student and teacher, which in turn is generally translated into better academic results. However, the design of games that successfully achieve the desired teaching-learning objectives is not a trivial task. This work focuses on the design of games that allow the assessment of ICT-related university subjects. Specifically, three different games are proposed, all based on stud…

Game design; Higher education; Educational systems; Game based learning; 010501 environmental sciences; 01 natural sciences; Undergraduate education; Game design; 0502 economics and business; Mathematics education; ComputingMilieux_COMPUTERSANDEDUCATION; Learning; Game-based learning; 0105 earth and related environmental sciences; Online platforms; business.industry; Teaching; 05 social sciences; Undergraduate education; Higher Education; Psychology; business; 050203 business & management; Audience response; Educational systems
researchProduct

Steered Response Power Localization of Acoustic Passband Signals

2017

The vast majority of localization approaches using phase transform (PHAT) consider that the sources of interest are wideband low-pass sources. While this may be the usual case for common audio signals such as speech, PHAT methods are affected negatively by modulation artifacts when the sources to be localized are passband signals. In these cases, steered response power PHAT localization becomes less robust. This letter analyzes the form of generalized cross-correlation functions with PHAT when passband acoustic signals are considered, proposing approaches for increasing the localization performance through the mitigation of these negative effects.

Audio signal; Computer science; Applied Mathematics; Speech recognition; Acoustics; Bandwidth (signal processing); 020206 networking & telecommunications; 02 engineering and technology; 030507 speech-language pathology & audiology; 03 medical and health sciences; Modulation; Signal Processing; 0202 electrical engineering electronic engineering information engineering; Electrical and Electronic Engineering; Wideband; 0305 other medical science; Passband; IEEE Signal Processing Letters
researchProduct
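
For readers unfamiliar with the generalized cross-correlation with phase transform (GCC-PHAT) that the letter above analyzes, a baseline version can be sketched as follows. This is the textbook wideband form only; it does not include the passband-specific mitigation proposed in the letter. The helper names and the synthetic test signal are assumptions made for illustration, and a naive DFT is used so that the sketch needs only the standard library.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short sketches)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat(x, y):
    """Return (lag, cc), where lag is the number of samples by which x is
    delayed relative to y, taken from the peak of the PHAT-weighted GCC."""
    n = 2 * max(len(x), len(y))  # zero-pad so the correlation is linear, not circular
    X = dft(list(x) + [0.0] * (n - len(x)))
    Y = dft(list(y) + [0.0] * (n - len(y)))
    cross = [a * b.conjugate() for a, b in zip(X, Y)]
    # PHAT weighting: discard magnitude and keep only the phase of the cross-spectrum.
    whitened = [c / (abs(c) + 1e-12) for c in cross]
    cc = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda k: cc[k])
    lag = peak if peak < n // 2 else peak - n  # map the upper half to negative lags
    return lag, cc


# Synthetic wideband example: x is y delayed by exactly 3 samples.
rng = random.Random(0)
src = [rng.uniform(-1.0, 1.0) for _ in range(61)]
y = src + [0.0] * 3
x = [0.0] * 3 + src
lag, _ = gcc_phat(x, y)
print(lag)
```

Because the weighting keeps only phase, an ideal wideband delay produces a single sharp peak; the letter's concern is how that peak degrades when the source energy is confined to a passband.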

On the Robustness of Deep Features for Audio Event Classification in Adverse Environments

2018

Deep features, responses to complex input patterns learned within deep neural networks, have recently shown great performance in image recognition tasks, motivating their use for audio analysis tasks as well. These features provide multiple levels of abstraction, which make it possible to select a sufficiently generalized layer to identify classes not seen during training. The generalization capability of such features is very useful due to the lack of complete labeled audio datasets. However, as opposed to classical hand-crafted features such as Mel-frequency cepstral coefficients (MFCCs), the performance impact of having an acoustically adverse environment has not been evaluated in detail. In this p…

Reverberation; Noise measurement; Computer science; Speech recognition; Feature extraction; 02 engineering and technology; Convolutional neural network; 030507 speech-language pathology & audiology; 03 medical and health sciences; Raw audio format; Robustness (computer science); Audio analyzer; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Mel-frequency cepstrum; 0305 other medical science; 2018 14th IEEE International Conference on Signal Processing (ICSP)
researchProduct

A case study on feature sensitivity for audio event classification using support vector machines

2016

Automatic recognition of multiple acoustic events is an interesting problem in machine listening that generalizes the classical speech/non-speech or speech/music classification problem. Typical audio streams contain a diversity of sound events that carry important and useful information on the acoustic environment and context. Classification is usually performed by means of hidden Markov models (HMMs) or support vector machines (SVMs) considering traditional sets of features based on Mel-frequency cepstral coefficients (MFCCs) and their temporal derivatives, as well as the energy from auditory-inspired filterbanks. However, while these features are routinely used by many systems, it is not …

Machine listening; Computer science; business.industry; Event (computing); Speech recognition; Feature extraction; Context (language use); Pattern recognition; 02 engineering and technology; Support vector machine; 030507 speech-language pathology & audiology; 03 medical and health sciences; ComputingMethodologies_PATTERNRECOGNITION; 0202 electrical engineering electronic engineering information engineering; Feature (machine learning); 020201 artificial intelligence & image processing; Artificial intelligence; Mel-frequency cepstrum; 0305 other medical science; business; Hidden Markov model; 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
researchProduct

Self-Localization of Distributed Microphone Arrays Using Directional Statistics with DoA Estimation Reliability

2019

This paper addresses the problem of self-localization of distributed microphone arrays from microphone recordings by following a two-step optimization procedure. In the first step, the relative geometry of the sources and arrays is inferred by the proposed maximum likelihood estimator. It is derived under the assumption that the acquired unit-norm vectors pointing towards the unknown source positions follow a von Mises-Fisher distribution in a D-dimensional space. In the second step, the absolute positions and synchronization offsets between the arrays are estimated from the inferred relative geometry by using the Least Squares procedure. To improve the accuracy of the method, we propose as…

Microphone; Computer science; Directional statistics; 020206 networking & telecommunications; 02 engineering and technology; Space (mathematics); Least squares; Measure (mathematics); Synchronization; Distribution (mathematics); 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Algorithm; Reliability (statistics); 2019 27th European Signal Processing Conference (EUSIPCO)
researchProduct
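
The paper above models unit-norm direction vectors with a von Mises-Fisher (vMF) distribution. Its full two-step estimator is not given in the truncated abstract, but one well-known property of the vMF model can be sketched: the maximum likelihood estimate of the mean direction is the normalized sum of the observed unit vectors. The function names and the three example azimuths below are hypothetical.

```python
import math


def vmf_mean_direction(unit_vectors):
    """ML estimate of the vMF mean direction (normalized vector sum), plus the
    mean resultant length, which approaches 1 for tightly clustered data."""
    dim = len(unit_vectors[0])
    total = [sum(v[k] for v in unit_vectors) for k in range(dim)]
    norm = math.sqrt(sum(c * c for c in total))
    mean_dir = [c / norm for c in total]
    resultant_length = norm / len(unit_vectors)
    return mean_dir, resultant_length


def unit_from_azimuth(az_deg):
    """2-D unit vector pointing at the given azimuth (degrees)."""
    a = math.radians(az_deg)
    return [math.cos(a), math.sin(a)]


# Hypothetical noisy DOA estimates scattered around 30 degrees.
doas = [unit_from_azimuth(a) for a in (28.0, 30.0, 32.0)]
mu, r_bar = vmf_mean_direction(doas)
est_deg = math.degrees(math.atan2(mu[1], mu[0]))
print(round(est_deg, 6), round(r_bar, 4))
```

The same function works unchanged for 3-D vectors, since it only sums and normalizes coordinates.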

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

2020

The interest in deep learning methods for solving traditional signal processing tasks has grown steadily in recent years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutio…

FOS: Computer and information sciences; Sound (cs.SD); Computer science; Phase (waves); Distributed microphones; 02 engineering and technology; Convolutional neural network; Computer Science - Sound; 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; 0202 electrical engineering electronic engineering information engineering; GCC; Representation (mathematics); Signal processing; business.industry; I.5.4; Deep learning; Convolutional Neural Networks; 020206 networking & telecommunications; Time delay estimation; Multilateration; I.2.0; 94A12 68T10; Localization; Artificial intelligence; 0305 other medical science; business; Algorithm; Electrical Engineering and Systems Science - Audio and Speech Processing; I.2.0; I.5.4; ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
researchProduct

Practical considerations for acoustic source localization in the IoT era: Platforms, energy efficiency, and performance

2019

The rapid development of the Internet of Things (IoT) has posed important changes in the way emerging acoustic signal processing applications are conceived. While traditional acoustic processing applications have been developed taking into account high-throughput computing platforms equipped with expensive multichannel audio interfaces, the IoT paradigm is demanding the use of more flexible and energy-efficient systems. In this context, algorithms for source localization and ranging in wireless acoustic sensor networks can be considered an enabling technology for many IoT-based environments, including security, industrial, and health-care applications. This paper is aimed at evaluating impo…

Computer Networks and Communications; Computer science; Distributed computing; Context (language use); 02 engineering and technology; Parallel architectures; 0202 electrical engineering electronic engineering information engineering; Parallel processing; Wireless; Signal processing; Multi-core processor; Heterogeneous (hybrid) systems; business.industry; 020206 networking & telecommunications; Acoustic source localization; Wireless acoustic sensor networks (WASNs); Computer Science Applications; Energy efficiency; Hardware and Architecture; Signal Processing; 020201 artificial intelligence & image processing; Electrónica; business; Wireless sensor network; Source localization; Information Systems; Efficient energy use; Acoustic signal processing
researchProduct

A Robust Wrap Reduction Algorithm for Fringe Projection Profilometry and Applications in Magnetic Resonance Imaging.

2017

In this paper, we present an effective algorithm to reduce the number of wraps in a 2D phase signal provided as input. The technique is based on an accurate estimate of the fundamental frequency of a 2D complex signal with the phase given by the input, and the removal of a dependent additive term from the phase map. Unlike existing methods based on the discrete Fourier transform (DFT), the frequency is computed by using noise-robust estimates that are not restricted to integer values. Then, to deal with the problem of a non-integer shift in the frequency domain, an equivalent operation is carried out on the original phase signal. This consists of the subtraction of a tilted plane whose slop…

Non-uniform discrete Fourier transform; Spectral density estimation; 020206 networking & telecommunications; k-space; 02 engineering and technology; Fundamental frequency; 01 natural sciences; Computer Graphics and Computer-Aided Design; Signal; Discrete Fourier transform; 010309 optics; Frequency domain; 0103 physical sciences; Discrete frequency domain; 0202 electrical engineering electronic engineering information engineering; Algorithm; Software; Mathematics; IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
researchProduct

Analysis of data fusion techniques for multi-microphone audio event detection in adverse environments

2017

Acoustic event detection (AED) is currently a very active research area with multiple applications in the development of smart acoustic spaces. In this context, the advances brought by Internet of Things (IoT) platforms where multiple distributed microphones are available have also contributed to this interest. In such scenarios, the use of data fusion techniques merging information from several sensors becomes an important aspect in the design of multi-microphone AED systems. In this paper, we present a preliminary analysis of several data-fusion techniques aimed at improving the recognition accuracy of an AED system by taking advantage of the diversity provided by multiple microphones in …

Noise measurement; Event (computing); Microphone; Computer science; Real-time computing; Feature extraction; Context (language use); 02 engineering and technology; computer.software_genre; Sensor fusion; 030507 speech-language pathology & audiology; 03 medical and health sciences; 0202 electrical engineering electronic engineering information engineering; Data analysis; 020201 artificial intelligence & image processing; 0305 other medical science; computer; Data integration; 2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP)
researchProduct

Nonnegative signal factorization with learnt instrument models for sound source separation in close-microphone recordings

2013

Close-microphone techniques are extensively employed in many live music recordings, allowing for interference rejection and reducing the amount of reverberation in the resulting instrument tracks. However, despite the use of directional microphones, the recorded tracks are not completely free from source interference, a problem which is commonly known as microphone leakage. While source separation methods are potentially a solution to this problem, few approaches take into account the huge amount of prior information available in this scenario. In fact, besides the special properties of close-microphone tracks, the knowledge on the number and type of instruments making up the mixture can al…

Reverberation; Instruments musicals; Computer science; business.industry; Microphone; Música -- Informàtica; Signal; Non-negative matrix factorization; Set (abstract data type); Factorization; Interference (communication); Source separation; Computer vision; Artificial intelligence; Micròfons; business; EURASIP Journal on Advances in Signal Processing
researchProduct

Fast Channel Estimation in the Transformed Spatial Domain for Analog Millimeter Wave Systems

2021

Fast channel estimation in millimeter-wave (mmWave) systems is a fundamental enabler of high-gain beamforming, which boosts coverage and capacity. The channel estimation stage typically involves an initial beam training process where a subset of the possible beam directions at the transmitter and receiver is scanned along a predefined codebook. Unfortunately, the high number of transmit and receive antennas deployed in mmWave systems increases the complexity of the beam selection and channel estimation tasks. In this work, we tackle the channel estimation problem in analog systems from a different perspective than used by previous works. In particular, we propose to move the channel estimati…

Signal Processing (eess.SP); FOS: Computer and information sciences; Beamforming; Computational complexity theory; Computer science; Computer Science - Information Theory; Information Theory (cs.IT); Applied Mathematics; Transmitter; Codebook; Direction of arrival; Computer Science Applications; Telecomunicació; symbols.namesake; Additive white Gaussian noise; Tecnologia; Robustness (computer science); FOS: Electrical engineering electronic engineering information engineering; symbols; Electrical Engineering and Systems Science - Signal Processing; Electrical and Electronic Engineering; Algorithm; Computer Science::Information Theory; Communication channel; IEEE Transactions on Wireless Communications
researchProduct

Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation

2020

Anomalous sound detection (ASD) is currently one of the topical subjects in the machine listening discipline. Unsupervised detection is attracting a lot of interest due to its immediate applicability in many fields. For example, in industrial processes, the early detection of malfunctions or damage in machines can mean great savings and an improvement in efficiency. This problem can be solved with an unsupervised ASD solution, since industrial machines do not need to be damaged merely to obtain anomalous audio data for the training stage. This paper proposes a novel framework based on convolutional autoencoders (both unsupervised and semi-supervised) and a Gammatone-base…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

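Combinación de cuestionarios simples y gamificados utilizando gestores de participación en el aula: experiencia y percepción del alumnado [Combining simple and gamified questionnaires using classroom participation managers: students' experience and perception]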

2017

The growing use of mobile devices has motivated the development of a wide range of applications to help manage students' participation in the classroom. Socrative allows the lecturer to use multiple-choice questionnaires in the classroom, in either a simple or a gamified mode (Space Race). In this paper, we describe our experience in using this tool to promote competitive learning at both undergraduate and postgraduate levels. The students' perception indicates that the use of the application helped increase engagement and motivation. However, relevant differences were found between the two modes of use, underlining the importance of an adequate activity design.

Educational innovation; Gamification; Participation manager; Socrative; Higher education; Higher-level teaching; Technologies and education
researchProduct

Sound Event Envelope Estimation in Polyphonic Mixtures

2019

Sound event detection is the task of automatically identifying the presence and temporal boundaries of sound events within an input audio stream. In recent years, deep learning methods have established themselves as the state-of-the-art approach for the task, using binary indicators during training to denote whether an event is active or inactive. However, such binary activity indicators do not fully describe the events, and estimating the envelope of the sounds could provide more precise modeling of their activity. This paper proposes to estimate the amplitude envelopes of target sound event classes in polyphonic mixtures. For training, we use the amplitude envelopes of the target sounds…

geography; geography.geographical_feature_category; Computer science; Speech recognition; 02 engineering and technology; 113 Computer and information sciences; Task (project management); 030507 speech-language pathology & audiology; 03 medical and health sciences; Amplitude; Signal-to-noise ratio; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Polyphony; 0305 other medical science; Sound (geography); Envelope (motion); Event (probability theory)
researchProduct

Spatio-Temporal Analysis of Urban Acoustic Environments with Binaural Psycho-Acoustical Considerations for IoT-Based Applications

2018

Sound pleasantness or annoyance perceived in urban soundscapes is a major concern in environmental acoustics. Binaural psychoacoustic parameters are helpful to describe generic acoustic environments, as stated in the ISO 12913 framework. In this paper, the application of a Wireless Acoustic Sensor Network (WASN) to evaluate the spatial distribution and the evolution of urban acoustic environments is described. Two experiments are presented using an indoor and an outdoor deployment of a WASN with several nodes using an Internet of Things (IoT) environment to collect audio data and calculate meaningful parameters such as the sound pressure level, binaural loudness and binaural sharp…

Soundscape; Microphone array; IoT; Computer science; soundscape; Binaural; Real-time computing; Internet of Things; Annoyance; 02 engineering and technology; lcsh:Chemical technology; 01 natural sciences; Biochemistry; Article; Analytical Chemistry; Loudness; spatial statistics; WASN; 0202 electrical engineering electronic engineering information engineering; lcsh:TP1-1185; Psychoacoustics; Electrical and Electronic Engineering; Acoustic; Sound pressure; Instrumentation; acoustic environment; 010401 analytical chemistry; 020206 networking & telecommunications; psychoacoustics; Loudness; Atomic and Molecular Physics and Optics; 0104 chemical sciences; acoustic environment; soundscape; WASN; psychoacoustics; IoT; spatial statistics; Smart Cities; Binaural recording; Sensors; Volume 18; Issue 3; Pages: 690
researchProduct

Adaptive Mid-Term Representations for Robust Audio Event Classification

2018

Low-level audio features are commonly used in many audio analysis tasks, such as audio scene classification or acoustic event detection. Due to the variable length of audio signals, it is a common approach to create fixed-length feature vectors consisting of a set of statistics that summarize the temporal variability of such short-term features. To avoid the loss of temporal information, the audio event can be divided into a set of mid-term segments or texture windows. However, such an approach requires accurately estimating the onset and offset times of the audio events in order to obtain a robust mid-term statistical description of their temporal evolution. This paper proposes the use of…

Audio signal; Acoustics and Ultrasonics; Computer science; business.industry; Feature vector; Pattern recognition; 01 natural sciences; 030507 speech-language pathology & audiology; 03 medical and health sciences; Computational Mathematics; Nonlinear system; Framing (construction); Acoustic event detection; 0103 physical sciences; Audio analyzer; Computer Science (miscellaneous); Segmentation; Artificial intelligence; Electrical and Electronic Engineering; 0305 other medical science; business; 010301 acoustics; Temporal information; IEEE/ACM Transactions on Audio, Speech, and Language Processing
researchProduct
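
The conventional approach described in the abstract above (fixed-length vectors built from statistics over mid-term segments) can be sketched as follows. This shows only the uniform-segmentation baseline, not the adaptive representation the paper goes on to propose, whose description is truncated. For brevity each frame carries a single scalar feature; the function name and the example values are hypothetical.

```python
import statistics


def mid_term_stats(frames, num_segments):
    """Summarize a variable-length sequence of short-term feature values as a
    fixed-length vector holding (mean, std) for each mid-term segment.
    Assumes len(frames) >= num_segments."""
    seg_len = len(frames) // num_segments
    vector = []
    for s in range(num_segments):
        start = s * seg_len
        # The last segment absorbs any leftover frames.
        end = len(frames) if s == num_segments - 1 else start + seg_len
        segment = frames[start:end]
        vector.extend([statistics.mean(segment), statistics.pstdev(segment)])
    return vector


# Hypothetical short-term energies for one audio event (8 frames).
frames = [1.0, 1.0, 1.0, 1.0, 2.0, 4.0, 2.0, 4.0]
print(mid_term_stats(frames, 2))
```

Whatever the event duration, the output length is always `2 * num_segments`, which is what lets a fixed-input classifier consume it. The segment boundaries, however, depend directly on where the event is assumed to start and end.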

Real-time Sound Source Localization on Graphics Processing Units

2013

Sound source localization is an important topic in microphone array signal processing applications, such as camera steering systems, human-machine interaction or surveillance systems. The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is one of the most well-known approaches for sound source localization due to its good performance in noisy and reverberant environments. The algorithm analyzes the sound power captured by a microphone array on a grid of spatial points in a given room. While localization accuracy can be improved by using a high-resolution spatial grid and a high number of microphones, performing the localization task in these circumstances requires …

Microphone array; Coprocessor; Computer science; business.industry; Audio Processing; GPU; Microphone Arrays; Acoustic source localization; Sound power; Grid; computer.software_genre; Sound Source Localization; Computational science; General Earth and Planetary Sciences; Graphics; business; Audio signal processing; computer; Computer hardware; General Environmental Science; Procedia Computer Science
researchProduct

Improving Isolation of Blindly Separated Sources Using Time-Frequency Masking

2008

A refinement technique based on time-frequency masking is proposed to improve source isolation in blind audio source separation algorithms. The refinement technique uses an energy-normalized source-to-interference ratio in order to identify and eliminate interfering energy from the extracted sources. Some examples using this refinement method with different separation algorithms are discussed. The results show that source isolation can be significantly enhanced with negligible degradation of the separated sources.

Masking (art); business.industry; Computer science; Applied Mathematics; Speech recognition; Pattern recognition; computer.software_genre; Blind signal separation; Independent component analysis; Time–frequency analysis; Signal Processing; Source separation; Artificial intelligence; Isolation (database systems); Electrical and Electronic Engineering; Audio signal processing; business; computer; Energy (signal processing); IEEE Signal Processing Letters
researchProduct
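
The refinement idea in the letter above, removing interfering energy from already-separated sources in the time-frequency domain, can be illustrated with a simplified binary mask. This sketch compares raw bin energies of two source estimates and is an assumption-laden stand-in: it omits the energy normalization of the source-to-interference ratio that the letter describes, and the function name, threshold parameter and toy spectrograms are hypothetical.

```python
def refine_by_masking(spec_a, spec_b, ratio_db=0.0):
    """Zero each time-frequency bin of one source estimate where the other
    estimate is stronger by more than `ratio_db` dB. Inputs are magnitude
    spectrograms given as lists of frames (lists of non-negative floats)."""
    factor = 10.0 ** (ratio_db / 10.0)  # dB threshold expressed as an energy ratio
    out_a, out_b = [], []
    for frame_a, frame_b in zip(spec_a, spec_b):
        row_a, row_b = [], []
        for a, b in zip(frame_a, frame_b):
            energy_a, energy_b = a * a, b * b
            # Keep a bin only if the competing source does not dominate it.
            row_a.append(a if energy_a * factor >= energy_b else 0.0)
            row_b.append(b if energy_b * factor >= energy_a else 0.0)
        out_a.append(row_a)
        out_b.append(row_b)
    return out_a, out_b


# Toy 2-frame, 2-bin magnitude spectrograms of two separated sources.
spec_a = [[1.0, 0.1], [0.5, 2.0]]
spec_b = [[0.2, 1.0], [0.5, 0.1]]
masked_a, masked_b = refine_by_masking(spec_a, spec_b)
print(masked_a, masked_b)
```

Raising `ratio_db` makes the mask more tolerant, so fewer bins are zeroed; this trades residual interference against damage to the separated source.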

CNN depth analysis with different channel inputs for Acoustic Scene Classification

2019

Acoustic scene classification (ASC) has been approached in recent years using deep learning techniques such as convolutional neural networks or recurrent neural networks. Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks. Finding the most suitable audio representation is still a research area of interest. In this paper, different log-Mel representations and combinations are analyzed. Experiments show that the best results are obtained using the harmonic and percussive components plus the difference between left and right stereo channels (L-R). On the other hand, it i…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Audio and Speech Processing (eess.AS); FOS: Electrical engineering electronic engineering information engineering; Computer Science - Sound; Machine Learning (cs.LG); Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

On the Design of Probe Signals in Wireless Acoustic Sensor Networks Self-Positioning Algorithms

2018

A wireless acoustic sensor network comprises a distributed group of devices equipped with audio transducers. Typically, these devices can interoperate with each other using wireless links and perform collaborative audio signal processing. Ranging and self-positioning of the network nodes are examples of tasks that can be carried out collaboratively using acoustic signals. However, the environmental conditions can distort the emitted signals and complicate the ranging process. In this context, the selection of proper acoustic signals can facilitate the attainment of this goal and improve the localization accuracy. This letter deals with the design and evaluation of acoustic probe signals all…

Audio signal; Computer science; business.industry; Applied Mathematics; 020208 electrical & electronic engineering; Real-time computing; Bandwidth (signal processing); 020206 networking & telecommunications; Ranging; 02 engineering and technology; computer.software_genre; Transducer; Signal Processing; 0202 electrical engineering electronic engineering information engineering; Chirp; Wireless; Electrical and Electronic Engineering; Audio signal processing; business; Frequency modulation; computer; IEEE Signal Processing Letters
researchProduct

Design and Implementation of Acoustic Source Localization on a Low-Cost IoT Edge Platform

2020

The implementation of algorithms for acoustic source localization on edge platforms for the Internet of Things (IoT) is gaining momentum. Applications based on acoustic monitoring can greatly benefit from efficient implementations of such algorithms, enabling novel services for smart homes and buildings or ambient-assisted living. In this context, this brief proposes an extremely low-cost sound source localization system composed of two microphones and the low-power microcontroller module ESP32. A Direction-Of-Arrival (DOA) algorithm has been implemented taking into account the specific features of this board, showing excellent performance despite the memory constraints imposed by the platform. …

business.industry; Microphone; Computer science; Embedded systems; Process (computing); Esp32; Context (language use); Acoustic source localization; Ingeniería Industrial; Time–frequency analysis; Sound source localization; Microcontroller; Doa estimation; Enhanced Data Rates for GSM Evolution; Electrical and Electronic Engineering; business; Implementation; Computer hardware
researchProduct
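
The abstract above does not say which DOA algorithm runs on the ESP32. As a general illustration of how a two-microphone system can turn a time difference of arrival into an angle, the standard far-field relation sin(theta) = c * tau / d can be sketched as follows. The function name, the 10 cm spacing and the speed of sound of 343 m/s are assumptions made for the example.

```python
import math


def doa_from_tdoa(tau, mic_distance, speed_of_sound=343.0):
    """Far-field direction of arrival in degrees from broadside for a
    two-microphone pair, given the time difference of arrival `tau` (seconds)
    and the microphone spacing `mic_distance` (meters)."""
    s = speed_of_sound * tau / mic_distance
    s = max(-1.0, min(1.0, s))  # clamp: measurement noise can push |s| above 1
    return math.degrees(math.asin(s))


# Hypothetical setup: microphones 10 cm apart, source 30 degrees off broadside.
d = 0.10
tau = d * math.sin(math.radians(30.0)) / 343.0
print(round(doa_from_tdoa(tau, d), 6))
```

With only two microphones the angle is ambiguous between the front and back half-planes, and resolution degrades towards endfire, where the arcsine becomes steep.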

Low-Cost Alternatives for Urban Noise Nuisance Monitoring Using Wireless Sensor Networks

2015

Noise pollution caused by vehicular traffic is a common problem in urban environments that has been shown to affect people's health and children's cognition. In the last decade, several studies have been conducted to assess this noise by measuring the equivalent noise pressure level (called Leq) to acquire an accurate sound map using wireless networks with acoustic sensors. However, even with similar values of Leq, people can perceive the noise differently according to its frequency characteristics. Thus, indexes that can express people's feelings through subjective measures are required. In this paper, we analyze the suitability of using the psychoacoustic metrics given by Zwicker's mod…

Engineering; business.industry; Wireless network; Noise pollution; Real-time computing; Annoyance; Background noise; Noise; Electronic engineering; Wireless; Psychoacoustics; Electrical and Electronic Engineering; business; Instrumentation; Wireless sensor network; IEEE Sensors Journal
researchProduct
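
The equivalent level Leq mentioned in the abstract above is the level of the mean squared sound pressure over the measurement interval, relative to the 20 micropascal reference. A minimal unweighted sketch, assuming the samples are already calibrated in pascals, follows; the function name is hypothetical, and no frequency weighting (such as A-weighting) is applied.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals


def leq_db(pressure_samples):
    """Unweighted equivalent continuous sound level (dB re 20 uPa) of a block
    of calibrated sound-pressure samples given in pascals."""
    mean_square = sum(p * p for p in pressure_samples) / len(pressure_samples)
    return 10.0 * math.log10(mean_square / (P_REF ** 2))


# A constant 1 Pa RMS signal corresponds to about 94 dB.
print(round(leq_db([1.0, -1.0, 1.0, -1.0]), 2))
```

Because Leq collapses the signal to a single energy figure, two sounds with very different spectra can share the same value, which is the limitation that motivates the psychoacoustic metrics studied in the paper.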