AUTHOR

Fabio Antonacci

showing 5 related works from this author

Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation

2021

Nonnegative matrix factorization (NMF) has traditionally been considered a promising approach for audio source separation. While standard NMF is only suited to single-channel mixtures, extensions that handle multichannel data have also been proposed. Among the most popular alternatives, multichannel NMF (MNMF) and further derivations based on constrained spatial covariance models have been successfully employed to separate multi-microphone convolutive mixtures. This letter proposes an MNMF extension by considering a mixture model with Ray-Space-transformed signals, where magnitude data successfully encodes source locations as frequency-independent linear patterns. We show that the MNMF alg…

Covariance function | Computer science | Applied Mathematics | 020206 networking & telecommunications | 02 engineering and technology | Extension (predicate logic) | Mixture model | Matrix decomposition | Non-negative matrix factorization | Time–frequency analysis | blind source separation | Signal Processing | 0202 electrical engineering electronic engineering information engineering | Source separation | Non-negative matrix factorization (NMF) | array signal processing | Electrical and Electronic Engineering | Algorithm | IEEE Signal Processing Letters
researchProduct
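
The abstract above builds on standard single-channel NMF. As a point of reference, the following is a minimal sketch of that baseline only (Euclidean-cost multiplicative updates), not the Ray-Space MNMF extension proposed in the letter. The function names, the toy matrix `V`, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H, with W, H >= 0.

    Uses multiplicative updates for the Euclidean cost; eps avoids division by zero.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral templates (assumed random init)
    H = rng.random((rank, n_frames)) + eps  # temporal activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

def relative_error(V, W, H):
    """Frobenius-norm reconstruction error relative to the norm of V."""
    return float(np.linalg.norm(V - W @ H) / np.linalg.norm(V))

# Toy stand-in for a magnitude spectrogram: an exactly rank-3 nonnegative matrix.
demo_rng = np.random.default_rng(1)
V = demo_rng.random((20, 3)) @ demo_rng.random((3, 30))
W, H = nmf(V, rank=3)
print(relative_error(V, W, H))
```

In a separation setting, each column of `W` would be read as a spectral pattern and each row of `H` as its activation over time; how components are grouped into sources is outside the scope of this sketch.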

Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach

2020

The generalized cross correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs also play an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA estimates is affected by multiple factors, including noise, reverberation and signal bandwidth. In this paper, a sub-band approach for time del…

Reverberation | weighted SVD | Acoustics and Ultrasonics | Cross-correlation | Computer science | Noise (signal processing) | SRP-PHAT | Matrix representation | Time delay estimation | Multilateration | Computational Mathematics | sub-band processing | Audio and Speech Processing (eess.AS) | Temporal resolution | Singular value decomposition | Computer Science (miscellaneous) | FOS: Electrical engineering electronic engineering information engineering | GCC | Electrical and Electronic Engineering | Representation (mathematics) | SVD | Algorithm | Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct
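
The abstract above describes TDOA estimation as locating the peak of the GCC output. A minimal sketch of the conventional full-band GCC with PHAT weighting follows; it is the classical baseline, not the frequency-sliding sub-band method proposed in the paper. The function name `gcc_phat`, the sampling rate, and the synthetic signals are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(x1, x2, fs, max_tau=None):
    """Estimate the delay of x1 relative to x2 (in seconds) with GCC-PHAT.

    A positive result means x1 is a delayed copy of x2.
    """
    n = len(x1) + len(x2)                 # zero-pad so the correlation is linear, not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X1 * np.conj(X2)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[n - max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift  # the peak location gives the delay in samples
    return lag / fs, cc

# Synthetic two-sensor example: sensor 1 receives the source 12 samples later.
fs = 16000
true_delay = 12
sig_rng = np.random.default_rng(0)
source = sig_rng.standard_normal(4096)
x2 = source
x1 = np.concatenate((np.zeros(true_delay), source[:-true_delay]))
x1 = x1 + 0.01 * sig_rng.standard_normal(len(x1))  # light sensor noise
tau_hat, cc = gcc_phat(x1, x2, fs)
print(tau_hat * fs)  # estimated delay in samples
```

Passing `max_tau` (for instance the sensor spacing divided by the speed of sound) restricts the search to physically plausible lags, which helps when noise or reverberation produces spurious peaks.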

Wireless Acoustic Sensor Networks and Applications

2017

Article Subject | business.industry | Computer science | Computer Networks and Communications | lcsh:T | 010401 analytical chemistry | Electrical engineering | Acoustic sensor | 020206 networking & telecommunications | 02 engineering and technology | Information Systems; Computer Networks and Communications; Electrical and Electronic Engineering | 01 natural sciences | lcsh:Technology | 0104 chemical sciences | lcsh:Telecommunication | lcsh:TK5101-6720 | 0202 electrical engineering electronic engineering information engineering | Information system | Wireless | Electrical and Electronic Engineering | business | Information Systems | Wireless Communications and Mobile Computing
researchProduct

Open Set Audio Classification Using Autoencoders Trained on Few Data

2020

Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which arises when a large number of positive samples is not available for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solution…

Computer science | Open set | 02 engineering and technology | lcsh:Chemical technology | Machine learning | computer.software_genre | Biochemistry | Article | Analytical Chemistry | Set (abstract data type) | open set recognition | 020204 information systems | audio classification | autoencoders | 0202 electrical engineering electronic engineering information engineering | Feature (machine learning) | lcsh:TP1-1185 | few-shot learning | Electrical and Electronic Engineering | Representation (mathematics) | Instrumentation | business.industry | open set classification | Perceptron | Class (biology) | Autoencoder | Atomic and Molecular Physics and Optics | Embedding | 020201 artificial intelligence & image processing | Artificial intelligence | Transfer of learning | business | computer | Sensors (Basel, Switzerland)
researchProduct
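
The open-set idea in the abstract above (accept samples from known classes, reject everything else) can be illustrated with a generic reconstruction-error rule. The sketch below is not the architecture from the paper, whose details are not given here: it swaps in a linear autoencoder (equivalent to PCA) per known class, trained on five samples each, and rejects a sample when no class model reconstructs it well. The class names, feature dimension, synthetic data, and the threshold value are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(X, n_components):
    """Fit a linear autoencoder (PCA) to the few samples in X (n_samples x dim)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]          # encoder/decoder share these components

def reconstruction_error(x, model):
    """Norm of the part of x that the class model cannot reconstruct."""
    mean, comps = model
    centered = x - mean
    recon = (centered @ comps.T) @ comps    # encode, then decode
    return float(np.linalg.norm(centered - recon))

def classify_open_set(x, models, threshold):
    """Return the best-fitting known class, or 'unknown' if every model fits poorly."""
    errors = {label: reconstruction_error(x, m) for label, m in models.items()}
    best = min(errors, key=errors.get)
    return best if errors[best] <= threshold else "unknown"

# Synthetic features: each known class lies near its own 2-D affine subspace.
rng = np.random.default_rng(0)
dim, shots = 16, 5

def sample(mean, basis, n):
    latent = rng.standard_normal((n, 2))
    return mean + latent @ basis + 0.01 * rng.standard_normal((n, dim))

classes = {name: (5.0 * rng.standard_normal(dim), rng.standard_normal((2, dim)))
           for name in ("dog_bark", "doorbell")}
models = {name: fit_linear_autoencoder(sample(m, b, shots), 2)
          for name, (m, b) in classes.items()}

known = sample(*classes["dog_bark"], 1)[0]   # new sample from a trained class
novel = 5.0 * rng.standard_normal(dim)       # sample from no trained class
known_label = classify_open_set(known, models, threshold=2.0)
novel_label = classify_open_set(novel, models, threshold=2.0)
print(known_label, novel_label)
```

The threshold controls the trade-off between rejecting unknown samples and wrongly rejecting known ones; in practice it would be tuned on held-out data rather than fixed by hand as it is here.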

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

2020

Interest in deep learning methods for solving traditional signal processing tasks has been growing steadily in recent years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutio…

FOS: Computer and information sciences | Sound (cs.SD) | Computer science | Phase (waves) | Distributed microphones | 02 engineering and technology | Convolutional neural network | Computer Science - Sound | 030507 speech-language pathology & audiology | 03 medical and health sciences | Audio and Speech Processing (eess.AS) | FOS: Electrical engineering electronic engineering information engineering | 0202 electrical engineering electronic engineering information engineering | GCC | Representation (mathematics) | Signal processing | business.industry | I.5.4 | Deep learning | Convolutional Neural Networks | 020206 networking & telecommunications | Time delay estimation | Multilateration | I.2.0 | 94A12 68T10 | Localization | Artificial intelligence | 0305 other medical science | business | Algorithm | Electrical Engineering and Systems Science - Audio and Speech Processing | I.2.0; I.5.4 | ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
researchProduct