
AUTHOR

Augusto Sarti

showing 3 related works from this author

Ray-Space-Based Multichannel Nonnegative Matrix Factorization for Audio Source Separation

2021

Nonnegative matrix factorization (NMF) has traditionally been considered a promising approach for audio source separation. While standard NMF is only suited for single-channel mixtures, extensions to multichannel data have also been proposed. Among the most popular alternatives, multichannel NMF (MNMF) and further derivations based on constrained spatial covariance models have been successfully employed to separate multi-microphone convolutive mixtures. This letter proposes an MNMF extension by considering a mixture model with Ray-Space-transformed signals, where magnitude data successfully encodes source locations as frequency-independent linear patterns. We show that the MNMF alg…

Covariance function | Computer science | Applied Mathematics | 020206 networking & telecommunications | 02 engineering and technology | Extension (predicate logic) | Mixture model | Matrix decomposition | Non-negative matrix factorization | Time–frequency analysis | blind source separation | Signal Processing | 0202 electrical engineering electronic engineering information engineering | Source separation | Non-negative matrix factorization (NMF) | array signal processing | Electrical and Electronic Engineering | Algorithm | IEEE Signal Processing Letters

Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach

2020

The generalized cross-correlation (GCC) is regarded as the most popular approach for estimating the time difference of arrival (TDOA) between the signals received at two sensors. Time delay estimates are obtained by maximizing the GCC output, where the direct-path delay is usually observed as a prominent peak. Moreover, GCCs also play an important role in steered response power (SRP) localization algorithms, where the SRP functional can be written as an accumulation of the GCCs computed from multiple sensor pairs. Unfortunately, the accuracy of TDOA estimates is affected by multiple factors, including noise, reverberation and signal bandwidth. In this paper, a sub-band approach for time del…

Reverberation | weighted SVD | Acoustics and Ultrasonics | Cross-correlation | Computer science | Noise (signal processing) | SRP-PHAT | Matrix representation | Time delay estimation | Multilateration | Computational Mathematics | sub-band processing | Audio and Speech Processing (eess.AS) | Temporal resolution | Singular value decomposition | Computer Science (miscellaneous) | FOS: Electrical engineering electronic engineering information engineering | GCC | Electrical and Electronic Engineering | Representation (mathematics) | SVD | Algorithm | Electrical Engineering and Systems Science - Audio and Speech Processing

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

2020

Interest in deep learning methods for solving traditional signal processing tasks has been growing steadily in recent years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutio…

FOS: Computer and information sciences | Sound (cs.SD) | Computer science | Phase (waves) | Distributed microphones | 02 engineering and technology | Convolutional neural network | Computer Science - Sound | 030507 speech-language pathology & audiology | 03 medical and health sciences | Audio and Speech Processing (eess.AS) | FOS: Electrical engineering electronic engineering information engineering | 0202 electrical engineering electronic engineering information engineering | GCC | Representation (mathematics) | Signal processing | business.industry | I.5.4 | Deep learning | Convolutional Neural Networks | 020206 networking & telecommunications | Time delay estimation | Multilateration | I.2.0 | 94A12 68T10 | Localization | Artificial intelligence | 0305 other medical science | business | Algorithm | Electrical Engineering and Systems Science - Audio and Speech Processing | I.2.0; I.5.4 | ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)