RESEARCH PRODUCT
Sound Event Envelope Estimation in Polyphonic Mixtures
Tuomas Virtanen, Irene Martin-morato, Francesc J. Ferri, Annamaria Mesaros, Maximo Cobos, Toni Heittola
subject
geography; geography.geographical_feature_category; Computer science; Speech recognition; 02 engineering and technology; 113 Computer and information sciences; Task (project management); 030507 speech-language pathology & audiology; 03 medical and health sciences; Amplitude; Signal-to-noise ratio; 0202 electrical engineering electronic engineering information engineering; 020201 artificial intelligence & image processing; Polyphony; 0305 other medical science; Sound (geography); Envelope (motion); Event (probability theory)
description
Sound event detection is the task of automatically identifying the presence and temporal boundaries of sound events within an input audio stream. In recent years, deep learning methods have established themselves as the state-of-the-art approach for the task, using binary indicators during training to denote whether an event is active or inactive. However, such binary activity indicators do not fully describe the events, and estimating the envelope of the sounds could provide more precise modeling of their activity. This paper proposes to estimate the amplitude envelopes of target sound event classes in polyphonic mixtures. For training, we use the amplitude envelopes of the target sounds, calculated from mixture signals and, for comparison, from their isolated counterparts. The model is then used to perform envelope estimation and sound event detection. Results show that envelope estimation allows good modeling of the sounds' activity, with detection results comparable to the current state of the art.
Accepted version. Peer reviewed.
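The contrast the abstract draws between binary activity indicators and amplitude envelopes can be illustrated with a small sketch. This is not the authors' method: the frame-wise RMS envelope, the fixed threshold, the toy signal, and the names `rms_envelope` and `binary_activity` are all assumptions chosen for illustration, since the abstract does not say how the envelopes are computed.

```python
import math


def rms_envelope(signal, frame_len, hop):
    """Frame-wise RMS amplitude envelope of a mono signal (list of floats).

    Assumption: RMS over fixed-length frames stands in for the amplitude
    envelope; the source does not specify the actual definition.
    """
    envelope = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        envelope.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return envelope


def binary_activity(envelope, threshold):
    """Collapse an envelope into a binary active/inactive indicator per frame."""
    return [1 if value > threshold else 0 for value in envelope]


# Hypothetical toy signal: 0.1 s of silence, then a 440 Hz tone fading in.
sample_rate = 8000
silence = [0.0] * 800
tone = [(n / 1600) * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(1600)]
signal = silence + tone

envelope = rms_envelope(signal, frame_len=400, hop=400)
activity = binary_activity(envelope, threshold=0.05)

for frame_index, (env, act) in enumerate(zip(envelope, activity)):
    print(f"frame {frame_index}: envelope={env:.3f} active={act}")
```

In this toy case the binary indicator marks every tone frame identically, while the envelope also records that the tone grows louder from frame to frame, which is the extra information the abstract argues a binary target discards.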
year | journal | country | edition | language |
---|---|---|---|---|
2019-05-01 | | | | |