Unsupervised Source Separation via Bayesian Inference in the Latent Domain

by Michele Mancusi, Emilian Postolache, Marco Fumero, Andrea Santilli, Luca Cosmo, Emanuele Rodolà

Released as an article.

2021  

Abstract

State-of-the-art audio source separation models rely on supervised data-driven approaches, which can be expensive in terms of labeling resources. On the other hand, approaches for training these models without any direct supervision are typically demanding in terms of memory and time, and remain impractical to use at inference time. We aim to tackle these limitations by proposing a simple yet effective unsupervised separation algorithm, which operates directly on a latent representation of time-domain signals. Our algorithm relies on deep Bayesian priors in the form of pre-trained autoregressive networks to model the probability distributions of each source. We leverage the low cardinality of the discrete latent space, trained with a novel loss term imposing a precise arithmetic structure on it, to perform exact Bayesian inference without relying on an approximation strategy. We validate our approach on the Slakh dataset (arXiv:1909.08494), demonstrating results in line with state-of-the-art supervised approaches while requiring fewer resources than other unsupervised methods.
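The abstract's central idea, that a small discrete latent space allows the posterior to be computed exactly by enumeration, can be illustrated with a toy sketch. This is not the authors' implementation. The function name `exact_posterior`, the Gaussian likelihood, the `sigma` parameter, and the assumption that the mixture's latent vector lies close to the sum of the two source code vectors are all hypothetical choices made for illustration, and the fixed prior vectors stand in for what the paper describes as pre-trained autoregressive priors.

```python
import numpy as np

def exact_posterior(prior_a, prior_b, codebook, mixture_latent, sigma=0.5):
    """Posterior over all (i, j) code pairs for one latent step.

    Assumption (illustrative only): the mixture latent is close to
    codebook[i] + codebook[j], scored with an isotropic Gaussian.
    """
    # All K*K candidate sums of two code vectors, shape (K, K, D).
    pair_sums = codebook[:, None, :] + codebook[None, :, :]
    # Squared distance of each candidate sum to the observed mixture latent.
    sq_dist = np.sum((pair_sums - mixture_latent) ** 2, axis=-1)
    # Bayes' rule in log space: log prior of each source plus log likelihood.
    log_post = (
        np.log(prior_a)[:, None]
        + np.log(prior_b)[None, :]
        - sq_dist / (2.0 * sigma ** 2)
    )
    # Normalize over the full K*K table; no sampling or approximation needed.
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Toy codebook with K = 4 codes in D = 2 dimensions.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Stand-in priors for two sources at the current step.
prior_a = np.array([0.1, 0.7, 0.1, 0.1])
prior_b = np.array([0.1, 0.1, 0.7, 0.1])
# Observed mixture latent, built here as the sum of code 1 and code 2.
mixture_latent = codebook[1] + codebook[2]

posterior = exact_posterior(prior_a, prior_b, codebook, mixture_latent)
best_a, best_b = np.unravel_index(np.argmax(posterior), posterior.shape)
```

Several code pairs sum to the same mixture latent in this toy setup, so the likelihood alone is ambiguous and the per-source priors decide which pair is most probable. The table has only K × K entries, which is why enumeration stays cheap when the codebook is small.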

Archived Files and Locations

application/pdf  444.2 kB
file_2jotqmyae5fqxd5jiwcxzw4hhi
arxiv.org (repository)
web.archive.org (webarchive)
Preserved and Accessible
Type  article
Stage   submitted
Date   2021-10-11
Version   v1
Language   en
arXiv  2110.05313v1
Catalog Record
Revision: e95a37cd-e8f9-4f6a-88b5-5c6a59a5c350