Variational Parametric Models for Audio Synthesis
release_qd6oia3dkvgynf4mzwo4466bzu
by
Krishna Subramani, Preeti Rao
2020
Abstract
With the advent of data-driven statistical modeling and abundant computing power, researchers are turning increasingly to deep learning for audio synthesis. These methods try to model audio signals directly in the time or frequency domain. In the interest of more flexible control over the generated sound, it could be more useful to work with a parametric representation of the signal which corresponds more directly to the musical attributes such as pitch, dynamics and timbre. These parametric representations also facilitate better musical control of the synthesized output. We present <strong>VaPar Synth</strong> - a Variational Parametric Synthesizer which utilizes a conditional variational autoencoder trained on a suitable parametric representation. We demonstrate our proposed model's capabilities via the reconstruction and generation of instrumental tones with flexible control over their pitch. We also investigate a parametric model for violin tones, in particular, the generative modeling of the residual bow noise to make for more natural tone quality. To aid in our analysis, we introduce a dataset of Carnatic Violin Recordings where bow noise is an integral part of the playing style of higher-pitched notes in specific gestural contexts. We obtain insights about each of the harmonic and residual components of the signal, as well as their interdependence, via observations on the latent space derived in the course of variational encoding of the spectral envelopes of the sustained sounds.
In text/plain
format
Archived Files and Locations
application/pdf 5.5 MB
file_lbmbmembineh3ku6z3pscyrzi4
|
zenodo.org (repository) web.archive.org (webarchive) |
access all versions, variants, and formats of this works (eg, pre-prints)
Datacite Metadata (via API)
Worldcat
wikidata.org
CORE.ac.uk
Semantic Scholar
Google Scholar