Learning Multi-instrument Classification with Partial Labels
by Amir Kenarsari Anhari
2020
Abstract
Multi-instrument recognition is the task of predicting the presence or
absence of different instruments within an audio clip. A considerable challenge
in applying deep learning to multi-instrument recognition is the scarcity of
labeled data. OpenMIC is a recent dataset containing 20K polyphonic audio
clips. The dataset is weakly labeled, in that only the presence or absence of
instruments is known for each clip, while the onset and offset times are
unknown. The dataset is also partially labeled, in that only a subset of
the instruments is labeled for each clip.
In this work, we investigate attention-based recurrent neural networks
to address the weak labeling, and we apply several data augmentation
methods to mitigate the partial labeling. Our experiments
show that our approach achieves state-of-the-art results on the OpenMIC
multi-instrument recognition task.
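
To make the two labeling issues concrete, here is a minimal sketch in plain Python. It is not the implementation from the paper. It assumes a hypothetical model that emits one logit and one attention score per time frame for a given instrument, pools them into a single clip-level probability (one common way to handle weak labels), and then scores only the instruments that actually carry a label. The masked loss is a generic way to cope with missing labels and is an assumption here; the abstract itself names data augmentation for that purpose. The function names `attention_pool` and `masked_bce` are illustrative.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_pool(frame_logits, attn_scores):
    """Collapse per-frame logits into one clip-level probability.

    frame_logits: per-frame scores for one instrument (hypothetical model output)
    attn_scores:  unnormalized per-frame attention scores (also hypothetical)
    """
    # Softmax over time so the attention weights sum to 1.
    peak = max(attn_scores)
    exps = [math.exp(s - peak) for s in attn_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted average of the per-frame probabilities.
    return sum(w * sigmoid(z) for w, z in zip(weights, frame_logits))


def masked_bce(probs, labels, mask):
    """Binary cross-entropy averaged over labeled instruments only.

    mask[i] is True when instrument i has a label for this clip;
    unlabeled instruments contribute nothing to the loss.
    """
    eps = 1e-7  # keeps log() away from 0
    losses = []
    for p, y, observed in zip(probs, labels, mask):
        if not observed:
            continue
        p = min(max(p, eps), 1.0 - eps)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
    return sum(losses) / len(losses) if losses else 0.0


# One clip, three frames, first instrument pooled from frame-level outputs.
clip_prob = attention_pool([2.0, -1.0, 0.5], [1.0, 0.0, 0.0])
# Three instruments; the third has no label for this clip and is skipped.
loss = masked_bce([clip_prob, 0.3, 0.9], labels=[1, 0, 0], mask=[True, True, False])
```

Because the attention weights are learned per frame, the pooled probability can be driven by the few frames where an instrument is audible, which is why this kind of pooling suits clips without onset and offset annotations.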
arXiv: 2001.08864v1