PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition
by
Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David Cox, James Glass
2021
Abstract
Recent work on speech self-supervised learning (speech SSL) demonstrated the
benefits of scale in learning rich and transferable representations for
Automatic Speech Recognition (ASR) with limited parallel data. It is then
natural to investigate the existence of sparse and transferable subnetworks in
pre-trained speech SSL models that can achieve even better low-resource ASR
performance. However, directly applying widely adopted pruning methods such as
the Lottery Ticket Hypothesis (LTH) is suboptimal in terms of the computational
cost required. Moreover, contrary to what LTH predicts, the discovered
subnetworks yield minimal performance gain over the original dense network. In
this work, we propose Prune-Adjust-Re-Prune (PARP), which discovers and
finetunes subnetworks for much better ASR performance, while requiring only a
single downstream finetuning run. PARP is inspired by our surprising
observation that subnetworks pruned for pre-training tasks need only slight
adjustment to achieve a sizeable performance boost on downstream ASR tasks.
Extensive experiments on low-resource English and multilingual ASR show that
(1) sparse subnetworks exist in pre-trained speech SSL models, and (2) PARP
offers a computational advantage and performance gain over baseline pruning
methods. On the 10min Librispeech split without LM decoding, PARP discovers
subnetworks from wav2vec 2.0 with an absolute 10.9%/12.6% WER decrease compared
to the full model. We also demonstrate that PARP mitigates performance
degradation in cross-lingual mask transfer, and investigate the possibility of
discovering a single subnetwork for 10 spoken languages in one run.
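The abstract outlines the three PARP stages (prune, adjust, re-prune) at a high
level. The sketch below illustrates one plausible reading of that loop, assuming
unstructured global magnitude pruning and a generic PyTorch finetuning setup;
the function names, the loss placeholder, and the re-pruning interval are
illustrative assumptions, not the paper's actual implementation.

import torch

def magnitude_prune(model, sparsity):
    # Zero out the smallest-magnitude weights globally (unstructured pruning); return the masks.
    all_weights = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(sparsity * all_weights.numel()))
    threshold = torch.kthvalue(all_weights, k).values
    masks = {}
    for name, p in model.named_parameters():
        masks[name] = (p.detach().abs() > threshold).float()
        p.data.mul_(masks[name])
    return masks

def parp_finetune(model, loader, optimizer, sparsity, reprune_every=50):
    # 1) Prune: take the initial mask directly from the pre-trained (task-agnostic) weights.
    magnitude_prune(model, sparsity)
    for step, batch in enumerate(loader):
        # 2) Adjust: finetune without enforcing the mask, so zeroed weights may recover.
        loss = model(**batch).loss  # placeholder for a downstream CTC finetuning loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        # 3) Re-prune: periodically re-apply magnitude pruning at the target sparsity.
        if (step + 1) % reprune_every == 0:
            magnitude_prune(model, sparsity)
    # The final mask defines the discovered subnetwork.
    return magnitude_prune(model, sparsity)

Because pruning and adjustment happen within a single finetuning run, this loop
avoids the repeated train-prune-rewind cycles that make LTH-style pruning
computationally expensive.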
Archived Files and Locations
application/pdf, 12.4 MB — arxiv.org (repository), web.archive.org (webarchive) — arXiv:2106.05933v1