Constructing Long Short-Term Memory based Deep Recurrent Neural Networks
for Large Vocabulary Speech Recognition
by
Xiangang Li, Xihong Wu
2015
Abstract
Long short-term memory (LSTM) based acoustic modeling methods have recently
been shown to give state-of-the-art performance on some speech recognition
tasks. To achieve further performance improvements, this research investigates
deep extensions of the LSTM, motivated by the observation that deep
hierarchical models have proven more effective than shallow ones. Building on
previous work on constructing deep recurrent neural networks (RNNs),
alternative deep LSTM architectures are proposed and empirically evaluated on
a large vocabulary conversational telephone speech recognition task. In
addition, a training procedure for LSTM networks on multi-GPU devices is
introduced and discussed. Experimental results demonstrate that the deep LSTM
networks benefit from the depth and yield state-of-the-art performance on
this task.
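The simplest of the deep extensions discussed in such work is stacking LSTM layers so that each layer's output sequence becomes the next layer's input. Below is a minimal NumPy sketch of that stacked construction for intuition only; it is not the paper's implementation, and the dimensions, weight layout, and gate ordering are illustrative assumptions.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps [x; h_prev] to the four gate pre-activations
    (gate ordering here is an arbitrary choice: input, forget, output, candidate)."""
    z = np.concatenate([x, h_prev]) @ W + b          # shape (4H,)
    H = h_prev.shape[0]
    i = 1 / (1 + np.exp(-z[:H]))                     # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))                  # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))                # output gate
    g = np.tanh(z[3*H:])                             # candidate cell update
    c = f * c_prev + i * g                           # new cell state
    h = o * np.tanh(c)                               # new hidden state
    return h, c

def deep_lstm_forward(frames, weights):
    """Run a stack of LSTM layers over a sequence of frames.

    frames: list of input vectors; weights: list of (W, b) per layer.
    Each layer's hidden-state sequence feeds the layer above it."""
    for W, b in weights:
        H = b.shape[0] // 4
        h, c = np.zeros(H), np.zeros(H)
        out = []
        for x in frames:
            h, c = lstm_step(x, h, c, W, b)
            out.append(h)
        frames = out                                  # input to next layer
    return frames

# Toy example (hypothetical sizes): 2-layer stack, 8-dim frames, 16 hidden units.
rng = np.random.default_rng(0)
D, H, T = 8, 16, 5
weights = [
    (0.1 * rng.standard_normal((D + H, 4 * H)), np.zeros(4 * H)),
    (0.1 * rng.standard_normal((H + H, 4 * H)), np.zeros(4 * H)),
]
frames = [rng.standard_normal(D) for _ in range(T)]
outputs = deep_lstm_forward(frames, weights)
print(len(outputs), outputs[0].shape)  # one 16-dim hidden vector per frame
```

In a full acoustic model, the top layer's hidden vectors would be projected to phone-state posteriors; the deeper architectures the paper compares differ in where and how such depth is inserted, not only in plain layer stacking.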
Archived as application/pdf (126.9 kB) at arxiv.org and web.archive.org: arXiv 1410.4281v2.