Benchmarking Approximate Inference Methods for Neural Structured
Prediction
by
Lifu Tu, Kevin Gimpel
2019
Abstract
Exact structured inference with neural network scoring functions is
computationally challenging, but several methods have been proposed for
approximating inference. One approach is to perform gradient descent with
respect to the output structure directly (Belanger and McCallum, 2016). Another
approach, proposed recently, is to train a neural network (an "inference
network") to perform inference (Tu and Gimpel, 2018). In this paper, we compare
these two families of inference methods on three sequence labeling datasets. We
choose sequence labeling because it permits us to use exact inference as a
benchmark in terms of speed, accuracy, and search error. Across datasets, we
demonstrate that inference networks achieve a better speed/accuracy/search
error trade-off than gradient descent, while also being faster than exact
inference at similar accuracy levels. We find further benefit by combining
inference networks and gradient descent, using the former to provide a warm
start for the latter.
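To make the first family concrete, the following is a minimal, illustrative sketch of gradient-based inference over a relaxed output structure, in the spirit of Belanger and McCallum (2016). All names (`unary`, `trans`, `gradient_inference`) and the exponentiated-gradient update used to stay on the probability simplex are assumptions of this sketch, not details taken from the paper; the actual work uses learned neural energy functions rather than the fixed linear-chain scores shown here.

```python
import numpy as np

def softmax(x):
    """Row-wise softmax, numerically stabilized."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def energy(y, unary, trans):
    """Score of a relaxed label assignment y (shape T x L) under
    unary scores (T x L) and label-transition scores (L x L)."""
    pairwise = sum(y[t - 1] @ trans @ y[t] for t in range(1, len(y)))
    return float((y * unary).sum() + pairwise)

def gradient_inference(unary, trans, steps=100, eta=0.5):
    """Maximize the energy by gradient ascent on a relaxed (soft)
    label assignment, then discretize with an argmax."""
    T, L = unary.shape
    y = np.full((T, L), 1.0 / L)  # start from the uniform relaxation
    for _ in range(steps):
        # dE/dy_t = u_t + W^T y_{t-1} + W y_{t+1}
        grad = unary.copy()
        grad[1:] += y[:-1] @ trans    # contribution from the left neighbor
        grad[:-1] += y[1:] @ trans.T  # contribution from the right neighbor
        # Exponentiated-gradient step keeps each row on the simplex.
        y = softmax(np.log(y + 1e-12) + eta * grad)
    return y.argmax(axis=1)
```

An inference network, by contrast, would replace the iterative loop with a single forward pass of a trained network; the warm-start combination in the paper initializes `y` from that network's output instead of the uniform relaxation.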
Archived Files and Locations
application/pdf, 206.5 kB
arxiv.org (repository); web.archive.org (webarchive)
arXiv:1904.01138v2