Multi-Mention Learning for Reading Comprehension with Neural Cascades
by Swabha Swayamdipta, Ankur P. Parikh, Tom Kwiatkowski (2017)
Abstract
Reading comprehension is a challenging task, especially when executed across
longer documents or across multiple evidence documents, where the answer is
likely to reoccur. Existing neural architectures typically do not scale to the
entire evidence, and hence resort to selecting a single passage in the document
(either via truncation or other means) and carefully searching for the answer
within that passage. However, in some cases this strategy can be suboptimal,
since by focusing on a specific passage it becomes difficult to leverage
multiple mentions of the same answer throughout the document. In this work, we
take a different approach by constructing lightweight models that are combined
in a cascade to find the answer. Each submodel consists only of feed-forward
networks equipped with an attention mechanism, making it trivially
parallelizable. We show that our approach can scale to evidence documents
approximately an order of magnitude longer and can aggregate information at the
representation level from multiple mentions of each answer candidate across the
document. Empirically, our approach achieves state-of-the-art performance on
both the Wikipedia and web domains of the TriviaQA dataset, outperforming more
complex, recurrent architectures.
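
The abstract describes the model only at a high level. As a rough
illustration of the multi-mention idea, the NumPy sketch below scores each
mention of a candidate answer independently with a small feed-forward scorer
over a question-attended mention representation, then pools the per-mention
scores for each candidate. All names, dimensions, and the logsumexp pooling
are illustrative assumptions, not the paper's exact model: the paper
aggregates at the representation level and cascades several submodels.

import numpy as np

rng = np.random.default_rng(0)

def feed_forward(x, W1, b1, W2, b2):
    # Two-layer feed-forward scorer with ReLU; one scalar score per row of x.
    h = np.maximum(0.0, x @ W1 + b1)
    return (h @ W2 + b2).squeeze(-1)

def attend(question, mention_tokens):
    # Dot-product attention: summarize a mention's token vectors under the
    # question vector (a simple stand-in for the paper's attention mechanism).
    scores = mention_tokens @ question
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ mention_tokens

d = 16
W1, b1 = 0.1 * rng.normal(size=(2 * d, 32)), np.zeros(32)
W2, b2 = 0.1 * rng.normal(size=(32, 1)), np.zeros(1)

question = rng.normal(size=d)

# Each candidate answer may be mentioned several times in the evidence;
# every mention is represented by a small bag of token vectors.
candidates = {
    "candidate_a": [rng.normal(size=(5, d)) for _ in range(3)],  # 3 mentions
    "candidate_b": [rng.normal(size=(4, d))],                    # 1 mention
}

candidate_scores = {}
for name, mentions in candidates.items():
    # Score every mention independently; with no recurrence linking one
    # mention to another, this step is trivially parallelizable.
    reps = np.stack([attend(question, m) for m in mentions])
    feats = np.concatenate(
        [reps, np.broadcast_to(question, reps.shape)], axis=-1)
    mention_scores = feed_forward(feats, W1, b1, W2, b2)
    # Pool evidence across all mentions of the same candidate
    # (logsumexp here, a smooth stand-in for sum/max aggregation).
    candidate_scores[name] = np.logaddexp.reduce(mention_scores)

print(max(candidate_scores, key=candidate_scores.get))

Because each mention is scored by the same feed-forward scorer, a candidate
mentioned many times can accumulate evidence that no single passage provides,
which is the property the abstract contrasts with single-passage selection.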
Archived Files and Locations
application/pdf, 495.7 kB (arXiv:1711.00894v1): arxiv.org (repository); web.archive.org (webarchive)