Multi-Mention Learning for Reading Comprehension with Neural Cascades

by Swabha Swayamdipta, Ankur P. Parikh, Tom Kwiatkowski

Released as an article.

2017  

Abstract

Reading comprehension is a challenging task, especially when executed across longer or across multiple evidence documents, where the answer is likely to reoccur. Existing neural architectures typically do not scale to the entire evidence, and hence, resort to selecting a single passage in the document (either via truncation or other means), and carefully searching for the answer within that passage. However, in some cases, this strategy can be suboptimal, since by focusing on a specific passage, it becomes difficult to leverage multiple mentions of the same answer throughout the document. In this work, we take a different approach by constructing lightweight models that are combined in a cascade to find the answer. Each submodel consists only of feed-forward networks equipped with an attention mechanism, making it trivially parallelizable. We show that our approach can scale to approximately an order of magnitude larger evidence documents and can aggregate information at the representation level from multiple mentions of each answer candidate across the document. Empirically, our approach achieves state-of-the-art performance on both the Wikipedia and web domains of the TriviaQA dataset, outperforming more complex, recurrent architectures.
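The architectural idea in the abstract, recurrence-free submodels built from feed-forward layers and attention whose per-mention evidence is aggregated for each answer candidate, can be illustrated with a small sketch. The snippet below is a minimal, hypothetical NumPy illustration rather than the authors' implementation: the function names, the dot-product attention, and the log-sum-exp aggregation over mention scores are assumptions made for the example, and the paper's actual submodels and aggregation may differ.

```python
import numpy as np

def feed_forward(x, W1, b1, W2, b2):
    # Two-layer ReLU feed-forward block: the only kind of encoder the
    # abstract describes (no recurrence), so every mention can be
    # processed independently and in parallel.
    h = np.maximum(0.0, x @ W1 + b1)
    return h @ W2 + b2

def attend(query, keys, values):
    # Plain dot-product attention over one mention's context tokens,
    # returning a single pooled vector (an assumed attention form).
    scores = keys @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values

def candidate_score(question_vec, mention_contexts, params):
    # Score one answer candidate: score each of its mentions with a
    # lightweight feed-forward + attention submodel, then aggregate
    # across mentions with log-sum-exp (an assumed aggregation choice).
    W1, b1, W2, b2 = params
    mention_scores = []
    for tokens in mention_contexts:              # one (num_tokens, dim) array per mention
        pooled = attend(question_vec, tokens, tokens)
        rep = feed_forward(np.concatenate([question_vec, pooled]), W1, b1, W2, b2)
        mention_scores.append(rep.sum())         # scalar score for this mention
    mention_scores = np.array(mention_scores)
    m = mention_scores.max()
    return m + np.log(np.exp(mention_scores - m).sum())

# Toy usage: one candidate answer with three mentions in a long document.
rng = np.random.default_rng(0)
dim, hidden = 8, 16
params = (rng.normal(size=(2 * dim, hidden)), np.zeros(hidden),
          rng.normal(size=(hidden, 4)), np.zeros(4))
question = rng.normal(size=dim)
mentions = [rng.normal(size=(5, dim)) for _ in range(3)]
print(candidate_score(question, mentions, params))
```

Because each mention is scored independently before aggregation, the computation parallelizes trivially across mentions, which is the property the abstract attributes to the feed-forward cascade.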

Archived Files and Locations

application/pdf  495.7 kB
file_fl23fwwegndjtj5lnmpumhg6ay
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2017-11-02
Version: v1
Language: en
arXiv: 1711.00894v1
Catalog Record
Revision: 1c3c4e47-f2d9-4b30-9c20-2bc2e1708907