Challenges in Generalization in Open Domain Question Answering

by Linqing Liu, Patrick Lewis, Sebastian Riedel, Pontus Stenetorp

Released as an article.

2021  

Abstract

Recent work on Open Domain Question Answering has shown that there is a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is unclear which aspects of novel questions make them challenging. Drawing upon studies on systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel-entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models on the established Natural Questions and TriviaQA datasets, we find that even the strongest model's performance on comp-gen/novel-entity questions is 13.1/5.4% and 9.6/1.5% lower, respectively, than on the full test set, indicating the challenge posed by these types of questions. Furthermore, we show that whilst non-parametric models can handle questions containing novel entities relatively well, they struggle with those requiring compositional generalization. Lastly, we find that the key question-difficulty factors are cascading errors from the retrieval component, the frequency of the question pattern, and the frequency of the entity.
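The per-category numbers in the abstract come from scoring each model's predictions separately within each annotated question bucket and comparing against the full test set. Below is a minimal sketch of that evaluation, assuming a simplified exact-match metric and a hypothetical data layout with 'overlap', 'comp_gen', and 'novel_entity' labels per question; the function and field names are illustrative, not the paper's released code.

```python
# Hypothetical sketch: bucket annotated test questions by generalization
# category and compare per-bucket exact-match (EM) accuracy with the
# accuracy over the full test set.
from collections import defaultdict

def normalize(text: str) -> str:
    # Simplified normalization; standard EM scoring also strips
    # articles and punctuation (SQuAD-style).
    return text.lower().strip()

def exact_match(prediction: str, answers: list[str]) -> bool:
    return normalize(prediction) in {normalize(a) for a in answers}

def per_category_em(examples) -> dict[str, float]:
    # examples: iterable of dicts with keys 'category' (one of
    # 'overlap', 'comp_gen', 'novel_entity'), 'prediction', and
    # 'answers' (list of gold answer strings).
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        for bucket in ("full", ex["category"]):
            totals[bucket] += 1
            hits[bucket] += exact_match(ex["prediction"], ex["answers"])
    return {b: 100.0 * hits[b] / totals[b] for b in totals}

# Usage example with toy data; the gap between a category's EM and the
# 'full' EM is the kind of difference the abstract reports.
scores = per_category_em([
    {"category": "comp_gen", "prediction": "Paris",
     "answers": ["Paris", "paris, france"]},
    {"category": "novel_entity", "prediction": "1969",
     "answers": ["1969"]},
])
for bucket, em in sorted(scores.items()):
    print(f"{bucket}: {em:.1f} EM")
```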


Type: article
Stage: submitted
Date: 2021-12-15
Version: v2
Language: en
arXiv: 2109.01156v2