Evaluating NLP Systems On a Novel Cloze Task: Judging the Plausibility of Possible Fillers in Instructional Texts
by
Zizhao Hu, Ravikiran Chanumolu, Xingyu Lin, Nayela Ayaz, Vincent Chi
2021
Abstract
The cloze task is widely used to evaluate an NLP system's language
understanding ability. However, most existing cloze tasks only require
NLP systems to give the relatively best prediction for each input sample,
rather than to judge the absolute quality of all possible predictions in a
consistent way across the input domain. We therefore propose a new task:
predicting whether a filler word in a cloze task is a good, neutral, or bad
candidate. More complex variants can extend this to more discrete classes or
to continuous scores. We focus on subtask A of SemEval-2022 Task 7, explore
several possible architectures for this new task, provide a detailed
comparison of them, and propose an ensemble method that improves traditional
models on this task.
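The three-way judgment described above, and the ensemble idea mentioned at the end, can be illustrated with a minimal sketch. This is not the paper's implementation: the label names follow the SemEval-2022 Task 7 convention, the example sentence is invented, and majority voting with a fixed tie-break is just one simple way to combine base-model predictions.

```python
# Hypothetical sketch: three base models each label a (context, filler)
# pair as PLAUSIBLE / NEUTRAL / IMPLAUSIBLE, and a majority vote
# combines their outputs. Tie-breaking by label order is an arbitrary
# choice for this illustration, not the paper's method.
from collections import Counter

LABELS = ("PLAUSIBLE", "NEUTRAL", "IMPLAUSIBLE")

def majority_vote(predictions):
    """Return the most frequent label among base-model predictions.

    Ties are broken deterministically by position in LABELS.
    """
    counts = Counter(predictions)
    return max(counts, key=lambda lab: (counts[lab], -LABELS.index(lab)))

# Invented example: filler "nail" for "She hammered the ___ into the wall."
votes = ["PLAUSIBLE", "PLAUSIBLE", "NEUTRAL"]
print(majority_vote(votes))  # PLAUSIBLE
```

In practice the base models would be trained classifiers (e.g. fine-tuned transformers) rather than hard-coded votes; the point here is only the shape of the task and the combination step.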
Archived Files and Locations
application/pdf, 268.7 kB
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 2112.01867v1