Are NLP Models really able to Solve Simple Math Word Problems?
by
Arkil Patel, Satwik Bhattamishra, Navin Goyal
2021
Abstract
The problem of designing NLP solvers for math word problems (MWPs) has seen
sustained research activity and steady gains in test accuracy. Since
existing solvers achieve high performance on the benchmark datasets for
elementary level MWPs containing one-unknown arithmetic word problems, such
problems are often considered "solved" with the bulk of research attention
moving to more complex MWPs. In this paper, we restrict our attention to
English MWPs taught in grades four and lower. We provide strong evidence that
the existing MWP solvers rely on shallow heuristics to achieve high performance
on the benchmark datasets. To this end, we show that MWP solvers that do not
have access to the question asked in the MWP can still solve a large fraction
of MWPs. Similarly, models that treat MWPs as bag-of-words can also achieve
surprisingly high accuracy. Further, we introduce a challenge dataset, SVAMP,
created by applying carefully chosen variations over examples sampled from
existing datasets. The best accuracy achieved by state-of-the-art models is
substantially lower on SVAMP, thus showing that much remains to be done even
for the simplest of the MWPs.
Archived Files and Locations
application/pdf 346.2 kB
arxiv.org (repository), web.archive.org (webarchive)
arXiv: 2103.07191v2