Revisiting Adversarial Risk
by
Arun Sai Suggala, Adarsh Prasad, Vaishnavh Nagarajan, Pradeep Ravikumar
2018
Abstract
Recent works on adversarial perturbations show that there is an inherent
trade-off between standard test accuracy and adversarial accuracy.
Specifically, they show that no classifier can simultaneously be robust to
adversarial perturbations and achieve high standard test accuracy. However,
this is contrary to the standard notion that on tasks such as image
classification, humans are robust classifiers with a low error rate. In this
work, we show that the main reason behind this confusion is the inexact
definition of adversarial perturbation that is used in the literature. To fix
this issue, we propose a slight, yet important modification to the existing
definition of adversarial perturbation. Based on the modified definition, we
show that there is no trade-off between adversarial and standard accuracies;
there exist classifiers that are robust and achieve high standard accuracy. We
further study several properties of this new definition of adversarial risk and
its relation to the existing definition.
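For context, the "existing definition" of adversarial risk that the abstract refers to is commonly written as follows. This is a generic formulation from the adversarial robustness literature, not notation taken from this paper; the symbols $f$, $\mathcal{D}$, $\delta$, and $\epsilon$ are assumptions for illustration:

```latex
% Standard notion of adversarial risk for a classifier f,
% data distribution D, and perturbation budget epsilon:
R_{\mathrm{adv}}(f) \;=\;
  \mathbb{E}_{(x,\,y) \sim \mathcal{D}}
  \Big[ \max_{\|\delta\| \le \epsilon}
        \mathbf{1}\{\, f(x + \delta) \ne y \,\} \Big]
```

On one reading, this formulation is inexact in the sense the abstract describes: the perturbed input $x + \delta$ is still compared against the original label $y$, even when the perturbation is large enough to genuinely change the correct label, so a classifier can be penalized for answering correctly on $x + \delta$.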
Archived Files and Locations
application/pdf, 435.7 kB (file_3scdqj47ajcxbkoobxhgbkgr4e)
Locations: arxiv.org (repository), web.archive.org (webarchive)
arXiv version: 1806.02924v2
Access all versions, variants, and formats of this work (e.g., pre-prints).