Modelling Adversarial Noise for Adversarial Defense
by
Dawei Zhou, Nannan Wang, Tongliang Liu, Bo Han
2021
Abstract
Deep neural networks have been demonstrated to be vulnerable to adversarial
noise, prompting the development of defenses against adversarial attacks.
Traditionally, adversarial defenses focus on directly exploiting adversarial
examples, either to remove adversarial noise or to train an adversarially
robust target model. Motivated by the observation that the relationship
between adversarial data and natural data can help infer clean data from
adversarial data and thus obtain the final correct prediction, in this paper
we study modelling adversarial noise to learn the transition relationship in
the label space, so that adversarial labels can be used to improve
adversarial accuracy. Specifically, we introduce a transition matrix that
relates adversarial labels to true labels. By exploiting the transition
matrix, we can directly infer clean labels from adversarial labels. We then
propose to employ a deep neural network (i.e., a transition network) to model
the instance-dependent transition matrix from adversarial noise. In addition,
we conduct joint adversarial training on the target model and the transition
network to achieve optimal performance. Empirical evaluations on benchmark
datasets demonstrate that our method significantly improves adversarial
accuracy in comparison to state-of-the-art methods.
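
The abstract's core idea of recovering clean labels through a learned,
instance-dependent transition matrix can be illustrated with a short sketch.
The PyTorch snippet below is a minimal, hypothetical illustration under
stated assumptions, not the authors' implementation: the names
TransitionNetwork, infer_clean_posterior, and feature_dim, as well as the
inversion-based recovery step, are assumptions made here for clarity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TransitionNetwork(nn.Module):
    """Hypothetical sketch: predicts an instance-dependent transition
    matrix T(x) whose entry T_ij(x) approximates
    P(adversarial label = j | true label = i, x)."""
    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        self.num_classes = num_classes
        # Map per-instance features to the K*K entries of T(x).
        self.head = nn.Linear(feature_dim, num_classes * num_classes)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        k = self.num_classes
        logits = self.head(features).view(-1, k, k)
        # Row-normalise so each row of T(x) is a probability distribution.
        return F.softmax(logits, dim=-1)

def infer_clean_posterior(adv_posterior: torch.Tensor,
                          transition: torch.Tensor) -> torch.Tensor:
    """Given the adversarial-label posterior P(adv label | x) and T(x),
    recover the clean posterior P(true label | x). Since
    adv = T(x)^T @ clean, we solve the batched linear system
    (assumes T(x) is invertible; a pseudo-inverse is a common fallback)."""
    clean = torch.linalg.solve(transition.transpose(1, 2),
                               adv_posterior.unsqueeze(-1)).squeeze(-1)
    # Clip tiny negative values introduced by numerical inversion.
    return clean.clamp_min(0)

Row-normalising T(x) keeps each row a valid conditional distribution over
adversarial labels. In the joint adversarial training described in the
abstract, the target model and the transition network would be optimised
together, so the predicted posteriors and T(x) stay consistent.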
Archived Files and Locations
application/pdf, 741.6 kB (arXiv:2109.09901v1), archived at arxiv.org
(repository) and web.archive.org (webarchive)