Admix: Enhancing the Transferability of Adversarial Attacks
by
Xiaosen Wang, Xuanran He, Jingdong Wang, Kun He
2021
Abstract
Deep neural networks are known to be extremely vulnerable to adversarial
examples in the white-box setting. Moreover, adversarial examples crafted
on a surrogate (source) model often transfer in the black-box setting to
other models trained on the same task but with different architectures.
Recently, various methods have been proposed to boost adversarial
transferability, among which input transformation is one of the most
effective approaches. We investigate this direction and observe that
existing transformations are all applied to a single image, which might limit
adversarial transferability. To this end, we propose a new input
transformation-based attack method called Admix, which considers the input image
and a set of images randomly sampled from other categories. Instead of directly
calculating the gradient on the original input, Admix calculates the gradient
on the input image admixed with a small portion of each add-in image, while
using the original label of the input, to craft more transferable adversarial examples.
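The gradient computation described above can be illustrated with a minimal sketch. It is based only on the abstract and is not the authors' implementation: the function names, the mixing weight eta, the number of add-in images, the averaging of the per-image gradients, and the toy quadratic loss standing in for a surrogate model are all assumptions made for illustration.

```python
import random


def admix_inputs(x, add_ins, eta):
    """Return one admixed copy of x per add-in image: x + eta * x_other."""
    return [[xi + eta * oi for xi, oi in zip(x, other)] for other in add_ins]


def admix_gradient(x, label, pool, grad_fn, num_add_ins=3, eta=0.2, rng=None):
    """Average grad_fn over admixed copies of x, always using x's own label.

    pool is a list of (image, label) pairs; images are flat lists of floats.
    num_add_ins and eta are illustrative defaults, not values from the source.
    """
    rng = rng or random.Random(0)
    # Add-in images come only from categories other than the input's.
    candidates = [img for img, lbl in pool if lbl != label]
    add_ins = rng.sample(candidates, num_add_ins)
    mixed = admix_inputs(x, add_ins, eta)
    # The gradient is taken on each admixed image with the ORIGINAL label.
    grads = [grad_fn(m, label) for m in mixed]
    # Averaging the per-image gradients is an assumption of this sketch.
    return [sum(g[i] for g in grads) / len(grads) for i in range(len(x))]


# Hypothetical stand-in for a surrogate model:
# loss = 0.5 * ||x - center[label]||^2, so d(loss)/dx = x - center[label].
CENTERS = {0: [0.0, 0.0], 1: [10.0, 10.0]}


def toy_grad(x, label):
    return [xi - ci for xi, ci in zip(x, CENTERS[label])]


pool = [([10.0, 10.0], 1), ([20.0, 20.0], 1), ([0.5, 0.5], 0)]
grad = admix_gradient([1.0, 2.0], 0, pool, toy_grad, num_add_ins=2, eta=0.1)
```

Here grad_fn is a placeholder for whatever returns the loss gradient of the surrogate model with respect to its input. The label passed to it never changes, which matches the abstract's point that the original label is kept even though part of each add-in image is mixed in.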
Empirical evaluations on the standard ImageNet dataset demonstrate that Admix
achieves significantly better transferability than existing input transformation
methods in both the single-model and ensemble-model settings. When
combined with existing input transformations, our method further
improves transferability and outperforms the state-of-the-art combination of
input transformations by a clear margin when attacking nine advanced defense
models in the ensemble-model setting.
arXiv:2102.00436v2