Identifying Layers Susceptible to Adversarial Attacks
by
Shoaib Ahmed Siddiqui, Thomas Breuel
2021
Abstract
In this paper, we investigate the use of pretraining with adversarial
networks, with the objective of discovering the relationship between network
depth and robustness. For this purpose, we selectively retrain different
portions of VGG and ResNet architectures on CIFAR-10, Imagenette, and ImageNet
using non-adversarial and adversarial data. Experimental results show that
susceptibility to adversarial samples is associated with the low-level
feature-extraction layers; retraining high-level layers alone is therefore
insufficient for achieving robustness. Furthermore, adversarial attacks yield
outputs from early layers that differ statistically from the features of
non-adversarial samples and do not permit consistent classification by
subsequent layers. These results support common hypotheses that robustness
resides in the feature extractor, that deeper layers alone cannot provide
robustness, and that adversarial and non-adversarial feature vectors differ
substantially.
Archived Files and Locations
application/pdf, 15.0 MB: arxiv.org (repository), web.archive.org (webarchive)
arXiv: 2107.04827v2