EagleEye: Attack-Agnostic Defense against Adversarial Inputs (Technical Report)
by Yujie Ji, Xinyang Zhang, Ting Wang (2018)
Abstract
Deep neural networks (DNNs) are inherently vulnerable to adversarial inputs:
such maliciously crafted samples trigger DNNs to misbehave, leading to
detrimental consequences for DNN-powered systems. The fundamental challenges of
mitigating adversarial inputs stem from their adaptive and variable nature.
Existing solutions attempt to improve DNN resilience against specific attacks;
yet, such static defenses can often be circumvented by adaptively engineered
inputs or by new attack variants.
Here, we present EagleEye, an attack-agnostic adversarial tampering analysis
engine for DNN-powered systems. Our design exploits the minimality
principle underlying many attacks: to maximize the attack's evasiveness, the
adversary often seeks the minimum possible distortion to convert genuine inputs
to adversarial ones. We show that this practice entails distinct
distributional properties of adversarial inputs in the input space. By
leveraging such properties in a principled manner, EagleEye effectively
discriminates adversarial inputs and even uncovers their correct classification
outputs. Through extensive empirical evaluation using a range of benchmark
datasets and DNN models, we validate EagleEye's efficacy. We further
investigate the adversary's possible countermeasures, which reveal a difficult
dilemma for her: to evade EagleEye's detection, excessive distortion is
necessary, which in turn significantly reduces the attack's evasiveness with
respect to other detection mechanisms.
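To make the minimality principle concrete, the sketch below estimates how much random distortion it takes to flip a model's prediction: an adversarial input crafted with minimal distortion sits just across a decision boundary, so its label tends to flip under far smaller perturbations than a genuine input's. This is only an illustration of the underlying intuition, not EagleEye's actual algorithm; the model.predict interface, the noise schedule, and the threshold value are all assumed placeholders.

import numpy as np

def label_change_distortion(model, x, max_eps=0.5, steps=50, trials=8, rng=None):
    """Estimate the smallest random-noise magnitude that changes the
    model's predicted label for input x (features assumed in [0, 1]).
    model.predict is assumed to map a batch of inputs to class labels."""
    rng = rng or np.random.default_rng(0)
    base_label = model.predict(x[None])[0]
    for eps in np.linspace(0.0, max_eps, steps)[1:]:
        for _ in range(trials):
            noise = rng.uniform(-eps, eps, size=x.shape)
            x_pert = np.clip(x + noise, 0.0, 1.0)
            if model.predict(x_pert[None])[0] != base_label:
                return eps  # first noise level that flips the label
    return max_eps  # label remained stable up to max_eps

def flag_adversarial(model, x, threshold=0.05):
    """Flag x as suspicious when its label flips under distortion smaller
    than threshold -- a dataset-dependent, hypothetical cutoff."""
    return label_change_distortion(model, x) < threshold

Under this intuition, a minimally distorted adversarial input typically flips back to its original class first, which hints at how this style of analysis can also uncover the correct classification output mentioned in the abstract.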
Archived Files and Locations
application/pdf, 543.8 kB: arxiv.org (repository); web.archive.org (webarchive)
arXiv: 1808.00123v1