Learning Universal Adversarial Perturbations with Generative Models
by Jamie Hayes, George Danezis (2017)
Abstract
Neural networks are known to be vulnerable to adversarial examples: inputs
that have been intentionally perturbed to remain visually similar to the
source input, but that cause a misclassification. It was recently shown
that, given a dataset and classifier, there exist so-called universal
adversarial perturbations, a single perturbation that causes a
misclassification when applied to any input. In this work, we introduce
universal adversarial networks, a generative network that is capable of
fooling a target classifier when its generated output is added to a clean
sample from a dataset. We show that this technique improves on known
universal adversarial attacks.
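
The abstract describes the core mechanism: a generator is trained so that adding its single output perturbation to any clean input fools a fixed target classifier. Below is a minimal PyTorch sketch of that kind of training loop; the generator architecture, the budget EPS, the CIFAR-10 input shape, and all names here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EPS = 10 / 255  # assumed L-infinity perturbation budget, not from the paper

class Generator(nn.Module):
    # Maps a fixed latent vector to a single image-shaped perturbation
    # (CIFAR-10-sized here, purely for illustration).
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 3 * 32 * 32),
            nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, z):
        # Scale into the L-infinity budget so the perturbation stays small.
        return self.net(z).view(-1, 3, 32, 32) * EPS

def train_step(gen, target_clf, x, y, z, opt):
    # One optimisation step: update the generator so that the (frozen)
    # target classifier misclassifies x + delta for a whole batch x.
    delta = gen(z)                        # one universal perturbation
    x_adv = torch.clamp(x + delta, 0, 1)  # keep pixels in a valid range
    logits = target_clf(x_adv)
    loss = -F.cross_entropy(logits, y)    # ascend the classifier's loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Usage sketch: freeze the classifier, fix z, then loop over the data.
# gen = Generator(); target_clf.requires_grad_(False)
# z = torch.randn(1, 100)
# opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
# for x, y in loader:
#     train_step(gen, target_clf, x, y, z, opt)
```

Because z is held fixed, the generator's output is one perturbation shared across the whole dataset, which is what makes the attack universal rather than per-input.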
Archived Files and Locations
application/pdf, 1.7 MB — arXiv:1708.05207v1 (arxiv.org repository; web.archive.org webarchive)