LOGAN: Membership Inference Attacks Against Generative Models
by
Jamie Hayes, Luca Melis, George Danezis, Emiliano De Cristofaro
2017
Abstract
Generative models estimate the underlying distribution of a dataset to
generate realistic samples according to that distribution. In this paper, we
present the first membership inference attacks against generative models: given
a data point, the adversary determines whether or not it was used to train the
model. Our attacks leverage Generative Adversarial Networks (GANs), which
combine a discriminative and a generative model, to detect overfitting and
recognize inputs that were part of training datasets, using the discriminator's
capacity to learn statistical differences in distributions.
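In the white-box setting, this intuition reduces to ranking candidate records by the discriminator's confidence. The following is a minimal sketch, not the authors' code, assuming a trained `discriminator` callable that maps a record to a real-valued confidence score; `candidates` and `n_members` are hypothetical parameter names:

```python
import numpy as np

def whitebox_membership_attack(discriminator, candidates, n_members):
    """Flag the candidates the discriminator is most confident about
    as suspected members of the training set."""
    # An overfit discriminator assigns systematically higher confidence
    # to records it saw during training than to unseen records drawn
    # from the same distribution.
    scores = np.array([discriminator(x) for x in candidates])
    # Return indices of the n_members highest-scoring candidates.
    return np.argsort(scores)[::-1][:n_members]
```

Predicting the top-scoring points as members captures the attack's core idea: membership leaks through the discriminator's overconfidence on its own training data.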
We present attacks based on both white-box and black-box access to the target
model, against several state-of-the-art generative models, over datasets of
complex representations of faces (LFW), objects (CIFAR-10), and medical images
(Diabetic Retinopathy). We also discuss the sensitivity of the attacks to
different training parameters, and their robustness against mitigation
strategies, finding that defenses are either ineffective or significantly degrade
the generative models' performance in terms of training stability and/or sample
quality.
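In the black-box setting, where the adversary can only draw samples from the target generator, the paper's attack first trains a local "shadow" GAN on those samples and then reuses its discriminator as the scoring function. A sketch under that reading, with `query_generator` and `train_gan` as hypothetical helpers:

```python
import numpy as np

def blackbox_membership_attack(query_generator, train_gan, candidates,
                               n_members, n_queries=10_000):
    """Black-box variant: fit a shadow GAN to samples queried from the
    target model, then rank candidates with its discriminator."""
    # Draw synthetic samples via black-box queries to the target generator.
    synthetic = [query_generator() for _ in range(n_queries)]
    # The shadow GAN's discriminator picks up the statistical biases the
    # target generator inherited from its (overfit) training data.
    shadow_discriminator = train_gan(synthetic)
    scores = np.array([shadow_discriminator(x) for x in candidates])
    # Highest-confidence candidates are flagged as suspected members.
    return np.argsort(scores)[::-1][:n_members]
```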
Archived Files and Locations
application/pdf, 4.9 MB | arxiv.org (repository) | web.archive.org (webarchive)
arXiv: 1705.07663v2