Generating Novel Scene Compositions from Single Images and Videos
release_rkkjps4d6jaybn45hnebuu3k5e
by
Vadim Sushko, Dan Zhang, Juergen Gall, Anna Khoreva
2022
Abstract
Given a large dataset for training, GANs can achieve remarkable performance
for the image synthesis task. However, training GANs in extremely low data
regimes remains a challenge, as overfitting often occurs, leading to
memorization or training divergence. In this work, we introduce SIV-GAN, an
unconditional generative model that can generate new scene compositions from a
single training image or a single video clip. We propose a two-branch
discriminator architecture, with content and layout branches designed to judge
internal content and scene layout realism separately from each other. This
discriminator design enables synthesis of visually plausible, novel
compositions of a scene, with varying content and layout, while preserving the
context of the original sample. Compared to previous single-image GANs, our
model generates images of higher quality and diversity, while not being restricted
to the single-image setting. We show that SIV-GAN successfully addresses the
challenging new task of learning from a single video, for which prior GAN models
fail to achieve synthesis of both high quality and diversity.
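The two-branch discriminator described in the abstract can be pictured as a shared convolutional trunk whose features feed two heads: a content branch that pools away spatial information before judging realism, and a layout branch that keeps the spatial grid and scores each location. Below is a minimal PyTorch sketch of that idea; the layer sizes, module names, and pooling choices are illustrative assumptions, not the paper's actual implementation.

    # Minimal sketch of a two-branch discriminator (illustrative; layer sizes,
    # names, and pooling choices are assumptions, not the paper's implementation).
    import torch
    import torch.nn as nn

    class TwoBranchDiscriminator(nn.Module):
        def __init__(self, in_channels=3, base_channels=64):
            super().__init__()
            # Shared convolutional trunk extracting mid-level features.
            self.trunk = nn.Sequential(
                nn.Conv2d(in_channels, base_channels, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(base_channels, base_channels * 2, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
            )
            # Content branch: global pooling discards spatial layout, so the
            # decision depends only on what appears in the image.
            self.content_branch = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Flatten(),
                nn.Linear(base_channels * 2, 1),
            )
            # Layout branch: keeps the spatial grid and scores each location,
            # so the decision depends on how the scene is arranged.
            self.layout_branch = nn.Sequential(
                nn.Conv2d(base_channels * 2, base_channels * 2, 3, padding=1),
                nn.LeakyReLU(0.2, inplace=True),
                nn.Conv2d(base_channels * 2, 1, 3, padding=1),
            )

        def forward(self, x):
            features = self.trunk(x)
            content_score = self.content_branch(features)   # shape (B, 1)
            layout_score = self.layout_branch(features)     # shape (B, 1, H', W')
            return content_score, layout_score

    # Usage: the two scores would feed separate adversarial losses.
    # d = TwoBranchDiscriminator()
    # content, layout = d(torch.randn(2, 3, 128, 128))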
Archived Files and Locations
application/pdf, 8.4 MB (file_be3l2xsxxbgspnfbxer3i7gcke)
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 2103.13389v3