Semantic Implicit Neural Scene Representations With Semi-Supervised Training
by Amit Kohli, Vincent Sitzmann, Gordon Wetzstein (2021)
Abstract
The recent success of implicit neural scene representations has presented a
viable new way to capture and store 3D scenes. Unlike conventional
3D representations, such as point clouds, which explicitly store scene
properties in discrete, localized units, these implicit representations encode
a scene in the weights of a neural network which can be queried at any
coordinate to produce these same scene properties. Thus far, implicit
representations have primarily been optimized to estimate only the appearance
and/or 3D geometry information in a scene. We take the next step and
demonstrate that an existing implicit representation, scene representation networks (SRNs), is in fact
multi-modal; it can be further leveraged to perform per-point semantic
segmentation while retaining its ability to represent appearance and geometry.
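
To make the coordinate-query idea above concrete, the following is a minimal PyTorch sketch of such a multi-modal coordinate-based network. The class name, layer widths, class count, and head layout are illustrative assumptions, not the exact SRN architecture from the paper.

    import torch
    import torch.nn as nn

    class MultiModalSceneMLP(nn.Module):
        """A coordinate-based network: maps a continuous 3D point to a
        feature vector, from which separate heads decode appearance (RGB)
        and per-point semantic logits. Sizes are illustrative only."""
        def __init__(self, hidden_dim=256, num_classes=6):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Linear(3, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            )
            self.rgb_head = nn.Linear(hidden_dim, 3)            # appearance
            self.sem_head = nn.Linear(hidden_dim, num_classes)  # semantics

        def forward(self, xyz):
            feat = self.backbone(xyz)  # queryable at any continuous coordinate
            return self.rgb_head(feat), self.sem_head(feat)

    # Query the same representation for both modalities at arbitrary points:
    points = torch.rand(1024, 3)
    rgb, sem_logits = MultiModalSceneMLP()(points)

Because both heads read from the same backbone features, appearance and semantics share a single underlying scene representation.
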
To achieve this multi-modal behavior, we utilize a semi-supervised learning
strategy atop the existing pre-trained scene representation. Our method is
simple and general, requiring only a few tens of labeled 2D segmentation
masks to achieve dense 3D semantic segmentation. We explore two novel
applications for this semantically aware implicit neural scene representation:
3D novel view and semantic label synthesis given only a single input RGB image
or 2D label mask, as well as 3D interpolation of appearance and semantics.
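
As a rough sketch of the semi-supervised strategy described in the abstract: freeze the pre-trained representation and fit only a small semantic head on the handful of labeled examples. The stand-in backbone, sizes, and synthetic labels below are hypothetical illustrations, not the authors' training code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    # Stand-in for the pre-trained scene representation; in the paper this
    # role is played by the frozen SRN's features for rendered points.
    backbone = nn.Sequential(nn.Linear(3, 256), nn.ReLU(), nn.Linear(256, 256))
    for p in backbone.parameters():
        p.requires_grad = False  # keep the pre-trained representation fixed

    num_classes = 6
    sem_head = nn.Linear(256, num_classes)  # the only trainable module
    opt = torch.optim.Adam(sem_head.parameters(), lr=1e-3)

    # A few tens of labeled examples, standing in for the pixels of the
    # sparse 2D segmentation masks (synthetic data for illustration only).
    labeled_pts = torch.rand(30, 3)
    labels = torch.randint(0, num_classes, (30,))

    for step in range(200):
        with torch.no_grad():  # no gradients through the frozen backbone
            feat = backbone(labeled_pts)
        loss = F.cross_entropy(sem_head(feat), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
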
Archived Files and Locations
application/pdf 4.7 MB
arxiv.org (repository); web.archive.org (webarchive)
arXiv: 2003.12673v2