Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
by
Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie
2021
Abstract
The rapid progress in 3D scene understanding has come with growing demand for
data; however, collecting and annotating 3D scenes (e.g. point clouds) are
notoriously hard. For example, the number of scenes (e.g. indoor rooms) that
can be accessed and scanned might be limited; even given sufficient data,
acquiring 3D labels (e.g. instance masks) requires intensive human labor. In
this paper, we explore data-efficient learning for 3D point clouds. As a first
step in this direction, we propose Contrastive Scene Contexts, a 3D
pre-training method that makes use of both point-level correspondences and
spatial contexts in a scene. Our method achieves state-of-the-art results on a
suite of benchmarks where training data or labels are scarce. Our study reveals
that exhaustive labelling of 3D point clouds might be unnecessary: remarkably,
on ScanNet, even using 0.1% of point labels, we still achieve 89%
(instance segmentation) and 96% (semantic segmentation) of the baseline
performance that uses full annotations.
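To make the method description above concrete, the sketch below illustrates, in PyTorch, what a point-level contrastive loss partitioned by spatial context could look like. It assumes two augmented views of the same scene with known point correspondences; the function name scene_context_nce, the four angular partitions around the scene centroid, and the temperature value are illustrative assumptions, not the paper's exact design.

import math
import torch
import torch.nn.functional as F

def scene_context_nce(feats_a, feats_b, coords_a, num_angle_bins=4, tau=0.4):
    """InfoNCE over matched point pairs, partitioned by spatial context.

    feats_a, feats_b: (N, D) features of corresponding points in two views
                      (row i of feats_a matches row i of feats_b).
    coords_a:         (N, 3) point coordinates in view A, used for binning.
    """
    feats_a = F.normalize(feats_a, dim=1)
    feats_b = F.normalize(feats_b, dim=1)

    # Assign each pair to an angular partition around the scene centroid
    # (xy-plane only, for simplicity).
    rel = coords_a - coords_a.mean(dim=0, keepdim=True)
    angle = torch.atan2(rel[:, 1], rel[:, 0])            # in [-pi, pi]
    bins = ((angle + math.pi) / (2 * math.pi) * num_angle_bins).long()
    bins = bins.clamp(max=num_angle_bins - 1)

    # Per-partition InfoNCE: the diagonal holds the positives, and all
    # negatives come from the same spatial partition as the anchor.
    losses = []
    for b in range(num_angle_bins):
        idx = (bins == b).nonzero(as_tuple=True)[0]
        if idx.numel() < 2:          # need at least one negative
            continue
        logits = feats_a[idx] @ feats_b[idx].t() / tau   # (n_b, n_b)
        target = torch.arange(idx.numel(), device=logits.device)
        losses.append(F.cross_entropy(logits, target))
    return torch.stack(losses).mean()

# Usage with random stand-in data; a real pipeline would obtain matched
# points from two registered or augmented views of the same scan.
if __name__ == "__main__":
    N, D = 1024, 32
    f_a, f_b = torch.randn(N, D), torch.randn(N, D)
    xyz = torch.rand(N, 3) * 5.0
    print(scene_context_nce(f_a, f_b, xyz).item())

Restricting negatives to the anchor's own spatial partition is the idea this sketch tries to capture: negatives then share a spatial context with the positive pair, rather than being drawn uniformly from the whole scene.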
Archived Files and Locations
application/pdf, 5.2 MB
arxiv.org (repository) | web.archive.org (webarchive)
arXiv: 2012.09165v3