Numerical Coordinate Regression with Convolutional Neural Networks
release_rysa5y745ze3lhwajbdtpl2m3e
by
Aiden Nibali, Zhen He, Stuart Morgan, Luke Prendergast
2018
Abstract
We study deep learning approaches to inferring numerical coordinates for
points of interest in an input image. Existing convolutional neural
network-based solutions to this problem either take a heatmap matching approach
or regress to coordinates with a fully connected output layer. Neither of these
approaches is ideal, since the former is not entirely differentiable, and the
latter lacks inherent spatial generalization. We propose our differentiable
spatial to numerical transform (DSNT) to fill this gap. The DSNT layer adds no
trainable parameters, is fully differentiable, and exhibits good spatial
generalization. Unlike heatmap matching, DSNT works well with low heatmap
resolutions, so it can be dropped in as an output layer for a wide range of
existing fully convolutional architectures. Consequently, DSNT offers a better
trade-off between inference speed and prediction accuracy than existing
techniques. When it replaces the heatmap matching approach used in almost all
state-of-the-art pose estimation methods, DSNT gives better prediction accuracy
for all model architectures tested.
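
The abstract does not spell out the transform itself. As a rough illustration only, the sketch below assumes the usual reading of DSNT: the heatmap is normalized so it sums to 1, and each output coordinate is the expected value of a fixed coordinate grid scaled to (-1, 1). The function names `softmax2d` and `dsnt` are illustrative, softmax is just one possible normalization, and plain Python lists stand in for tensors. Nothing here is taken from the authors' code.

```python
import math


def softmax2d(logits):
    """Normalize a 2D grid of scores into a heatmap that sums to 1.

    Assumption: softmax is used here only as one convenient normalization.
    """
    peak = max(max(row) for row in logits)  # subtract the max for stability
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def dsnt(heatmap):
    """Return the (x, y) expectation of a normalized heatmap.

    Pixel centers are mapped into (-1, 1), so the result does not depend
    on the heatmap resolution. There are no trainable parameters: the
    coordinate grids are fixed, and the output is a weighted sum that is
    differentiable with respect to every heatmap entry.
    """
    m = len(heatmap)      # number of rows (height)
    n = len(heatmap[0])   # number of columns (width)
    x = 0.0
    y = 0.0
    for i, row in enumerate(heatmap):
        for j, p in enumerate(row):
            x += p * (2 * j + 1 - n) / n  # column center in (-1, 1)
            y += p * (2 * i + 1 - m) / m  # row center in (-1, 1)
    return x, y


if __name__ == "__main__":
    scores = [[0.0, 0.0, 0.0],
              [0.0, 5.0, 0.0],
              [0.0, 0.0, 0.0]]
    print(dsnt(softmax2d(scores)))
```

Because the output is an expectation rather than an argmax, a coarse heatmap can still yield sub-pixel coordinates, which is consistent with the abstract's claim that DSNT works well at low heatmap resolutions.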
Archived Files and Locations
application/pdf, 2.7 MB (file_mgz7ixjrhzcwdmopgyaw44a63a)
arxiv.org (repository), web.archive.org (webarchive)
arXiv:1801.07372v1