Numerical Coordinate Regression with Convolutional Neural Networks
release_rysa5y745ze3lhwajbdtpl2m3e

by Aiden Nibali, Zhen He, Stuart Morgan, Luke Prendergast

Released as an article.

2018  

Abstract

We study deep learning approaches to inferring numerical coordinates for points of interest in an input image. Existing convolutional neural network-based solutions to this problem either take a heatmap matching approach or regress to coordinates with a fully connected output layer. Neither of these approaches is ideal, since the former is not entirely differentiable, and the latter lacks inherent spatial generalization. We propose our differentiable spatial to numerical transform (DSNT) to fill this gap. The DSNT layer adds no trainable parameters, is fully differentiable, and exhibits good spatial generalization. Unlike heatmap matching, DSNT works well with low heatmap resolutions, so it can be dropped in as an output layer for a wide range of existing fully convolutional architectures. Consequently, DSNT offers a better trade-off between inference speed and prediction accuracy compared to existing techniques. When used to replace the popular heatmap matching approach used in almost all state-of-the-art methods for pose estimation, DSNT gives better prediction accuracy for all model architectures tested.
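To make the idea concrete, the following is a minimal NumPy sketch of a DSNT-style output step, not the authors' implementation. It assumes a single-channel heatmap, uses a softmax as the normalization (one possible choice), and maps pixel centers to coordinates in (-1, 1); the function names `softmax2d` and `dsnt` are hypothetical. The coordinate is the expectation of the pixel grid under the normalized heatmap, so the step has no trainable parameters and is differentiable with respect to every heatmap value.

```python
import numpy as np


def softmax2d(logits):
    """Normalize a 2D map so it is non-negative and sums to 1 (assumed choice)."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def dsnt(heatmap):
    """Return the expected (x, y) coordinate under a normalized heatmap.

    Pixel centers are mapped to (-1, 1): column j (1-based) of an
    m x n heatmap sits at x = (2j - (n + 1)) / n, and likewise for rows.
    """
    m, n = heatmap.shape
    xs = (2.0 * np.arange(1, n + 1) - (n + 1)) / n  # x coordinate of each column
    ys = (2.0 * np.arange(1, m + 1) - (m + 1)) / m  # y coordinate of each row
    x = float((heatmap.sum(axis=0) * xs).sum())     # marginal over rows, then mean
    y = float((heatmap.sum(axis=1) * ys).sum())     # marginal over columns, then mean
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(8, 8))   # stand-in for a network's raw output
    print(dsnt(softmax2d(logits)))
```

Because the output is a weighted mean rather than an argmax, the prediction can fall between pixel centers, which is consistent with the abstract's claim that the approach tolerates low heatmap resolutions. For example, a 4 x 4 heatmap with all its mass in the top-left pixel yields (-0.75, -0.75) under this coordinate convention.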

Archived Files and Locations

application/pdf  2.7 MB
file_mgz7ixjrhzcwdmopgyaw44a63a
arxiv.org (repository)
web.archive.org (webarchive)
Type: article
Stage: submitted
Date: 2018-01-23
Version: v1
Language: en
arXiv: 1801.07372v1
Catalog Record
Revision: 48da9e80-3d4d-4577-81ce-31e491cd67e6