TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement
release_qwte6qcbqfgqxpoknensy7bmuq
by
Keyang Zhou, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll
2022
Abstract
We present TOCH, a method for refining incorrect 3D hand-object interaction
sequences using a data prior. Existing hand trackers, especially those that
rely on very few cameras, often produce visually unrealistic results with
hand-object intersection or missing contacts. Although correcting such errors
requires reasoning about temporal aspects of interaction, most previous works
focus on static grasps and contacts. The core of our method are TOCH fields, a
novel spatio-temporal representation for modeling correspondences between hands
and objects during interaction. TOCH fields are a point-wise, object-centric
representation, which encode the hand position relative to the object.
Leveraging this novel representation, we learn a latent manifold of plausible
TOCH fields with a temporal denoising auto-encoder. Experiments demonstrate
that TOCH outperforms state-of-the-art 3D hand-object interaction models, which
are limited to static grasps and contacts. More importantly, our method
produces smooth interactions even before and after contact. Using a single
trained TOCH model, we quantitatively and qualitatively demonstrate its
usefulness for correcting erroneous sequences from off-the-shelf RGB/RGB-D
hand-object reconstruction methods and transferring grasps across objects.
In text/plain
format
Archived Files and Locations
application/pdf 10.2 MB
file_huj6em3zfvcblaizr2s4rlniae
|
arxiv.org (repository) web.archive.org (webarchive) |
2205.07982v2
access all versions, variants, and formats of this works (eg, pre-prints)