Cascaded deep monocular 3D human pose estimation with evolutionary training data
by
Shichao Li, Lei Ke, Kevin Pratama, Yu-Wing Tai, Chi-Keung Tang, Kwang-Ting Cheng
2020
Abstract
End-to-end deep representation learning has achieved remarkable accuracy for
monocular 3D human pose estimation, yet these models may fail for unseen poses
with limited and fixed training data. This paper proposes a novel data
augmentation method that: (1) scales to synthesizing massive amounts of
training data (over 8 million valid 3D human poses with corresponding 2D
projections) for training 2D-to-3D networks, and (2) effectively reduces
dataset bias. Our method evolves a limited dataset to synthesize unseen 3D human
skeletons based on a hierarchical human representation and heuristics inspired
by prior knowledge. Extensive experiments show that our approach not only
achieves state-of-the-art accuracy on the largest public benchmark, but also
generalizes significantly better to unseen and rare poses. Relevant files and
tools are available at the project website.
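The evolutionary synthesis the abstract describes can be sketched in miniature. The code below is an illustrative toy, not the paper's implementation: it assumes a hierarchical pose represented as named body-part vectors (the part names, mutation scale, and validity bound are all invented for this sketch), and grows a population by crossing over part subtrees between parent poses and mutating the result, keeping only children that pass a plausibility check.

```python
import random

# Hypothetical hierarchical pose: each part maps to a 3D offset vector.
# Part names and numeric limits are illustrative assumptions, not the
# paper's actual skeleton representation.
PARTS = ["torso", "left_arm", "right_arm", "left_leg", "right_leg"]


def crossover(parent_a, parent_b, rng):
    """Swap one randomly chosen body-part subtree from parent_b into parent_a."""
    child = {p: list(parent_a[p]) for p in PARTS}
    part = rng.choice(PARTS)
    child[part] = list(parent_b[part])
    return child


def mutate(pose, rng, scale=0.05):
    """Perturb each coordinate with small Gaussian noise."""
    return {p: [x + rng.gauss(0.0, scale) for x in v] for p, v in pose.items()}


def is_valid(pose, max_norm=2.0):
    """Stand-in for the paper's prior-knowledge heuristics: reject
    anatomically implausible magnitudes."""
    return all(abs(x) <= max_norm for v in pose.values() for x in v)


def evolve(population, generations, rng):
    """Grow the dataset: repeatedly breed two parents and keep valid children."""
    for _ in range(generations):
        a, b = rng.sample(population, 2)
        child = mutate(crossover(a, b, rng), rng)
        if is_valid(child):
            population.append(child)
    return population
```

Starting from even two seed poses, repeated crossover and mutation yields novel skeletons outside the original dataset, which is the mechanism the abstract credits for reducing dataset bias.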
Archived Files and Locations
application/pdf 8.4 MB
arxiv.org (repository) | web.archive.org (webarchive)
arXiv: 2006.07778v1