Monocular Total Capture: Posing Face, Body, and Hands in the Wild

Xiang, Donglai; Joo, Hanbyul; Sheikh, Yaser

arXiv:1812.01598 (cs)

[Submitted on 4 Dec 2018]

Abstract: We present the first method to capture the 3D total motion of a target person from a monocular view input. Given an image or a monocular video, our method reconstructs the motion of the body, face, and fingers, represented by a 3D deformable mesh model. We use an efficient representation called 3D Part Orientation Fields (POFs) to encode the 3D orientations of all body parts in the common 2D image space. POFs are predicted by a Fully Convolutional Network (FCN), along with the joint confidence maps. To train our network, we collect a new 3D human motion dataset capturing diverse total body motion of 40 subjects in a multiview system. We leverage a 3D deformable human model to reconstruct total body pose from the CNN outputs by exploiting the pose and shape prior in the model. We also present a texture-based tracking method to obtain temporally coherent motion capture output. We perform thorough quantitative evaluations, including comparison with existing body-specific and hand-specific methods and an analysis of performance under changes in camera viewpoint and human pose. Finally, we demonstrate the results of our total body motion capture on various challenging in-the-wild videos. Our code and newly collected human motion dataset will be publicly shared.
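
To make the POF idea concrete, the following is a minimal sketch of how one body part's 3D orientation could be stored in 2D image space. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the fields are rasterized, so the function name `part_orientation_field`, the `radius` parameter, and the convention that joints are given as pixel coordinates plus a depth value are all hypothetical.

```python
import math


def part_orientation_field(joint_a, joint_b, height, width, radius=1.0):
    """Rasterize one body part into a height x width grid of 3D unit vectors.

    joint_a, joint_b: (x, y, z) of the parent and child joint. As an
    assumption of this sketch, (x, y) are pixel coordinates and z is a
    depth value in the same units.
    """
    ax, ay, az = joint_a
    bx, by, bz = joint_b
    dx, dy, dz = bx - ax, by - ay, bz - az
    norm3d = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm3d == 0.0:
        raise ValueError("joints coincide; orientation is undefined")
    # The 3D unit vector pointing from the parent joint to the child joint.
    direction = (dx / norm3d, dy / norm3d, dz / norm3d)

    seg_len_sq = dx * dx + dy * dy  # squared length of the 2D projection
    field = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Project the pixel onto the 2D segment, clamped to its endpoints.
            if seg_len_sq > 0.0:
                t = ((col - ax) * dx + (row - ay) * dy) / seg_len_sq
                t = max(0.0, min(1.0, t))
            else:
                t = 0.0
            px, py = ax + t * dx, ay + t * dy
            # Pixels close to the projected part store its 3D orientation;
            # all other pixels keep the zero vector.
            if math.hypot(col - px, row - py) <= radius:
                field[row][col] = direction
    return field


# Hypothetical limb: parent joint at pixel (1, 2), child at pixel (5, 2),
# with the child 3 units deeper than the parent.
pof = part_orientation_field((1, 2, 0), (5, 2, 3), height=5, width=7)
```

In this sketch every pixel near the projected limb carries the same 3D direction, so a network predicting such maps would output three channels per body part, while the joint confidence maps mentioned in the abstract would locate the endpoints.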

Comments:	17 pages, 16 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:1812.01598 [cs.CV]
	(or arXiv:1812.01598v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1812.01598

Submission history

From: Donglai Xiang
[v1] Tue, 4 Dec 2018 18:55:33 UTC (6,523 KB)
