Staff View: GenFusion: Feed-forward Human Performance Capture via Progressive Canonical Space Updates :: Library Catalog

Bibliographic Details
Main Authors:	Kwon, Youngjoong; He, Yao; Choi, Heejung; Geng, Chen; Liu, Zhengmao; Wu, Jiajun; Adeli, Ehsan
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.28997

_version_	1866912990030200832
author	Kwon, Youngjoong; He, Yao; Choi, Heejung; Geng, Chen; Liu, Zhengmao; Wu, Jiajun; Adeli, Ehsan
author_facet	Kwon, Youngjoong; He, Yao; Choi, Heejung; Geng, Chen; Liu, Zhengmao; Wu, Jiajun; Adeli, Ehsan
contents	We present a feed-forward human performance capture method that renders novel views of a performer from a monocular RGB stream. A key challenge in this setting is the lack of sufficient observations, especially for unseen regions. Assuming the subject moves continuously over time, we take advantage of the fact that more body parts become observable by maintaining a canonical space that is progressively updated with each incoming frame. This canonical space accumulates appearance information over time and serves as a context bank when direct observations are missing in the current live frame. To effectively utilize this context while respecting the deformation of the live state, we formulate the rendering process as probabilistic regression. This resolves conflicts between past and current observations, producing sharper reconstructions than deterministic regression approaches. Furthermore, it enables plausible synthesis even in regions with no prior observations. Experiments on in-domain (4D-Dress) and out-of-distribution (MVHumanNet) datasets demonstrate the effectiveness of our approach.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_28997
institution	arXiv
publishDate	2026
record_format	arxiv
title	GenFusion: Feed-forward Human Performance Capture via Progressive Canonical Space Updates
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2603.28997

Similar Items