MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Choi, Jun Myeong; Yoon, Jae Shin; Qi, Luchao; Sengupta, Roni; Lee, Joon-Young
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.28811

_version_	1866916056398823424
author	Choi, Jun Myeong; Yoon, Jae Shin; Qi, Luchao; Sengupta, Roni; Lee, Joon-Young
author_facet	Choi, Jun Myeong; Yoon, Jae Shin; Qi, Luchao; Sengupta, Roni; Lee, Joon-Young
contents	We present a method for harmonizing the lighting of a foreground video to match a target background scene, adjusting shadows, color tone, and illumination intensity (relightful harmonization). Unlike images, acquiring labeled data for videos, where identical motions are recorded under different lighting conditions, is practically infeasible and non-scalable. While one way to create such paired data is to apply existing image-based harmonization models frame by frame to a video, the resulting outputs often suffer from significant temporal jitters. We overcome this problem by introducing a novel lighting deflickering model that can stabilize the global and local lighting flickering artifacts. Our video diffusion model learns from these upgraded deflickered data with a volume of real and synthetic videos to generate high-quality video harmonization results. We further propose an asymmetric alpha mask conditioning technique to learn the clean boundaries from real videos. Experiments demonstrate that our model achieves strong temporal coherence, naturalness, cleaner boundaries, and physically meaningful lighting behavior, while maintaining strong relighting expressiveness compared to prior image-based and video-based harmonization methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_28811
institution	arXiv
publishDate	2026
record_format	arxiv
title	HarmoVid: Relightful Video Portrait Harmonization
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2605.28811
