Bibliographic Details
Main Authors: Chen, Bolin; Ye, Yan; Chen, Jie; Liao, Ru-Ling; Yin, Shanzhi; Wang, Shiqi; Yang, Kaifa; Li, Yue; Xu, Yiling; Wang, Ye-Kui; Gehlot, Shiv; Su, Guan-Ming; Yin, Peng; McCarthy, Sean; Sullivan, Gary J.
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2410.15105
Abstract:
  • This paper proposes a Generative Face Video Compression (GFVC) approach using Supplemental Enhancement Information (SEI), in which a series of compact spatial and temporal representations of a face video signal (e.g., 2D/3D keypoints, facial semantics, and compact features) can be coded using SEI messages and inserted into the coded video bitstream. At the time of writing, the proposed GFVC approach using SEI messages has been included in a draft amendment of the Versatile Supplemental Enhancement Information (VSEI) standard by the Joint Video Experts Team (JVET) of ISO/IEC JTC 1/SC 29 and ITU-T SG21, which will be standardized as a new version of ITU-T H.274 | ISO/IEC 23002-7. To the best of the authors' knowledge, the JVET work on the proposed SEI-based GFVC approach is the first standardization activity for generative video compression. The proposed SEI approach not only improves on the reconstruction quality of early Model-Based Coding (MBC) by using state-of-the-art generative techniques, but also establishes a new SEI definition for future GFVC applications and deployment. Experimental results show that the proposed SEI-based GFVC approach can achieve remarkable rate-distortion performance compared with the latest Versatile Video Coding (VVC) standard, while also potentially enabling a wide variety of functionalities, including user-specified animation/filtering and metaverse-related applications.