Staff View: Bi-Directional Deep Contextual Video Compression

Bibliographic Details
Main Authors:	Sheng, Xihua; Li, Li; Liu, Dong; Wang, Shiqi
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.08604

_version_	1866916570812383232
author	Sheng, Xihua; Li, Li; Liu, Dong; Wang, Shiqi
author_facet	Sheng, Xihua; Li, Li; Liu, Dong; Wang, Shiqi
contents	Deep video compression has made remarkable progress in recent years, with the majority of advancements concentrated on P-frame coding. Although efforts to enhance B-frame coding are ongoing, the compression performance of deep B-frame codecs is still far behind that of traditional bi-directional video codecs. In this paper, we introduce a bi-directional deep contextual video compression scheme tailored for B-frames, termed DCVC-B, to improve the compression performance of deep B-frame coding. Our scheme has three key innovations. First, we develop a bi-directional motion difference context propagation method for effective motion difference coding, which significantly reduces the bit cost of bi-directional motions. Second, we propose a bi-directional contextual compression model and a corresponding bi-directional temporal entropy model to make better use of the multi-scale temporal contexts. Third, we propose a hierarchical quality structure-based training strategy, leading to effective bit allocation across large groups of pictures (GOPs). Experimental results show that DCVC-B achieves an average BD-Rate reduction of 26.6% compared to the reference software for H.265/HEVC under random access conditions. Remarkably, it surpasses the performance of the H.266/VVC reference software on certain test datasets under the same configuration. We anticipate that our work can provide valuable insights and bring deep B-frame coding to the next level.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_08604
institution	arXiv
publishDate	2024
record_format	arxiv
title	Bi-Directional Deep Contextual Video Compression
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2408.08604
