Bibliographic Details
Main Authors: Sheng, Xihua, Li, Li, Liu, Dong, Wang, Shiqi
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2408.08604
author Sheng, Xihua
Li, Li
Liu, Dong
Wang, Shiqi
contents Deep video compression has made remarkable progress in recent years, with the majority of advancements concentrated on P-frame coding. Although efforts to enhance B-frame coding are ongoing, the compression performance of deep B-frame codecs still lags far behind that of traditional bi-directional video codecs. In this paper, we introduce a bi-directional deep contextual video compression scheme tailored for B-frames, termed DCVC-B, to improve the compression performance of deep B-frame coding. Our scheme has three key innovations. First, we develop a bi-directional motion difference context propagation method for effective motion difference coding, which significantly reduces the bit cost of bi-directional motions. Second, we propose a bi-directional contextual compression model and a corresponding bi-directional temporal entropy model to make better use of the multi-scale temporal contexts. Third, we propose a hierarchical quality structure-based training strategy, leading to effective bit allocation across large groups of pictures (GOPs). Experimental results show that DCVC-B achieves an average BD-Rate reduction of 26.6% compared to the reference software for H.265/HEVC under random access conditions. Remarkably, it surpasses the performance of the H.266/VVC reference software on certain test datasets under the same configuration. We anticipate that our work can provide valuable insights and bring deep B-frame coding to the next level.
format Preprint
id arxiv_https___arxiv_org_abs_2408_08604
institution arXiv
publishDate 2024
record_format arxiv
title Bi-Directional Deep Contextual Video Compression
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2408.08604