Bibliographic Details
Main Authors: Eteke, Cem, Tosun, Batuhan, Piccolrovazzi, Martin, Griessel, Alexander, Kellerer, Wolfgang, Steinbach, Eckehard
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2602.13837
author Eteke, Cem
Tosun, Batuhan
Piccolrovazzi, Martin
Griessel, Alexander
Kellerer, Wolfgang
Steinbach, Eckehard
contents We study video reconstruction from ultra-low-bitrate representations, where the primary challenge shifts from encoding to decoding. In this regime, reconstruction with classical and neural codecs introduces blur, while generative and semantic approaches often struggle to jointly preserve fidelity, temporal consistency, and perceptual quality. To address these limitations, we propose a causal video diffusion model that reconstructs videos from ultra-low-bitrate semantics and highly compressed frames by jointly modeling their complementary information. We further introduce temporal-only distillation from a bidirectional teacher to enable parameter-efficient training and causal few-step inference. Through extensive quantitative, qualitative, and subjective evaluation, we show that the proposed method outperforms classical, neural, generative, and semantic baselines in ultra-low-bitrate video reconstruction.
format Preprint
id arxiv_https___arxiv_org_abs_2602_13837
institution arXiv
publishDate 2026
record_format arxiv
title A Causal Diffusion Model for Video Reconstruction from Ultra-Low-Bitrate Representations
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2602.13837