Bibliographic Details
Main Authors: Körber, Nikolai; Kromer, Eduard; Siebert, Andreas; Hauke, Sascha; Mueller-Gritschneder, Daniel; Schuller, Björn
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Image and Video Processing
Online Access: https://arxiv.org/abs/2503.09368
author Körber, Nikolai
Kromer, Eduard
Siebert, Andreas
Hauke, Sascha
Mueller-Gritschneder, Daniel
Schuller, Björn
contents We introduce PerCoV2, a novel and open ultra-low bit-rate perceptual image compression system designed for bandwidth- and storage-constrained applications. Building upon prior work by Careil et al., PerCoV2 extends the original formulation to the Stable Diffusion 3 ecosystem and enhances entropy coding efficiency by explicitly modeling the discrete hyper-latent image distribution. To this end, we conduct a comprehensive comparison of recent autoregressive methods (VAR and MaskGIT) for entropy modeling and evaluate our approach on the large-scale MSCOCO-30k benchmark. Compared to previous work, PerCoV2 (i) achieves higher image fidelity at even lower bit-rates while maintaining competitive perceptual quality, (ii) features a hybrid generation mode for further bit-rate savings, and (iii) is built solely on public components. Code and trained models will be released at https://github.com/Nikolai10/PerCoV2.
format Preprint
id arxiv_https___arxiv_org_abs_2503_09368
institution arXiv
publishDate 2025
record_format arxiv
title PerCoV2: Improved Ultra-Low Bit-Rate Perceptual Image Compression with Implicit Hierarchical Masked Image Modeling
topic Computer Vision and Pattern Recognition
Image and Video Processing
url https://arxiv.org/abs/2503.09368