Library Catalog

Bibliographic Details
Main Authors:	Ding, Xin; Cao, Shijie; Cao, Ting; Chen, Zhibo
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2501.06218

Similar Items

Q&C: When Quantization Meets Cache in Efficient Image Generation
by: Ding, Xin, et al.
Published: (2025)

Quantized Prompt for Efficient Generalization of Vision-Language Models
by: Hao, Tianxiang, et al.
Published: (2024)

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
by: Sui, Yang, et al.
Published: (2024)

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
by: Ding, Xin, et al.
Published: (2025)

Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers
by: Ding, Rui, et al.
Published: (2024)

Exploring Scalable Unified Modeling for General Low-Level Vision
by: Chen, Xiangyu, et al.
Published: (2025)

Breaking Modality Heterogeneity in Low-Bit Quantization for Large Vision-Language Models
by: Zhong, Yi, et al.
Published: (2026)

Diff-ICMH: Harmonizing Machine and Human Vision in Image Compression with Generative Prior
by: Feng, Ruoyu, et al.
Published: (2025)

MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs
by: Ren, Yulin, et al.
Published: (2024)

Boosting the Generalization and Reasoning of Vision Language Models with Curriculum Reinforcement Learning
by: Deng, Huilin, et al.
Published: (2025)

Learning Visual Grounding from Generative Vision and Language Model
by: Wang, Shijie, et al.
Published: (2024)

6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models
by: Su, Rundong, et al.
Published: (2026)

BitDance: Scaling Autoregressive Generative Models with Binary Tokens
by: Ai, Yuang, et al.
Published: (2026)

Autoregressive Image Generation with Masked Bit Modeling
by: Yu, Qihang, et al.
Published: (2026)

Hyper Adversarial Tuning for Boosting Adversarial Robustness of Pretrained Large Vision Models
by: Lv, Kangtao, et al.
Published: (2024)

A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging
by: Cao, Miao, et al.
Published: (2024)

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models
by: Feng, Weilun, et al.
Published: (2024)

Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields
by: Gao, Yixin, et al.
Published: (2025)

JAQ: Joint Efficient Architecture Design and Low-Bit Quantization with Hardware-Software Co-Exploration
by: Wang, Mingzi, et al.
Published: (2025)

HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
by: Mao, Shizhuo, et al.
Published: (2025)

Training-free Camera Control for Video Generation
by: Hou, Chen, et al.
Published: (2024)

LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing
by: Cao, Yuanming, et al.
Published: (2026)

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
by: He, Yefei, et al.
Published: (2023)

Towards Joint Quantization and Token Pruning of Vision-Language Models
by: Li, Xinqing, et al.
Published: (2026)

Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
by: Wei, Xiyuan, et al.
Published: (2025)

LL-Bench: Rethinking Low-Level Vision Evaluation in the Era of Large-Scale Generative Models
by: Liu, Lu, et al.
Published: (2026)

CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
by: Han, Insu, et al.
Published: (2025)

CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution
by: Liu, Kai, et al.
Published: (2025)

Outlier-Aware Training for Low-Bit Quantization of Structural Re-Parameterized Networks
by: Niu, Muqun, et al.
Published: (2024)

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models
by: Tao, Keda, et al.
Published: (2025)

GRIP-VLM: Group-Relative Importance Pruning for Efficient Vision-Language Models
by: Huang, Mingzhe, et al.
Published: (2026)

MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution
by: Liu, Wenzhuo, et al.
Published: (2024)

PromptCIR: Blind Compressed Image Restoration with Prompt Learning
by: Li, Bingchen, et al.
Published: (2024)

Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision
by: Huang, Xijie, et al.
Published: (2023)

ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization
by: Cao, Hao, et al.
Published: (2026)

QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution
by: Chai, Bowen, et al.
Published: (2025)

CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion
by: Wang, Xingrui, et al.
Published: (2024)

LLM-FP4: 4-Bit Floating-Point Quantized Transformers
by: Liu, Shih-yang, et al.
Published: (2023)

$S^3$: Synonymous Semantic Space for Improving Zero-Shot Generalization of Vision-Language Models
by: Yin, Xiaojie, et al.
Published: (2024)

BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation
by: Wang, Hongyu, et al.
Published: (2025)