:: Library Catalog

Bibliographic Details
Main Authors:	Chang, Ting-Yun; Zhang, Muru; Thomason, Jesse; Jia, Robin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.12044

Similar Items

LLM Unlearning Without an Expert Curated Dataset
by: Zhu, Xiaoyuan, et al.
Published: (2025)

Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
by: Zhang, Xi, et al.
Published: (2025)

InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
by: Li, Ke, et al.
Published: (2026)

SHRED: Retain-Set-Free Unlearning via Self-Distillation with Logit Demotion
by: Hu, Zizhao, et al.
Published: (2026)

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)

Multi-modal Synthetic Data Training and Model Collapse: Insights from VLMs and Diffusion Models
by: Hu, Zizhao, et al.
Published: (2025)

Breaking the Blocks: Continuous Low-Rank Decomposed Scaling for Unified LLM Quantization and Adaptation
by: Tang, Pingzhi, et al.
Published: (2026)

M3PT: A Transformer for Multimodal, Multi-Party Social Signal Prediction with Person-aware Blockwise Attention
by: Tang, Yiming, et al.
Published: (2025)

When Bits Break Recourse: Counterfactual-Faithful Quantization
by: Yahyati, Chaymae, et al.
Published: (2026)

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)

Q-Palette: Fractional-Bit Quantizers Toward Optimal Bit Allocation for Efficient LLM Deployment
by: Lee, Deokjae, et al.
Published: (2025)

BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization
by: Zhao, Jiayu, et al.
Published: (2026)

SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)

Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision
by: Huang, Xijie, et al.
Published: (2023)

ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs
by: Becking, Daniel, et al.
Published: (2021)

HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
by: Zhang, Jinhao Zhang Yunquan, et al.
Published: (2026)

I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
by: Hu, Xing, et al.
Published: (2024)

UltraSketchLLM: Saliency-Driven Sketching for Ultra-Low Bit LLM Compression
by: Zou, Sunan, et al.
Published: (2025)

SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
by: Song, Jaewoo, et al.
Published: (2025)

Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)

SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
by: Zhao, Zhixiong, et al.
Published: (2025)

Attn-QAT: 4-Bit Attention With Quantization-Aware Training
by: Zhang, Peiyuan, et al.
Published: (2026)

Exploiting LLM Quantization
by: Egashira, Kazuki, et al.
Published: (2024)

PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)

NSNQuant: A Double Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache
by: Son, Donghyun, et al.
Published: (2025)

HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)

Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
by: Pavlov, Gorgi
Published: (2026)

RDKV: Rate-Distortion Bit Allocation for Joint Eviction and Quantization of the KV Cache
by: Zhang, Junkai, et al.
Published: (2026)

AdaQAT: Adaptive Bit-Width Quantization-Aware Training
by: Gernigon, Cédric, et al.
Published: (2024)

MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)

Widening the Gap: Exploiting LLM Quantization via Outlier Injection
by: Zhan, Xiaohua, et al.
Published: (2026)

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
by: Lv, Keyu, et al.
Published: (2026)

Why Does RLAIF Work At All?
by: Young, Robin
Published: (2026)

AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations
by: Tao, Qian, et al.
Published: (2024)

HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs
by: Chen, Ningning, et al.
Published: (2025)

PEEK: Guiding and Minimal Image Representations for Zero-Shot Generalization of Robot Manipulation Policies
by: Zhang, Jesse, et al.
Published: (2025)

Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection
by: Anwar, Abrar, et al.
Published: (2025)

THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation
by: Pumacay, Wilbert, et al.
Published: (2024)

Verification of Bit-Flip Attacks against Quantized Neural Networks
by: Zhang, Yedi, et al.
Published: (2025)

MARR: Module-Adaptive Residual Reconstruction for Low-Bit Post-Training Quantization
by: Su, Le, et al.
Published: (2026)