| Main Authors: | Li, Zhikai, Li, Jiatong, Liu, Xuewen, Zhao, Wangbo, Du, Pan, Zhou, Kaicheng, Gu, Qingyi, You, Yang, Dong, Zhen, Keutzer, Kurt |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.09411 |
Similar Items
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences
by: Li, Zhikai, et al.
Published: (2024)
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
by: Li, Zhikai, et al.
Published: (2023)
Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval
by: Zhang, Jing, et al.
Published: (2026)
Arena as Offline Reward: Efficient Fine-Grained Preference Optimization for Diffusion Models
by: Li, Zhikai, et al.
Published: (2026)
CacheQuant: Comprehensively Accelerated Diffusion Models
by: Liu, Xuewen, et al.
Published: (2025)
OSAQ: Outlier Self-Absorption for Accurate Low-bit LLM Quantization
by: Li, Zhikai, et al.
Published: (2026)
PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models
by: Liu, Xuewen, et al.
Published: (2026)
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
by: Li, Zhikai, et al.
Published: (2024)
Rectified SpaAttn: Revisiting Attention Sparsity for Efficient Video Generation
by: Liu, Xuewen, et al.
Published: (2025)
DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation
by: Liu, Xuewen, et al.
Published: (2024)
SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
by: Zhang, Jing, et al.
Published: (2025)
Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare
by: Li, Zhikai, et al.
Published: (2024)
EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models
by: Liu, Xuewen, et al.
Published: (2024)
Sparsity Induction for Accurate Post-Training Pruning of Large Language Models
by: Jiang, Minhao, et al.
Published: (2026)
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
by: Zhao, Wangbo, et al.
Published: (2024)
A Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation
by: Zhao, Yi, et al.
Published: (2026)
LLM Inference Unveiled: Survey and Roofline Model Insights
by: Yuan, Zhihang, et al.
Published: (2024)
SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference
by: Zhang, Yuan, et al.
Published: (2024)
MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction
by: Yang, Lianwei, et al.
Published: (2024)
TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation
by: Xiao, Junrui, et al.
Published: (2024)
Unsupervised Learning for Class Distribution Mismatch
by: Du, Pan, et al.
Published: (2025)
EPIM: Efficient Processing-In-Memory Accelerators based on Epitome
by: Wang, Chenyu, et al.
Published: (2023)
MITRA: A Large-Scale Parallel Corpus and Multilingual Pretrained Language Model for Machine Translation and Semantic Retrieval for Pāli, Sanskrit, Buddhist Chinese, and Tibetan
by: Nehrdich, Sebastian, et al.
Published: (2026)
IAG: Input-aware Backdoor Attack on VLM-based Visual Grounding
by: Li, Junxian, et al.
Published: (2025)
Stochastic Communication Avoidance for Recommendation Systems
by: Erdogan, Lutfi Eren, et al.
Published: (2024)
MCQA-Eval: Efficient Confidence Evaluation in NLG with Gold-Standard Correctness Labels
by: Liu, Xiaoou, et al.
Published: (2025)
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration
by: Park, ChaeHun, et al.
Published: (2024)
DQRM: Deep Quantized Recommendation Models
by: Zhou, Yang, et al.
Published: (2024)
Ideal Sorting: An Extension of Preference Sorting
by: Millena Ayres Silva, et al.
Published: (2026)
UniDrive: Towards Universal Driving Perception Across Camera Configurations
by: Li, Ye, et al.
Published: (2024)
Visual observation of optical Floquet-Bloch oscillations
by: Zhang, Zhen, et al.
Published: (2022)
One Model is All You Need: ByT5-Sanskrit, a Unified Model for Sanskrit NLP Tasks
by: Nehrdich, Sebastian, et al.
Published: (2024)
Simple and Effective Input Reformulations for Translation
by: Yu, Brian, et al.
Published: (2023)
CausalEval: Towards Better Causal Reasoning in Language Models
by: Yu, Longxuan, et al.
Published: (2024)
MultEval: Supporting Collaborative Alignment for LLM-as-a-Judge Evaluation Criteria
by: Chiang, Charles, et al.
Published: (2026)
Flash-KMeans: Fast and Memory-Efficient Exact K-Means
by: Yang, Shuo, et al.
Published: (2026)
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content
by: Zhang, Zicheng, et al.
Published: (2025)
JudgeFlow: Agentic Workflow Optimization via Block Judge
by: Ma, Zihan, et al.
Published: (2026)
SqueezeLLM: Dense-and-Sparse Quantization
by: Kim, Sehoon, et al.
Published: (2023)
CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
by: Li, Haitao, et al.
Published: (2024)