| Main Authors: | Wang, Yanshu, Yang, Tong, Liang, Xiyan, Wang, Guoan, Lu, Hanning, Zhe, Xu, Li, Yaoming, Weitao, Li |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.11650 |
Similar Items
Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information
by: Wang, Yanshu, et al.
Published: (2024)
HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
by: Hu, Xing, et al.
Published: (2025)
ReFocus: Reinforcing Mid-Frequency and Key-Frequency Modeling for Multivariate Time Series Forecasting
by: Yu, Guoqi, et al.
Published: (2025)
Binary Autoencoder for Mechanistic Interpretability of Large Language Models
by: Cho, Hakaze, et al.
Published: (2025)
Theory-optimal Quantization Based on Flatness
by: Huang, Xiusheng, et al.
Published: (2026)
DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models
by: Ke, Wenjin, et al.
Published: (2025)
Frequency-Forcing: From Scaling-as-Time to Soft Frequency Guidance
by: Du, Weitao
Published: (2026)
WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control
by: Aghabozorgi, Mehran, et al.
Published: (2026)
IB-GRPO: Aligning LLM-based Learning Path Recommendation with Educational Objectives via Indicator-Based Group Relative Policy Optimization
by: Wang, Shuai, et al.
Published: (2026)
Towards Better Generalization via Distributional Input Projection Network
by: Hao, Yifan, et al.
Published: (2025)
Latent Chain-of-Thought? Decoding the Depth-Recurrent Transformer
by: Lu, Wenquan, et al.
Published: (2025)
Advancing Multimodal In-Context Learning in Large Vision-Language Models with Task-aware Demonstrations
by: Li, Yanshu
Published: (2025)
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in $\{\pm 1, \pm i\}$
by: Wang, Feiyu, et al.
Published: (2025)
PTQTP: Post-Training Quantization to Trit-Planes for Large Language Models
by: Xiao, He, et al.
Published: (2025)
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery
by: Song, Xiaoshuai, et al.
Published: (2024)
A Comprehensive Study on Quantization Techniques for Large Language Models
by: Lang, Jiedong, et al.
Published: (2024)
BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models
by: He, Liulu, et al.
Published: (2025)
Post Training Quantization of Large Language Models with Microscaling Formats
by: Sharify, Sayeh, et al.
Published: (2024)
Scaling and Transferability of Annealing Strategies in Large Language Model Training
by: Wang, Siqi, et al.
Published: (2025)
A Comprehensive Data-centric Overview of Federated Graph Learning
by: Wu, Zhengyu, et al.
Published: (2025)
Integer Scale: A Free Lunch for Faster Fine-grained Quantization of LLMs
by: Li, Qingyuan, et al.
Published: (2024)
Task-Stratified Knowledge Scaling Laws for Post-Training Quantized Large Language Models
by: Zhou, Chenxi, et al.
Published: (2025)
PAPR in Motion: Seamless Point-level 3D Scene Interpolation
by: Peng, Shichong, et al.
Published: (2024)
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
by: Lee, Jung Hyun, et al.
Published: (2024)
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
by: Zhao, Weibo, et al.
Published: (2024)
An Overview of Large Language Models for Statisticians
by: Ji, Wenlong, et al.
Published: (2025)
Unlock the Potential of Large Language Models for Predictive Tabular Tasks in Data Science with Table-Specific Pretraining
by: Yang, Yazheng, et al.
Published: (2024)
On the Compressibility of Quantized Large Language Models
by: Mao, Yu, et al.
Published: (2024)
Cognitive Edge Computing: A Comprehensive Survey on Optimizing Large Models and AI Agents for Pervasive Deployment
by: Wang, Xubin, et al.
Published: (2025)
Scaling over Scaling: Exploring Test-Time Scaling Plateau in Large Reasoning Models
by: Wang, Jian, et al.
Published: (2025)
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
by: Li, Chenyue, et al.
Published: (2025)
TwinTac: A Wide-Range, Highly Sensitive Tactile Sensor with Real-to-Sim Digital Twin Sensor Model
by: Huang, Xiyan, et al.
Published: (2025)
Continual Learning of Large Language Models: A Comprehensive Survey
by: Shi, Haizhou, et al.
Published: (2024)
Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models
by: Chen, Kejia, et al.
Published: (2025)
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
by: Zhao, Xinyu, et al.
Published: (2024)
RAICL: Retrieval-Augmented In-Context Learning for Vision-Language-Model Based EEG Seizure Detection
by: Li, Siyang, et al.
Published: (2026)
Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
by: Liu, Wenyuan, et al.
Published: (2024)
QET: Enhancing Quantized LLM Parameters and KV cache Compression through Element Substitution and Residual Clustering
by: Wang, Yanshu, et al.
Published: (2024)