| Main Authors: | Tian, Zhen, Zhao, Wayne Xin, Wen, Ji-Rong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.12896 |
Similar Items
EulerFormer: Sequential User Behavior Modeling with Complex Vector Attention
by: Tian, Zhen, et al.
Published: (2024)
Low-rank Optimization Trajectories Modeling for LLM RLVR Acceleration
by: Chen, Zhipeng, et al.
Published: (2026)
Exploring Context Window of Large Language Models via Decomposed Positional Vectors
by: Dong, Zican, et al.
Published: (2024)
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
by: Wang, Xiaolei, et al.
Published: (2024)
OSAQ: Outlier Self-Absorption for Accurate Low-bit LLM Quantization
by: Li, Zhikai, et al.
Published: (2026)
LoRaQ: Optimized Low Rank Approximation for 4-bit Quantization
by: Bouquet, Yann, et al.
Published: (2026)
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds
by: Xu, Zhenghao, et al.
Published: (2023)
Atom: Low-bit Quantization for Efficient and Accurate LLM Serving
by: Zhao, Yilong, et al.
Published: (2023)
SALE : Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling
by: Ji, Xiaodong, et al.
Published: (2025)
Degree of Irrationality: Sentiment and Implied Volatility Surface
by: Weng, Jiahao, et al.
Published: (2024)
CCQ: Convolutional Code for Extreme Low-bit Quantization in LLMs
by: Zhou, Zhaojing, et al.
Published: (2025)
SageBwd: A Trainable Low-bit Attention
by: Zhang, Jintao, et al.
Published: (2026)
DiRotQ: Rotation-Aware Quantization for 4-bit Diffusion Transformers
by: Sharify, Sayeh, et al.
Published: (2026)
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression
by: Liu, Peiyu, et al.
Published: (2024)
ICQuant: Index Coding enables Low-bit LLM Quantization
by: Li, Xinlin, et al.
Published: (2025)
Towards Low-bit Communication for Tensor Parallel LLM Inference
by: Dong, Harry, et al.
Published: (2024)
Towards Effective Code-Integrated Reasoning
by: Bai, Fei, et al.
Published: (2025)
ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
by: Liu, Zechun, et al.
Published: (2025)
MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging
by: Wang, Jiapeng, et al.
Published: (2026)
Stabilizing Backpropagation in 16-bit Neural Training with Modified Adam Optimizer
by: Yun, Juyoung
Published: (2023)
A Black Swan Hypothesis: The Role of Human Irrationality in AI Safety
by: Lee, Hyunin, et al.
Published: (2024)
More Than Irrational: Modeling Belief-Biased Agents
by: Zhu, Yifan, et al.
Published: (2025)
Learning Universal Multi-level Market Irrationality Factors to Improve Stock Return Forecasting
by: Yang, Chen, et al.
Published: (2025)
Low-bit Model Quantization for Deep Neural Networks: A Survey
by: Liu, Kai, et al.
Published: (2025)
Regulatory DNA sequence Design with Reinforcement Learning
by: Yang, Zhao, et al.
Published: (2025)
Interpretable Enzyme Function Prediction via Residue-Level Detection
by: Yang, Zhao, et al.
Published: (2025)
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
by: Wang, Sheng, et al.
Published: (2024)
On the Low-Complexity of Fair Learning for Combinatorial Multi-Armed Bandit
by: Wu, Xiaoyi, et al.
Published: (2025)
Rethinking Channel Dimensions to Isolate Outliers for Low-bit Weight Quantization of Large Language Models
by: Heo, Jung Hwan, et al.
Published: (2023)
TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs
by: Sun, Yan, et al.
Published: (2025)
Sequential 1-bit Mean Estimation with Near-Optimal Sample Complexity
by: Lau, Ivan, et al.
Published: (2025)
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization
by: Zhou, Zhongzhu, et al.
Published: (2026)
2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
by: Liu, Kai, et al.
Published: (2024)
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark]
by: Jiang, Jiawei, et al.
Published: (2023)
PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for Traffic Flow Prediction
by: Jiang, Jiawei, et al.
Published: (2023)
ChainLM: Empowering Large Language Models with Improved Chain-of-Thought Prompting
by: Cheng, Xiaoxue, et al.
Published: (2024)
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
by: Yan, Xianglong, et al.
Published: (2026)
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
by: Dong, Guanting, et al.
Published: (2025)
Beyond Discreteness: Sample Complexity Analysis of Straight-Through Estimator for 1-bit Quantization
by: Jeong, Halyun, et al.
Published: (2025)
Memory-Efficient 4-bit Preconditioned Stochastic Optimization
by: Li, Jingyang, et al.
Published: (2024)