| Main Authors: | Rasquinha, Mitchelle, Tabak, Gil |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.02897 |
Similar Items
MoR: Mixture Of Representations For Mixed-Precision Training
by: Su, Bor-Yiing, et al.
Published: (2025)
Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training
by: Varshney, Ayush K., et al.
Published: (2026)
From Noise to Precision: A Diffusion-Driven Approach to Zero-Inflated Precipitation Prediction
by: Gao, Wentao, et al.
Published: (2025)
Training Time Prediction for Mixed Precision-based Distributed Training
by: Kang, Minchul, et al.
Published: (2026)
AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks
by: Gimenes, Pedro, et al.
Published: (2025)
Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation
by: Yuan, Jinsheng, et al.
Published: (2024)
STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization
by: Federici, Marco, et al.
Published: (2025)
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models
by: Guan, Ziyi, et al.
Published: (2024)
Identifying Backdoored Graphs in Graph Neural Network Training: An Explanation-Based Approach with Novel Metrics
by: Downer, Jane, et al.
Published: (2024)
MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
by: Liu, Wenyuan, et al.
Published: (2025)
Mixed-Precision Quantization for Language Models: Techniques and Prospects
by: Rakka, Mariam, et al.
Published: (2025)
MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning
by: Zhang, Tao, et al.
Published: (2025)
Random Cloud: Finding Minimal Neural Architectures Without Training
by: Blázquez, Javier Gil
Published: (2026)
APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs
by: Bouzouad, Meriem, et al.
Published: (2026)
A Riemannian Approach to Ground Metric Learning for Optimal Transport
by: Jawanpuria, Pratik, et al.
Published: (2024)
PAHQ: Accelerating Automated Circuit Discovery through Mixed-Precision Inference Optimization
by: Wang, Xinhai, et al.
Published: (2025)
RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
by: Gautam, Arpit Singh, et al.
Published: (2026)
Balancing Fidelity and Plasticity: Aligning Mixed-Precision Fine-Tuning with Linguistic Hierarchies
by: Zhou, Changhai, et al.
Published: (2025)
ScaleBITS: Scalable Bitwidth Search for Hardware-Aligned Mixed-Precision LLMs
by: Li, Xinlin, et al.
Published: (2026)
GAMMA: Global Bit Allocation for Mixed-Precision Models under Arbitrary Budgets
by: Yao, Zhangyang, et al.
Published: (2026)
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
by: Ding, Yifu, et al.
Published: (2026)
Predict Training Data Quality via Its Geometry in Metric Space
by: Ba, Yang, et al.
Published: (2025)
A KL Lens on Quantization: Fast, Forward-Only Sensitivity for Mixed-Precision SSM-Transformer Models
by: Kong, Jason, et al.
Published: (2026)
Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities
by: Hao, Zhiwei, et al.
Published: (2025)
Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention
by: Qiu, Haiquan, et al.
Published: (2025)
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
by: Zhang, Shu-Hao, et al.
Published: (2026)
MUXQ: Mixed-to-Uniform Precision MatriX Quantization via Low-Rank Outlier Decomposition
by: Lee, Seoungsub, et al.
Published: (2026)
MP-ISMoE: Mixed-Precision Interactive Side Mixture-of-Experts for Efficient Transfer Learning
by: Zhang, Yutong, et al.
Published: (2026)
On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks
by: Huang, Wei, et al.
Published: (2023)
Rank-Aware Spectral Bounds on Attention Logits for Stable Low-Precision Training
by: Emadi, Seyed Morteza
Published: (2026)
No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
by: Yang, June Yong, et al.
Published: (2024)
CONF-KV: Confidence-Aware KV Cache Eviction with Mixed-Precision Storage for Long-Horizon LLM
by: Li, Yubo, et al.
Published: (2026)
VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting
by: Kang, Junhyeok, et al.
Published: (2025)
Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG
by: Han, Seungju, et al.
Published: (2026)
MixGCN: Scalable GCN Training by Mixture of Parallelism and Mixture of Accelerators
by: Wan, Cheng, et al.
Published: (2025)
To FP8 and Back Again: Quantifying Reduced Precision Effects on LLM Training Stability
by: Lee, Joonhyung, et al.
Published: (2024)
Beyond Precision: Training-Inference Mismatch is an Optimization Problem and Simple LR Scheduling Fixes It
by: Zhang, Yaxiang, et al.
Published: (2026)
A Metric-based Principal Curve Approach for Learning One-dimensional Manifold
by: Cuicizion, Eliuvish
Published: (2024)
MDGMIX: Boundary-Aware Subgraph Mixing for Multi-Domain Graph Pre-Training
by: Zheng, Ziyu, et al.
Published: (2026)
GAC: Noise-Aware Adaptive Mixing for Hybrid SFT-RL Post-Training
by: Hu, Yuelin, et al.
Published: (2026)