| Main Authors: | Elhoushi, Mostafa; Johnson, Jeff |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.04610 |
Similar Items
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
by: Gong, Linyuan, et al.
Published: (2024)
SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
by: Xia, Junhao, et al.
Published: (2025)
PETAH: Parameter Efficient Task Adaptation for Hybrid Transformers in a resource-limited Context
by: Augustin, Maximilian, et al.
Published: (2024)
4bit-Quantization in Vector-Embedding for RAG
by: Jeong, Taehee
Published: (2025)
Matmul or No Matmul in the Era of 1-bit LLMs
by: Malekar, Jinendra, et al.
Published: (2024)
4-bit Shampoo for Memory-Efficient Network Training
by: Wang, Sike, et al.
Published: (2024)
Towards Low-bit Communication for Tensor Parallel LLM Inference
by: Dong, Harry, et al.
Published: (2024)
Toward Temporal Causal Representation Learning with Tensor Decomposition
by: Chen, Jianhong, et al.
Published: (2025)
Guiding Giants: Lightweight Controllers for Weighted Activation Steering in LLMs
by: Hegazy, Amr, et al.
Published: (2025)
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
by: Yan, Xianglong, et al.
Published: (2026)
Representation Learning on Hyper-Relational and Numeric Knowledge Graphs with Transformers
by: Chung, Chanyoung, et al.
Published: (2023)
LANPO: Bootstrapping Language and Numerical Feedback for Reinforcement Learning in LLMs
by: Li, Ang, et al.
Published: (2025)
Forward Only Learning for Orthogonal Neural Networks of any Depth
by: Caillon, Paul, et al.
Published: (2025)
Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
by: Kang, Feiyang, et al.
Published: (2025)
Prompted Policy Search: Reinforcement Learning through Linguistic and Numerical Reasoning in LLMs
by: Zhou, Yifan, et al.
Published: (2025)
Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains
by: Rinberg, Roy, et al.
Published: (2026)
Scalable Numerical Embeddings for Multivariate Time Series: Enhancing Healthcare Data Representation Learning
by: Huang, Chun-Kai, et al.
Published: (2024)
Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
by: Lee, Jung Hyun, et al.
Published: (2025)
Towards Effective Theory of LLMs: A Representation Learning Approach
by: Ustaomeroglu, Muhammed, et al.
Published: (2026)
Eliciting Numerical Predictive Distributions of LLMs Without Autoregression
by: Piskorz, Julianna, et al.
Published: (2026)
Exploiting Latent Linearity in LLMs Improves Explainable Molecular Representation Learning
by: Li, Zhuoran, et al.
Published: (2024)
LLMs for Supply Chain Management
by: Wang, Haojie, et al.
Published: (2025)
Gym-Anything: Turn any Software into an Agent Environment
by: Aggarwal, Pranjal, et al.
Published: (2026)
SageBwd: A Trainable Low-bit Attention
by: Zhang, Jintao, et al.
Published: (2026)
LLMs can see and hear without any training
by: Ashutosh, Kumar, et al.
Published: (2025)
When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models
by: Zhang, Michael S., et al.
Published: (2025)
Schema-Adaptive Tabular Representation Learning with LLMs for Generalizable Multimodal Clinical Reasoning
by: Mao, Hongxi, et al.
Published: (2026)
Learning with Expert Abstractions for Efficient Multi-Task Continuous Control
by: Jewett, Jeff, et al.
Published: (2025)
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?
by: Nielsen, Jacob, et al.
Published: (2025)
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
by: Hu, Shengran, et al.
Published: (2023)
Reinforcing Numerical Reasoning in LLMs for Tabular Prediction via Structural Priors
by: Cai, Pengxiang, et al.
Published: (2025)
ICQuant: Index Coding enables Low-bit LLM Quantization
by: Li, Xinlin, et al.
Published: (2025)
DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models
by: Sankarapu, Vinay Kumar, et al.
Published: (2024)
First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs
by: Norman, Ben, et al.
Published: (2023)
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
by: Elhoushi, Mostafa, et al.
Published: (2024)
Synthetic Data for any Differentiable Target
by: Thrush, Tristan, et al.
Published: (2026)
A Fragile Number Sense: Probing the Elemental Limits of Numerical Reasoning in LLMs
by: Rahman, Roussel, et al.
Published: (2025)
Low-bit Model Quantization for Deep Neural Networks: A Survey
by: Liu, Kai, et al.
Published: (2025)
HyperMono: A Monotonicity-aware Approach to Hyper-Relational Knowledge Representation
by: Hu, Zhiwei, et al.
Published: (2024)
FP4 All the Way: Fully Quantized Training of LLMs
by: Chmiel, Brian, et al.
Published: (2025)