| Main Authors: | Singh, Manpreet, Sajjad, Hassan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.16785 |
Similar Items
Quantifying the Capabilities of LLMs across Scale and Precision
by: Badshah, Sher, et al.
Published: (2024)
Cross-Layer Discrete Concept Discovery for Interpreting Language Models
by: Garg, Ankur, et al.
Published: (2025)
LangFIR: Discovering Sparse Language-Specific Features from Monolingual Data for Language Steering
by: Wong, Sing Hieng, et al.
Published: (2026)
Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
by: Paglieri, Davide, et al.
Published: (2024)
Low-Rank Quantization-Aware Training for LLMs
by: Bondarenko, Yelysei, et al.
Published: (2024)
Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
by: Xiao, Hanqi, et al.
Published: (2025)
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
by: Butler, Landon, et al.
Published: (2025)
Continuous Approximations for Improving Quantization Aware Training of LLMs
by: Li, He, et al.
Published: (2024)
Neurons Speak in Ranges: Breaking Free from Discrete Neuronal Attribution
by: Haider, Muhammad Umair, et al.
Published: (2025)
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
by: Liu, Yijun, et al.
Published: (2024)
Crafting Interpretable Embeddings by Asking LLMs Questions
by: Benara, Vinamra, et al.
Published: (2024)
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
by: Sung, Yi-Lin, et al.
Published: (2025)
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
by: Cheng, Wenhua, et al.
Published: (2023)
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)
SmoothRot: Combining Channel-Wise Scaling and Rotation for Quantization-Friendly LLMs
by: Czakó, Patrik, et al.
Published: (2025)
Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization
by: Xiong, Boya, et al.
Published: (2025)
MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)
Certifying Knowledge Comprehension in LLMs
by: Chaudhary, Isha, et al.
Published: (2024)
GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
by: Zhou, Sifan, et al.
Published: (2025)
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
by: Huang, Xijie, et al.
Published: (2024)
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
by: Lv, Keyu, et al.
Published: (2026)
Intrinsic Self-Correction in LLMs: Towards Explainable Prompting via Mechanistic Interpretability
by: Lee, Yu-Ting, et al.
Published: (2025)
Rethinking Interpretability in the Era of Large Language Models
by: Singh, Chandan, et al.
Published: (2024)
Do LLMs Encode Functional Importance of Reasoning Tokens?
by: Singh, Janvijay, et al.
Published: (2026)
Automated Unity Game Template Generation from GDDs via NLP and Multi-Modal LLMs
by: Hassan, Amna
Published: (2025)
Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
by: Jan, Essa, et al.
Published: (2025)
LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)
RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations
by: Su, Zunhai, et al.
Published: (2025)
Nudging: Inference-time Alignment of LLMs via Guided Decoding
by: Fei, Yu, et al.
Published: (2024)
Interpretable Next-token Prediction via the Generalized Induction Head
by: Kim, Eunji, et al.
Published: (2024)
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
by: Singh, Joykirat, et al.
Published: (2024)
Draft-Conditioned Constrained Decoding for Structured Generation in LLMs
by: Reddy, Avinash, et al.
Published: (2026)
Minimal and Mechanistic Conditions for Behavioral Self-Awareness in LLMs
by: Bozoukov, Matthew, et al.
Published: (2025)
A Unified Framework with Novel Metrics for Evaluating the Effectiveness of XAI Techniques in LLMs
by: Mersha, Melkamu Abay, et al.
Published: (2025)
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation
by: Manvi, Rohin, et al.
Published: (2024)
From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?
by: Mueller, Aaron, et al.
Published: (2025)
Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency
by: Husom, Erik Johannes, et al.
Published: (2025)
How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
by: Jaipersaud, Brandon, et al.
Published: (2025)
Turning LLM Activations Quantization-Friendly
by: Czakó, Patrik, et al.
Published: (2025)