:: Library Catalog

Bibliographic Details
Main Authors:	Singh, Manpreet; Sajjad, Hassan
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2508.16785

Similar Items

Quantifying the Capabilities of LLMs across Scale and Precision
by: Badshah, Sher, et al.
Published: (2024)

Cross-Layer Discrete Concept Discovery for Interpreting Language Models
by: Garg, Ankur, et al.
Published: (2025)

LangFIR: Discovering Sparse Language-Specific Features from Monolingual Data for Language Steering
by: Wong, Sing Hieng, et al.
Published: (2026)

Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
by: Paglieri, Davide, et al.
Published: (2024)

Low-Rank Quantization-Aware Training for LLMs
by: Bondarenko, Yelysei, et al.
Published: (2024)

Task-Circuit Quantization: Leveraging Knowledge Localization and Interpretability for Compression
by: Xiao, Hanqi, et al.
Published: (2025)

ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
by: Butler, Landon, et al.
Published: (2025)

Continuous Approximations for Improving Quantization Aware Training of LLMs
by: Li, He, et al.
Published: (2024)

Neurons Speak in Ranges: Breaking Free from Discrete Neuronal Attribution
by: Haider, Muhammad Umair, et al.
Published: (2025)

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)

Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
by: Liu, Yijun, et al.
Published: (2024)

Crafting Interpretable Embeddings by Asking LLMs Questions
by: Benara, Vinamra, et al.
Published: (2024)

RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
by: Sung, Yi-Lin, et al.
Published: (2025)

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
by: Cheng, Wenhua, et al.
Published: (2023)

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)

SmoothRot: Combining Channel-Wise Scaling and Rotation for Quantization-Friendly LLMs
by: Czakó, Patrik, et al.
Published: (2025)

Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization
by: Xiong, Boya, et al.
Published: (2025)

MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)

Certifying Knowledge Comprehension in LLMs
by: Chaudhary, Isha, et al.
Published: (2024)

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
by: Zhou, Sifan, et al.
Published: (2025)

RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
by: Huang, Xijie, et al.
Published: (2024)

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
by: Lv, Keyu, et al.
Published: (2026)

Intrinsic Self-Correction in LLMs: Towards Explainable Prompting via Mechanistic Interpretability
by: Lee, Yu-Ting, et al.
Published: (2025)

Rethinking Interpretability in the Era of Large Language Models
by: Singh, Chandan, et al.
Published: (2024)

Do LLMs Encode Functional Importance of Reasoning Tokens?
by: Singh, Janvijay, et al.
Published: (2026)

Automated Unity Game Template Generation from GDDs via NLP and Multi-Modal LLMs
by: Hassan, Amna
Published: (2025)

Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
by: Jan, Essa, et al.
Published: (2025)

LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)

RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations
by: Su, Zunhai, et al.
Published: (2025)

Nudging: Inference-time Alignment of LLMs via Guided Decoding
by: Fei, Yu, et al.
Published: (2024)

Interpretable Next-token Prediction via the Generalized Induction Head
by: Kim, Eunji, et al.
Published: (2024)

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
by: Singh, Joykirat, et al.
Published: (2024)

Draft-Conditioned Constrained Decoding for Structured Generation in LLMs
by: Reddy, Avinash, et al.
Published: (2026)

Minimal and Mechanistic Conditions for Behavioral Self-Awareness in LLMs
by: Bozoukov, Matthew, et al.
Published: (2025)

A Unified Framework with Novel Metrics for Evaluating the Effectiveness of XAI Techniques in LLMs
by: Mersha, Melkamu Abay, et al.
Published: (2025)

Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation
by: Manvi, Rohin, et al.
Published: (2024)

From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?
by: Mueller, Aaron, et al.
Published: (2025)

Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency
by: Husom, Erik Johannes, et al.
Published: (2025)

How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
by: Jaipersaud, Brandon, et al.
Published: (2025)

Turning LLM Activations Quantization-Friendly
by: Czakó, Patrik, et al.
Published: (2025)