| Main Authors: | Eldenk, Doğaç; Mohapatra, Payal; Comlek, Yigitcan; Oktay, Kaan; Zhang, Hongyang; Xia, Stephen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.09992 |
Similar Items
SSSD: Simply-Scalable Speculative Decoding
by: Marzollo, Michele, et al.
Published: (2024)
FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning
by: Zhang, Yizhou, et al.
Published: (2025)
SAM Decoding: Speculative Decoding via Suffix Automaton
by: Hu, Yuxuan, et al.
Published: (2024)
Component-Aware Self-Speculative Decoding in Hybrid Language Models
by: Borobia, Hector, et al.
Published: (2026)
Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models
by: Fadli, Samih
Published: (2025)
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
by: Goldstein, Daniel, et al.
Published: (2025)
DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
by: Wang, Ziyi, et al.
Published: (2026)
UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models
by: Eldenk, Doğaç, et al.
Published: (2026)
SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences
by: Cha, Jungyoub, et al.
Published: (2025)
Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text
by: Oketunji, Abiodun Finbarrs
Published: (2023)
How Does Unfaithful Reasoning Emerge from Autoregressive Training? A Study of Synthetic Experiments
by: Wang, Fuxin, et al.
Published: (2026)
Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models
by: Sun, Chendong, et al.
Published: (2025)
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks
by: Nielsen, Dan Saattrup, et al.
Published: (2024)
Preventing Safety Drift in Large Language Models via Coupled Weight and Activation Constraints
by: Peng, Songping, et al.
Published: (2026)
CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality
by: Dumitru, Razvan-Gabriel, et al.
Published: (2025)
Meta-Learning at Scale for Large Language Models via Low-Rank Amortized Bayesian Meta-Learning
by: Zhang, Liyi, et al.
Published: (2025)
Large Language Model (LLM) Bias Index -- LLMBI
by: Oketunji, Abiodun Finbarrs, et al.
Published: (2023)
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning
by: Mircea, Andrei, et al.
Published: (2025)
A Performance Evaluation of a Quantized Large Language Model on Various Smartphones
by: Çöplü, Tolga, et al.
Published: (2023)
Forget Attention: Importance-Aware Attention Is All You Need
by: Shin, Soohyeong, et al.
Published: (2026)
Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents
by: Gao, Heyang, et al.
Published: (2025)
Dodo: Dynamic Contextual Compression for Decoder-only LMs
by: Qin, Guanghui, et al.
Published: (2023)
Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors
by: Zhang, Zhiwei, et al.
Published: (2026)
Softmax Linear Attention: Reclaiming Global Competition
by: Xu, Mingwei, et al.
Published: (2026)
Beyond Memorization: Violating Privacy Via Inference with Large Language Models
by: Staab, Robin, et al.
Published: (2023)
Beyond the Black Box: A Statistical Model for LLM Reasoning and Inference
by: Dalal, Siddhartha, et al.
Published: (2024)
Learning What Matters: Probabilistic Task Selection via Mutual Information for Model Finetuning
by: Chanda, Prateek, et al.
Published: (2025)
End-to-End Optimization of LLM-Driven Multi-Agent Search Systems via Heterogeneous-Group-Based Reinforcement Learning
by: Chen, Guanzhong, et al.
Published: (2025)
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
by: Zhang, Zhengxin, et al.
Published: (2024)
Text-Based Approaches to Item Difficulty Modeling in Large-Scale Assessments: A Systematic Review
by: Peters, Sydney, et al.
Published: (2025)
How LLMs Are Persuaded: A Few Attention Heads, Rerouted
by: Sun, Xiangkun, et al.
Published: (2026)
Adversarial Lens: Exploiting Attention Layers to Generate Adversarial Examples for Evaluation
by: Dhole, Kaustubh
Published: (2025)
From Fake Focus to Real Precision: Confusion-Driven Adversarial Attention Learning in Transformers
by: Liu, Yawei
Published: (2025)
AtManRL: Towards Faithful Reasoning via Differentiable Attention Saliency
by: Höth, Max Henning, et al.
Published: (2026)
When Words Change the Model: Sensitivity of LLMs for Constraint Programming Modelling
by: Pellegrino, Alessio, et al.
Published: (2025)
Shattered Compositionality: Counterintuitive Learning Dynamics of Transformers for Arithmetic
by: Zhao, Xingyu, et al.
Published: (2026)
MMiC: Mitigating Modality Incompleteness in Clustered Federated Learning
by: Yang, Lishan, et al.
Published: (2025)
Social Learning through Interactions with Other Agents: A Survey
by: Hillier, Dylan, et al.
Published: (2024)
Mixture of Attention Spans: Optimizing LLM Inference Efficiency with Heterogeneous Sliding-Window Lengths
by: Fu, Tianyu, et al.
Published: (2024)
A Multi-Encoder Frozen-Decoder Approach for Fine-Tuning Large Language Models
by: Dhole, Kaustubh D.
Published: (2025)