Library Catalog

Bibliographic Details
Main Author:	Shah, Harsh
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning; Systems and Control
Online Access:	https://arxiv.org/abs/2510.27641

Similar Items

SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
by: Shukla, Shikhar
Published: (2026)

ProxyAttn: Guided Sparse Attention via Representative Heads
by: Wang, Yixuan, et al.
Published: (2025)

ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
by: Xiao, Zilin, et al.
Published: (2024)

State Space Models as Foundation Models: A Control Theoretic Overview
by: Alonso, Carmen Amo, et al.
Published: (2024)

SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
by: Emami, Patrick, et al.
Published: (2024)

STS: Efficient Sparse Attention with Speculative Token Sparsity
by: Xu, Ceyu, et al.
Published: (2026)

HiSpec: Hierarchical Speculative Decoding for LLMs
by: Kumar, Avinash, et al.
Published: (2025)

AttnCache: Accelerating Self-Attention Inference for LLM Prefill via Attention Cache
by: Song, Dinghong, et al.
Published: (2025)

ACING: Actor-Critic for Instruction Learning in Black-Box LLMs
by: Kharrat, Salma, et al.
Published: (2024)

ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise
by: Guo, Xingang, et al.
Published: (2024)

Fake-Mamba: Real-Time Speech Deepfake Detection Using Bidirectional Mamba as Self-Attention's Alternative
by: Xuan, Xi, et al.
Published: (2025)

SlimSpec: Low-Rank Draft LM-Head for Accelerated Speculative Decoding
by: Plaksin, Anton, et al.
Published: (2026)

DistillSpec: Improving Speculative Decoding via Knowledge Distillation
by: Zhou, Yongchao, et al.
Published: (2023)

LogicGuard: Improving Embodied LLM agents through Temporal Logic based Critics
by: Gokhale, Anand, et al.
Published: (2025)

Fine-tuning Smaller Language Models for Question Answering over Financial Documents
by: Phogat, Karmvir Singh, et al.
Published: (2024)

Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems
by: Chen, Lingjiao, et al.
Published: (2024)

Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement
by: Xie, Guanwen, et al.
Published: (2024)

PreFT: Prefill-only finetuning for efficient inference
by: Lanpouthakoun, Andrew, et al.
Published: (2026)

Most Likely Sequence Generation for $n$-Grams, Transformers, HMMs, and Markov Chains, by Using Rollout Algorithms
by: Li, Yuchao, et al.
Published: (2024)

Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP
by: Bogdanov, Igor, et al.
Published: (2026)

FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast
by: Bogdanov, Igor, et al.
Published: (2026)

Sparse Attention-driven Quality Prediction for Production Process Optimization in Digital Twins
by: Yin, Yanlei, et al.
Published: (2024)

ML-SpecQD: Multi-Level Speculative Decoding with Quantized Drafts
by: Georganas, Evangelos, et al.
Published: (2025)

SpecExit: Accelerating Large Reasoning Model via Speculative Exit
by: Yang, Rubing, et al.
Published: (2025)

SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)

Multi-Bin Batching for Increasing LLM Inference Throughput
by: Guldogan, Ozgur, et al.
Published: (2024)

Spatial Language Likelihood Grounding Network for Bayesian Fusion of Human-Robot Observations
by: Sitdhipol, Supawich, et al.
Published: (2025)

On the Relation of State Space Models and Hidden Markov Models
by: Ghojogh, Aydin, et al.
Published: (2026)

A Survey on Large Language Model-empowered Autonomous Driving
by: Zhu, Yuxuan, et al.
Published: (2024)

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
by: Yang, Penghui, et al.
Published: (2025)

SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration
by: Wen, Zhuofan, et al.
Published: (2026)

Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations
by: Wang, Yanwei, et al.
Published: (2022)

DynaSpec: Context-aware Dynamic Speculative Sampling for Large-Vocabulary Language Models
by: Zhang, Jinbin, et al.
Published: (2025)

SpecTr: Fast Speculative Decoding via Optimal Transport
by: Sun, Ziteng, et al.
Published: (2023)

SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding
by: Li, Shenggui, et al.
Published: (2026)

AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
by: Achtibat, Reduan, et al.
Published: (2024)

MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
by: Mitra, Purbesh, et al.
Published: (2025)

FALCON: Autonomous Cyber Threat Intelligence Mining with LLMs for IDS Rule Generation
by: Mitra, Shaswata, et al.
Published: (2025)

A Unified Generative-AI Framework for Smart Energy Infrastructure: Intelligent Gas Distribution, Utility Billing, Carbon Analytics, and Quantum-Inspired Optimisation
by: Manjunath, Pavan, et al.
Published: (2026)

Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges
by: Liu, Qin, et al.
Published: (2024)