:: Library Catalog

Bibliographic Details
Main Authors:	Yao, Yuncheng; Xia, Yuxuan; Wang, Shengjie; Zhuo, Danyang
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.04263

Similar Items

Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training
by: Wang, Xi, et al.
Published: (2026)

TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding
by: Wang, Zhuoyu, et al.
Published: (2026)

DREAM-R: Multimodal Speculative Reasoning with RL-Based Refined Drafting, Precise Verification, and Fully Parallel Execution
by: Hu, Yunhai, et al.
Published: (2026)

HilbertA: Hilbert Attention for Image Generation with Diffusion Models
by: Zheng, Shaoyi, et al.
Published: (2025)

ToMA: Token Merge with Attention for Diffusion Models
by: Lu, Wenbo, et al.
Published: (2025)

VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping
by: Dong, Haotian, et al.
Published: (2025)

Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs
by: Athiwaratkun, Ben, et al.
Published: (2024)

PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
by: Wang, Haonan, et al.
Published: (2025)

Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation
by: Liu, Yuhan, et al.
Published: (2025)

D-PACE: Dynamic Position-Aware Cross-Entropy for Parallel Speculative Drafting
by: Wu, Tianyu, et al.
Published: (2026)

Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)

PACER: Blockwise Pre-verification for Speculative Decoding with Adaptive Length
by: Zhang, Situo, et al.
Published: (2026)

SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism
by: Shen, Yuhao, et al.
Published: (2025)

Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
by: Zhang, Ziyin, et al.
Published: (2024)

Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support
by: Du, Alexander, et al.
Published: (2026)

PrefixGPT: Prefix Adder Optimization by a Generative Pre-trained Transformer
by: Ding, Ruogu, et al.
Published: (2025)

PrefixLLM: LLM-aided Prefix Circuit Design
by: Xiao, Weihua, et al.
Published: (2024)

WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference
by: Liu, Zixuan, et al.
Published: (2026)

Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding
by: Su, Xin, et al.
Published: (2026)

Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
by: Zhou, Yuxuan, et al.
Published: (2026)

Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward
by: Liu, Zikang, et al.
Published: (2025)

First Ask Then Answer: A Framework Design for AI Dialogue Based on Supplementary Questioning with Large Language Models
by: Fu, Chuanruo, et al.
Published: (2025)

Hypothesize-Then-Verify: Speculative Root Cause Analysis for Microservices with Pathwise Parallelism
by: Zhang, Lingzhe, et al.
Published: (2026)

PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization
by: Zuo, Dongsheng, et al.
Published: (2025)

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
by: Yang, Penghui, et al.
Published: (2025)

SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification
by: Yoon, Kanghoon, et al.
Published: (2025)

ConfSpec: Efficient Step-Level Speculative Reasoning via Confidence-Gated Verification
by: Liu, Siran, et al.
Published: (2026)

Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
by: Timor, Nadav, et al.
Published: (2024)

PARD-2: Target-Aligned Parallel Draft Model for Dual-Mode Speculative Decoding
by: An, Zihao, et al.
Published: (2026)

DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
by: Wang, Ziyi, et al.
Published: (2026)

Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning
by: Jiang, Yuan, et al.
Published: (2025)

SAM Decoding: Speculative Decoding via Suffix Automaton
by: Hu, Yuxuan, et al.
Published: (2024)

Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets
by: Chen, Changjian, et al.
Published: (2024)

SPECTRE: Hybrid Ordinary-Parallel Speculative Serving for Resource-Efficient LLM Inference
by: Xie, Jincheng, et al.
Published: (2026)

Pipeline Parallelism is All You Need for Optimized Early-Exit Based Self-Speculative Decoding
by: Li, Ruanjun, et al.
Published: (2025)

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification
by: Wan, Yuxuan, et al.
Published: (2026)

Layered LA-MAPF: a decomposition of large agent MAPF instance to accelerate solving without compromising solvability
by: Yao, Zhuo
Published: (2024)

Generating Visual Stories with Grounded and Coreferent Characters
by: Liu, Danyang, et al.
Published: (2024)

Accelerating Large Language Model Reasoning via Speculative Search
by: Wang, Zhihai, et al.
Published: (2025)

Speculative Decoding for Multi-Sample Inference
by: Li, Yiwei, et al.
Published: (2025)