| Main Authors: | Yao, Yuncheng, Xia, Yuxuan, Wang, Shengjie, Zhuo, Danyang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.04263 |
Similar Items
Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training
by: Wang, Xi, et al.
Published: (2026)
TAPS: Target-Aware Prefix Tree Selection for Diffusion-Drafted Speculative Decoding
by: Wang, Zhuoyu, et al.
Published: (2026)
DREAM-R: Multimodal Speculative Reasoning with RL-Based Refined Drafting, Precise Verification, and Fully Parallel Execution
by: Hu, Yunhai, et al.
Published: (2026)
HilbertA: Hilbert Attention for Image Generation with Diffusion Models
by: Zheng, Shaoyi, et al.
Published: (2025)
ToMA: Token Merge with Attention for Diffusion Models
by: Lu, Wenbo, et al.
Published: (2025)
VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping
by: Dong, Haotian, et al.
Published: (2025)
Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs
by: Athiwaratkun, Ben, et al.
Published: (2024)
PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
by: Wang, Haonan, et al.
Published: (2025)
Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation
by: Liu, Yuhan, et al.
Published: (2025)
D-PACE: Dynamic Position-Aware Cross-Entropy for Parallel Speculative Drafting
by: Wu, Tianyu, et al.
Published: (2026)
Traversal Verification for Speculative Tree Decoding
by: Weng, Yepeng, et al.
Published: (2025)
PACER: Blockwise Pre-verification for Speculative Decoding with Adaptive Length
by: Zhang, Situo, et al.
Published: (2026)
SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism
by: Shen, Yuhao, et al.
Published: (2025)
Draft Model Knows When to Stop: Self-Verification Speculative Decoding for Long-Form Generation
by: Zhang, Ziyin, et al.
Published: (2024)
Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support
by: Du, Alexander, et al.
Published: (2026)
PrefixGPT: Prefix Adder Optimization by a Generative Pre-trained Transformer
by: Ding, Ruogu, et al.
Published: (2025)
PrefixLLM: LLM-aided Prefix Circuit Design
by: Xiao, Weihua, et al.
Published: (2024)
WISV: Wireless-Informed Semantic Verification for Distributed Speculative Decoding in Device-Edge LLM Inference
by: Liu, Zixuan, et al.
Published: (2026)
Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding
by: Su, Xin, et al.
Published: (2026)
Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding
by: Zhou, Yuxuan, et al.
Published: (2026)
Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward
by: Liu, Zikang, et al.
Published: (2025)
First Ask Then Answer: A Framework Design for AI Dialogue Based on Supplementary Questioning with Large Language Models
by: Fu, Chuanruo, et al.
Published: (2025)
Hypothesize-Then-Verify: Speculative Root Cause Analysis for Microservices with Pathwise Parallelism
by: Zhang, Lingzhe, et al.
Published: (2026)
PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization
by: Zuo, Dongsheng, et al.
Published: (2025)
LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification
by: Yang, Penghui, et al.
Published: (2025)
SelfJudge: Faster Speculative Decoding via Self-Supervised Judge Verification
by: Yoon, Kanghoon, et al.
Published: (2025)
ConfSpec: Efficient Step-Level Speculative Reasoning via Confidence-Gated Verification
by: Liu, Siran, et al.
Published: (2026)
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference
by: Timor, Nadav, et al.
Published: (2024)
PARD-2: Target-Aligned Parallel Draft Model for Dual-Mode Speculative Decoding
by: An, Zihao, et al.
Published: (2026)
DIVERSED: Relaxed Speculative Decoding via Dynamic Ensemble Verification
by: Wang, Ziyi, et al.
Published: (2026)
Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning
by: Jiang, Yuan, et al.
Published: (2025)
SAM Decoding: Speculative Decoding via Suffix Automaton
by: Hu, Yuxuan, et al.
Published: (2024)
Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets
by: Chen, Changjian, et al.
Published: (2024)
SPECTRE: Hybrid Ordinary-Parallel Speculative Serving for Resource-Efficient LLM Inference
by: Xie, Jincheng, et al.
Published: (2026)
Pipeline Parallelism is All You Need for Optimized Early-Exit Based Self-Speculative Decoding
by: Li, Ruanjun, et al.
Published: (2025)
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification
by: Wan, Yuxuan, et al.
Published: (2026)
Layered LA-MAPF: a decomposition of large agent MAPF instance to accelerate solving without compromising solvability
by: Yao, Zhuo
Published: (2024)
Generating Visual Stories with Grounded and Coreferent Characters
by: Liu, Danyang, et al.
Published: (2024)
Accelerating Large Language Model Reasoning via Speculative Search
by: Wang, Zhihai, et al.
Published: (2025)
Speculative Decoding for Multi-Sample Inference
by: Li, Yiwei, et al.
Published: (2025)