| Main Authors: | Ye, Yushi, Hong, Feng, Zheng, Huangjie, Chen, Xu, Chen, Zhiyong, Wang, Yanfeng, Yao, Jiangchao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.22868 |
Similar Items
Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs
by: Hong, Feng, et al.
Published: (2025)
Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers
by: Zeng, Fanqin, et al.
Published: (2026)
DLLM Agent: See Farther, Run Faster
by: Zhen, Huiling, et al.
Published: (2026)
Chem4DLLM: 4D Multimodal LLMs for Chemical Dynamics Understanding
by: Li, Xinyu, et al.
Published: (2026)
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
by: Gong, Shansan, et al.
Published: (2025)
Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)
Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States
by: Dong, Ximing, et al.
Published: (2026)
Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach
by: Li, Haolin, et al.
Published: (2026)
3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
by: Hu, Wenbo, et al.
Published: (2025)
Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
by: Huang, Jianuo, et al.
Published: (2025)
Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild
by: Zheng, Mao, et al.
Published: (2026)
SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
by: Long, Lingkun, et al.
Published: (2025)
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
by: Zheng, Wenhao, et al.
Published: (2025)
Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
by: Chowdhury, Anjir Ahmed, et al.
Published: (2026)
Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
by: Zhou, Mingyuan, et al.
Published: (2024)
Targeted Remasking: Replacing Token Editing with Token-to-Mask Refinement in Discrete Diffusion Language Models
by: Yao, Lin
Published: (2026)
Fast Best-of-N Decoding via Speculative Rejection
by: Sun, Hanshi, et al.
Published: (2024)
Remask, Don't Replace: Token-to-Mask Refinement in Diffusion Large Language Models
by: Yao, Lin
Published: (2026)
Token Level Routing Inference System for Edge Devices
by: She, Jianshu, et al.
Published: (2025)
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
by: Zhong, Linhao, et al.
Published: (2026)
RewardFlow: Topology-Aware Reward Propagation on State Graphs for Agentic RL with Large Language Models
by: Feng, Xiao, et al.
Published: (2026)
Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference
by: Wu, Zimeng, et al.
Published: (2026)
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation
by: Cheng, Luyao, et al.
Published: (2023)
ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models
by: Zheng, Kangjie, et al.
Published: (2025)
Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)
Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
by: Ziashahabi, Amir, et al.
Published: (2025)
Emotion-Cause Pair Extraction in Conversations via Semantic Decoupling and Graph Alignment
by: Ma, Tianxiang, et al.
Published: (2026)
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
by: Su, Zhenpeng, et al.
Published: (2024)
Multi-Modal Prototypes for Open-World Semantic Segmentation
by: Yang, Yuhuan, et al.
Published: (2023)
Token Masking Improves Transformer-Based Text Classification
by: Xu, Xianglong, et al.
Published: (2025)
HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation
by: Deng, Zewei, et al.
Published: (2026)
See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference of Video LLMs
by: Ji, Yicheng, et al.
Published: (2026)
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
by: Su, DiJia, et al.
Published: (2025)
SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
by: He, Yinhan, et al.
Published: (2025)
Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach
by: Yang, Jingyuan, et al.
Published: (2025)
SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
by: Liu, Chengbo, et al.
Published: (2024)
S$^4$C: Speculative Sampling with Syntactic and Semantic Coherence for Efficient Inference of Large Language Models
by: He, Tao, et al.
Published: (2025)
Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models
by: Sun, Chendong, et al.
Published: (2025)
Rethinking How to Remember: Beyond Atomic Facts in Lifelong LLM Agent Memory
by: Sun, Jingwei, et al.
Published: (2026)