:: Library Catalog

Bibliographic Details
Main Authors:	Ye, Yushi; Hong, Feng; Zheng, Huangjie; Chen, Xu; Chen, Zhiyong; Wang, Yanfeng; Yao, Jiangchao
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2602.22868

Similar Items

Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs
by: Hong, Feng, et al.
Published: (2025)

Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers
by: Zeng, Fanqin, et al.
Published: (2026)

DLLM Agent: See Farther, Run Faster
by: Zhen, Huiling, et al.
Published: (2026)

Chem4DLLM: 4D Multimodal LLMs for Chemical Dynamics Understanding
by: Li, Xinyu, et al.
Published: (2026)

DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
by: Gong, Shansan, et al.
Published: (2025)

Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)

Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States
by: Dong, Ximing, et al.
Published: (2026)

Eliciting Medical Reasoning with Knowledge-enhanced Data Synthesis: A Semi-Supervised Reinforcement Learning Approach
by: Li, Haolin, et al.
Published: (2026)

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
by: Hu, Wenbo, et al.
Published: (2025)

Mask Tokens as Prophet: Fine-Grained Cache Eviction for Efficient dLLM Inference
by: Huang, Jianuo, et al.
Published: (2025)

Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild
by: Zheng, Mao, et al.
Published: (2026)

SlimInfer: Accelerating Long-Context LLM Inference via Dynamic Token Pruning
by: Long, Lingkun, et al.
Published: (2025)

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
by: Zheng, Wenhao, et al.
Published: (2025)

Know When To Fold 'Em: Token-Efficient LLM Synthetic Data Generation via Multi-Stage In-Flight Rejection
by: Chowdhury, Anjir Ahmed, et al.
Published: (2026)

Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation
by: Zhou, Mingyuan, et al.
Published: (2024)

Targeted Remasking: Replacing Token Editing with Token-to-Mask Refinement in Discrete Diffusion Language Models
by: Yao, Lin
Published: (2026)

Fast Best-of-N Decoding via Speculative Rejection
by: Sun, Hanshi, et al.
Published: (2024)

Remask, Don't Replace: Token-to-Mask Refinement in Diffusion Large Language Models
by: Yao, Lin
Published: (2026)

Token Level Routing Inference System for Edge Devices
by: She, Jianshu, et al.
Published: (2025)

TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)

Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
by: Zhong, Linhao, et al.
Published: (2026)

RewardFlow: Topology-Aware Reward Propagation on State Graphs for Agentic RL with Large Language Models
by: Feng, Xiao, et al.
Published: (2026)

Probe and Skip: Self-Predictive Token Skipping for Efficient Long-Context LLM Inference
by: Wu, Zimeng, et al.
Published: (2026)

Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation
by: Cheng, Luyao, et al.
Published: (2023)

ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models
by: Zheng, Kangjie, et al.
Published: (2025)

Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)

Reject Only Critical Tokens: Pivot-Aware Speculative Decoding
by: Ziashahabi, Amir, et al.
Published: (2025)

Emotion-Cause Pair Extraction in Conversations via Semantic Decoupling and Graph Alignment
by: Ma, Tianxiang, et al.
Published: (2026)

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts
by: Su, Zhenpeng, et al.
Published: (2024)

Multi-Modal Prototypes for Open-World Semantic Segmentation
by: Yang, Yuhuan, et al.
Published: (2023)

Token Masking Improves Transformer-Based Text Classification
by: Xu, Xianglong, et al.
Published: (2025)

HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation
by: Deng, Zewei, et al.
Published: (2026)

See the Forest for the Trees: Loosely Speculative Decoding via Visual-Semantic Guidance for Efficient Inference of Video LLMs
by: Ji, Yicheng, et al.
Published: (2026)

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
by: Su, DiJia, et al.
Published: (2025)

SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens
by: He, Yinhan, et al.
Published: (2025)

Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach
by: Yang, Jingyuan, et al.
Published: (2025)

SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
by: Liu, Chengbo, et al.
Published: (2024)

S$^4$C: Speculative Sampling with Syntactic and Semantic Coherence for Efficient Inference of Large Language Models
by: He, Tao, et al.
Published: (2025)

Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models
by: Sun, Chendong, et al.
Published: (2025)

Rethinking How to Remember: Beyond Atomic Facts in Lifelong LLM Agent Memory
by: Sun, Jingwei, et al.
Published: (2026)