| Main Authors: | Lin, Xingyu, Wen, Yilin, Su, Du, Hou, Jinchang, Wang, En, Liu, Wenbin, Bao, Chenfu, Lv, Zhonghou |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.12736 |
Similar Items
Token-Level Policy Optimization: Linking Group-Level Rewards to Token-Level Aggregation via Markov Likelihood
by: Lin, Xingyu, et al.
Published: (2025)
Safety-Utility Conflicts Are Not Global: Surgical Alignment via Head-Level Diagnosis
by: Cai, Wang, et al.
Published: (2026)
HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark
by: Wang, Jiacheng, et al.
Published: (2026)
GTPO and GRPO-S: Token and Sequence-Level Reward Shaping with Policy Entropy
by: Tan, Hongze, et al.
Published: (2025)
Discriminative Policy Optimization for Token-Level Reward Models
by: Chen, Hongzhan, et al.
Published: (2025)
Resisting Contextual Interference in RAG via Parametric-Knowledge Reinforcement
by: Lin, Chenyu, et al.
Published: (2025)
Learning to Generate Secure Code via Token-Level Rewards
by: Quan, Jiazheng, et al.
Published: (2026)
Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation
by: Ding, Fei, et al.
Published: (2026)
T-REG: Preference Optimization with Token-Level Reward Regularization
by: Zhou, Wenxuan, et al.
Published: (2024)
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought
by: Li, Yunheng, et al.
Published: (2026)
Selective Preference Optimization via Token-Level Reward Function Estimation
by: Yang, Kailai, et al.
Published: (2024)
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
by: Wen, Muning, et al.
Published: (2024)
RED: Unleashing Token-Level Rewards from Holistic Feedback via Reward Redistribution
by: Li, Jiahui, et al.
Published: (2024)
TokenShapley: Token Level Context Attribution with Shapley Value
by: Xiao, Yingtai, et al.
Published: (2025)
TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching
by: Nguyen, Truong, et al.
Published: (2026)
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization
by: Zhu, Mingkang, et al.
Published: (2025)
Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence
by: Li, ChengYou, et al.
Published: (2026)
Distilling Token-Trained Models into Byte-Level Models
by: Bao, Zishuo, et al.
Published: (2026)
ERPO: Token-Level Entropy-Regulated Policy Optimization for Large Reasoning Models
by: Yu, Song, et al.
Published: (2026)
Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation
by: Wei, Jingxuan, et al.
Published: (2024)
Evolutionary Token-Level Prompt Optimization for Diffusion Models
by: Neto, Domício Pereira, et al.
Published: (2026)
Asymmetric On-Policy Distillation: Bridging Exploitation and Imitation at the Token Level
by: Jia, Nan, et al.
Published: (2026)
Token-Level LLM Collaboration via FusionRoute
by: Xiong, Nuoya, et al.
Published: (2026)
Rethinking Personalization in Large Language Models at the Token Level
by: Zhang, Chenheng, et al.
Published: (2026)
TLDR: Token-Level Detective Reward Model for Large Vision Language Models
by: Fu, Deqing, et al.
Published: (2024)
TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models
by: Choo, Jinho, et al.
Published: (2026)
ResT: Reshaping Token-Level Policy Gradients for Tool-Use Large Language Models
by: Lin, Zihan, et al.
Published: (2025)
Beyond BEV: Optimizing Point-Level Tokens for Collaborative Perception
by: Li, Yang, et al.
Published: (2025)
Bi-Level Optimization for Generative Recommendation: Bridging Tokenization and Generation
by: Bai, Yimeng, et al.
Published: (2025)
Scalable Token-Level Hallucination Detection in Large Language Models
by: Min, Rui, et al.
Published: (2026)
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability
by: Lin, Zicheng, et al.
Published: (2024)
Reflection Pretraining Enables Token-Level Self-Correction in Biological Sequence Models
by: Zhang, Xiang, et al.
Published: (2025)
Token-Level Graphs for Short Text Classification
by: Donabauer, Gregor, et al.
Published: (2024)
Towards Token-Level Text Anomaly Detection
by: Cao, Yang, et al.
Published: (2026)
Token-Level Privacy in Large Language Models
by: Harel, Re'em, et al.
Published: (2025)
Improbable Bigrams Expose Vulnerabilities of Incomplete Tokens in Byte-Level Tokenizers
by: Jang, Eugene, et al.
Published: (2024)
ProToken: Token-Level Attribution for Federated Large Language Models
by: Gill, Waris, et al.
Published: (2026)
Pretraining with Token-Level Adaptive Latent Chain-of-Thought
by: Zeng, Boyi, et al.
Published: (2026)
Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding
by: Zhu, Yifan, et al.
Published: (2026)
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity
by: Hsu, Chan-Jan, et al.
Published: (2025)