| Main Authors: | Zhu, Mingye, Liu, Yi, Fu, Zheren, Wang, Quan, Zhang, Yongdong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.09865 |
Similar Items
Leveraging Robust Optimization for LLM Alignment under Distribution Shifts
by: Zhu, Mingye, et al.
Published: (2025)
DACL-RAG: Data Augmentation Strategy with Curriculum Learning for Retrieval-Augmented Generation
by: Wang, Shaohan, et al.
Published: (2025)
Leveraging Importance Sampling to Detach Alignment Modules from Large Language Models
by: Liu, Yi, et al.
Published: (2025)
FlipGuard: Defending Preference Alignment against Update Regression with Constrained Optimization
by: Zhu, Mingye, et al.
Published: (2024)
SparseRM: A Lightweight Preference Modeling with Sparse Autoencoder
by: Liu, Dengcan, et al.
Published: (2025)
Concise Reasoning via Reinforcement Learning
by: Fatemi, Mehdi, et al.
Published: (2025)
Towards Concise and Adaptive Thinking in Large Reasoning Models: A Survey
by: Zhu, Jason, et al.
Published: (2025)
On-the-fly Preference Alignment via Principle-Guided Decoding
by: Zhu, Mingye, et al.
Published: (2025)
Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning
by: Song, Mingyang, et al.
Published: (2025)
Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning
by: Liu, Hanbing, et al.
Published: (2025)
ConciseHint: Boosting Efficient Reasoning via Continuous Concise Hints during Generation
by: Tang, Siao, et al.
Published: (2025)
Training LLM-Based Agents with Synthetic Self-Reflected Trajectories and Partial Masking
by: Chen, Yihan, et al.
Published: (2025)
Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost
by: Nayab, Sania, et al.
Published: (2024)
Self-signals Driven Multi-LLM Debate for Efficient and Accurate Reasoning
by: Chen, Xuhang, et al.
Published: (2025)
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
by: Huang, Chengyu, et al.
Published: (2025)
LIRE: listwise reward enhancement for preference alignment
by: Zhu, Mingye, et al.
Published: (2024)
Enhancing Persona Following at Decoding Time via Dynamic Importance Estimation for Role-Playing Agents
by: Liu, Yuxin, et al.
Published: (2026)
Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation
by: Zhang, Ruiqi, et al.
Published: (2026)
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
by: Li, Zheng, et al.
Published: (2025)
Self-Training Elicits Concise Reasoning in Large Language Models
by: Munkhbat, Tergel, et al.
Published: (2025)
Token-Guard: Towards Token-Level Hallucination Control via Self-Checking Decoding
by: Zhu, Yifan, et al.
Published: (2026)
Learning to Reason via Self-Iterative Process Feedback for Small Language Models
by: Chen, Kaiyuan, et al.
Published: (2024)
Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning
by: Li, Chen, et al.
Published: (2025)
Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness
by: Ma, Shixuan, et al.
Published: (2024)
Steering Large Reasoning Models towards Concise Reasoning via Flow Matching
by: Li, Yawei, et al.
Published: (2026)
Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation
by: Li, Jiaang, et al.
Published: (2026)
Concise and Organized Perception Facilitates Reasoning in Large Language Models
by: Liu, Junjie, et al.
Published: (2023)
Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies
by: Wang, Junlin, et al.
Published: (2024)
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
by: Shrivastava, Vaishnavi, et al.
Published: (2025)
Accurate KV Cache Quantization with Outlier Tokens Tracing
by: Su, Yi, et al.
Published: (2025)
Preference Optimization for Reasoning with Pseudo Feedback
by: Jiao, Fangkai, et al.
Published: (2024)
How Do Answer Tokens Read Reasoning Traces? Self-Reading Patterns in Thinking LLMs for Quantitative Reasoning
by: Chen, Haoyang, et al.
Published: (2026)
HAMburger: Accelerating LLM Inference via Token Smashing
by: Liu, Jingyu, et al.
Published: (2025)
AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback
by: Gao, Zhitao, et al.
Published: (2026)
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
by: Liu, Quan, et al.
Published: (2024)
Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering
by: Zhu, Jiajun, et al.
Published: (2025)
LLMs are Superior Feedback Providers: Bootstrapping Reasoning for Lie Detection with Self-Generated Feedback
by: Banerjee, Tanushree, et al.
Published: (2024)
ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
by: Li, Jiaang, et al.
Published: (2024)
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
by: Yuan, Siyu, et al.
Published: (2024)
A Study on Leveraging Search and Self-Feedback for Agent Reasoning
by: K, Karthikeyan, et al.
Published: (2025)