| Main Authors: | Zheng, Tong; Liu, Haolin; Huang, Chengsong; Bao, Huiwen; Zhang, Sheng; Liu, Rui; Dai, Runpeng; Chen, Ruibo; Liu, Chenxi; Xiong, Tianyi; Wu, Xidong; Zhang, Hongming; Huang, Heng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.08083 |
Similar Items
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
by: Zheng, Tong, et al.
Published: (2025)
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing
by: Zheng, Tong, et al.
Published: (2026)
Learning from Self-Debate: Preparing Reasoning Models for Multi-Agent Debate
by: Liu, Chenxi, et al.
Published: (2026)
G-Zero: Self-Play for Open-Ended Generation from Zero Data
by: Huang, Chengsong, et al.
Published: (2026)
RelayLLM: Efficient Reasoning via Collaborative Decoding
by: Huang, Chengsong, et al.
Published: (2026)
Efficient Test-Time Scaling via Self-Calibration
by: Huang, Chengsong, et al.
Published: (2025)
Modality-Balancing Preference Optimization of Large Multimodal Models by Adversarial Negative Mining
by: Liu, Chenxi, et al.
Published: (2025)
Few-Shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt
by: Liu, Chenxi, et al.
Published: (2024)
Taming Overconfidence in LLMs: Reward Calibration in RLHF
by: Leng, Jixuan, et al.
Published: (2024)
TestExplora: Benchmarking LLMs for Proactive Bug Discovery via Repository-Level Test Generation
by: Liu, Steven, et al.
Published: (2026)
Training Data Efficiency in Multimodal Process Reward Models
by: Li, Jinyuan, et al.
Published: (2026)
Your Vision-Language Model Itself Is a Strong Filter: Towards High-Quality Instruction Tuning with Data Selection
by: Chen, Ruibo, et al.
Published: (2024)
Reinforcing Multimodal Reasoning Against Visual Degradation
by: Liu, Rui, et al.
Published: (2026)
Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation
by: Zheng, Tong, et al.
Published: (2025)
Agentic AutoSurvey: Let LLMs Survey LLMs
by: Liu, Yixin, et al.
Published: (2025)
ImAgent: A Unified Multimodal Agent Framework for Test-Time Scalable Image Generation
by: Wang, Kaishen, et al.
Published: (2025)
Improving Text-to-Image Generation with Input-Side Inference-Time Scaling
by: Chen, Ruibo, et al.
Published: (2025)
Position: Agentic Evolution is the Path to Evolving LLMs
by: Lin, Minhua, et al.
Published: (2026)
A Watermark for Order-Agnostic Language Models
by: Chen, Ruibo, et al.
Published: (2024)
APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training
by: Qin, Jiarui, et al.
Published: (2025)
CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
by: Dai, Runpeng, et al.
Published: (2025)
Automated Discovery of Test Oracles for Database Management Systems Using LLMs
by: Mang, Qiuyang, et al.
Published: (2025)
A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution
by: Hu, Zhengmian, et al.
Published: (2024)
Model Correlation Detection via Random Selection Probing
by: Chen, Ruibo, et al.
Published: (2025)
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories
by: Wei, Zhepei, et al.
Published: (2026)
A Multi-Task Evaluation of LLMs' Processing of Academic Text Input
by: Li, Tianyi, et al.
Published: (2025)
From Lists to Emojis: How Format Bias Affects Model Alignment
by: Zhang, Xuanchang, et al.
Published: (2024)
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation
by: Leng, Jixuan, et al.
Published: (2025)
Client-Centric Federated Adaptive Optimization
by: Sun, Jianhui, et al.
Published: (2025)
Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning
by: Guo, Junfeng, et al.
Published: (2025)
Guided Self-Evolving LLMs with Minimal Human Supervision
by: Yu, Wenhao, et al.
Published: (2025)
Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
by: Dai, Hankun, et al.
Published: (2025)
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
by: Xiong, Tianyi, et al.
Published: (2025)
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
by: Zhang, Guibin, et al.
Published: (2025)
Toward Robust Multilingual Adaptation of LLMs for Low-Resource Languages
by: Li, Haolin, et al.
Published: (2025)
Are LLMs Ready for Neural-integrated Mechanistic Modeling? A Benchmark and Agentic Framework
by: Guan, Zihan, et al.
Published: (2026)
Improved Unbiased Watermark for Large Language Models
by: Chen, Ruibo, et al.
Published: (2025)
GEM: A Gym for Agentic LLMs
by: Liu, Zichen, et al.
Published: (2025)
Do LLMs "Feel"? Emotion Circuits Discovery and Control
by: Wang, Chenxi, et al.
Published: (2025)
Privacy-Preserving LLMs Routing
by: Wu, Xidong, et al.
Published: (2026)