| Main Authors: | Huang, Junjie, Qin, Jiarui, Yin, Di, Liu, Weiwen, Yu, Yong, Sun, Xing, Zhang, Weinan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.03075 |
Similar Items
APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training
by: Qin, Jiarui, et al.
Published: (2025)
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls
by: Zhang, Kangning, et al.
Published: (2025)
Position: The Real Barrier to LLM Agent Usability is Agentic ROI
by: Liu, Weiwen, et al.
Published: (2025)
SkillMAS: Skill Co-Evolution with LLM-based Multi-Agent System
by: Pan, Shuai, et al.
Published: (2026)
D2K: Turning Historical Data into Retrievable Knowledge for Recommender Systems
by: Qin, Jiarui, et al.
Published: (2024)
Beyond Graph Convolution: Multimodal Recommendation with Topology-aware MLPs
by: Huang, Junjie, et al.
Published: (2024)
CATArena: Evaluating Evolutionary Capabilities of Code Agents via Iterative Tournaments
by: Fu, Lingyue, et al.
Published: (2025)
CoLD: Counterfactually-Guided Length Debiasing for Process Reward Models in Mathematical Reasoning
by: Zheng, Congmin, et al.
Published: (2025)
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
by: Zhang, Charlie, et al.
Published: (2025)
Automatically Benchmarking LLM Code Agents through Agent-Driven Annotation and Evaluation
by: Fu, Lingyue, et al.
Published: (2025)
Learning to Reason as Action Abstractions with Scalable Mid-Training RL
by: Zhang, Shenao, et al.
Published: (2025)
Skills on the Fly: Test-Time Adaptive Skill Synthesis for LLM Agents
by: Wang, Jingxing, et al.
Published: (2026)
A Survey of LLM-based Deep Search Agents: Paradigm, Optimization, Evaluation, and Challenges
by: Xi, Yunjia, et al.
Published: (2025)
TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization
by: Li, Peiji, et al.
Published: (2026)
A Survey on LLM Mid-Training
by: Tu, Chengying, et al.
Published: (2025)
LogitsCoder: Towards Efficient Chain-of-Thought Path Search via Logits Preference Decoding for Code Generation
by: Chen, Jizheng, et al.
Published: (2026)
MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
by: Yu, Hongli, et al.
Published: (2025)
PoLi-RL: A Point-to-List Reinforcement Learning Framework for Conditional Semantic Textual Similarity
by: Song, Zixin, et al.
Published: (2025)
Fast, Slow, and Tool-augmented Thinking for LLMs: A Review
by: Jia, Xinda, et al.
Published: (2025)
Evolutionary Perspectives on the Evaluation of LLM-Based AI Agents: A Comprehensive Survey
by: Zhu, Jiachen, et al.
Published: (2025)
Unleashing the Potential of Multi-Channel Fusion in Retrieval for Personalized Recommendations
by: Huang, Junjie, et al.
Published: (2024)
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning
by: Qiu, Zhaopeng, et al.
Published: (2026)
NL-Debugging: Exploiting Natural Language as an Intermediate Representation for Code Debugging
by: Zhang, Weiming, et al.
Published: (2025)
ColorBench: Benchmarking Mobile Agents with Graph-Structured Framework for Complex Long-Horizon Tasks
by: Song, Yuanyi, et al.
Published: (2025)
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
by: Zeng, Xingshan, et al.
Published: (2025)
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
by: Qi, Zehan, et al.
Published: (2024)
ToReMi: Topic-Aware Data Reweighting for Dynamic Pre-Training Data Selection
by: Zhu, Xiaoxuan, et al.
Published: (2025)
ReDit: Reward Dithering for Improved LLM Policy Optimization
by: Wei, Chenxing, et al.
Published: (2025)
Guided Profile Generation Improves Personalization with LLMs
by: Zhang, Jiarui
Published: (2024)
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
by: Zhang, Dan, et al.
Published: (2024)
CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution
by: Yang, Shidong, et al.
Published: (2026)
Quick on the Uptake: Eliciting Implicit Intents from Human Demonstrations for Personalized Mobile-Use Agents
by: Wu, Zheng, et al.
Published: (2025)
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding
by: Li, Yuhang, et al.
Published: (2025)
Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning
by: Xing, Junjie, et al.
Published: (2024)
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning
by: Shen, Jingyan, et al.
Published: (2025)
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
by: Xi, Zhiheng, et al.
Published: (2025)
Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning
by: Wu, Zheng, et al.
Published: (2026)
APT: Improving Specialist LLM Performance with Weakness Case Acquisition and Iterative Preference Training
by: Rao, Jun, et al.
Published: (2025)
How Does Personalized Memory Shape LLM Behavior? Benchmarking Rational Preference Utilization in Personalized Assistants
by: Feng, Xueyang, et al.
Published: (2026)
A Comprehensive Survey on Retrieval Methods in Recommender Systems
by: Huang, Junjie, et al.
Published: (2024)