| Main Authors: | He, Zhitao, Liu, Zijun, Li, Peng, Fung, Yi R., Yan, Ming, Zhang, Ji, Huang, Fei, Liu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.14496 |
Similar Items
ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy
by: Yang, Zonghan, et al.
Published: (2024)
Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration
by: Liu, Zijun, et al.
Published: (2025)
RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind
by: He, Zhitao, et al.
Published: (2026)
ClinTutor-R1: Advancing Scalable and Robust One-to-Many Alignment in Clinical Socratic Education
by: He, Zhitao, et al.
Published: (2025)
CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic
by: Zhang, Yaocheng, et al.
Published: (2025)
MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration
by: He, Zhitao, et al.
Published: (2025)
On Stable Long-Form Generation: Benchmarking and Mitigating Length Volatility
by: He, Zhitao, et al.
Published: (2026)
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
by: Liu, Zijun, et al.
Published: (2024)
MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness
by: Huang, Junsheng, et al.
Published: (2025)
CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents
by: Liu, Jiayu, et al.
Published: (2025)
MARS-SQL: A multi-agent reinforcement learning framework for Text-to-SQL
by: Yang, Haolin, et al.
Published: (2025)
From Reasoning to Agentic: Credit Assignment in Reinforcement Learning for Large Language Models
by: Zhang, Chenchen
Published: (2026)
MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?
by: He, Zhitao, et al.
Published: (2025)
Reducing Credit Assignment Variance via Counterfactual Reasoning Paths
by: Ding, Fei, et al.
Published: (2026)
Re-ReST: Reflection-Reinforced Self-Training for Language Agents
by: Dou, Zi-Yi, et al.
Published: (2024)
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration
by: Liu, Zijun, et al.
Published: (2023)
Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
by: He, Jiayi, et al.
Published: (2024)
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language Environment Simulation
by: Hu, Xiaomeng, et al.
Published: (2026)
RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection
by: Huang, Yiming, et al.
Published: (2025)
CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
by: Huang, Yuchen, et al.
Published: (2025)
Beyond Uniform Credit: Causal Credit Assignment for Policy Optimization
by: Khandoga, Mykola, et al.
Published: (2026)
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments
by: Chen, Junzhe, et al.
Published: (2024)
Towards Unified Alignment Between Agents, Humans, and Environment
by: Yang, Zonghan, et al.
Published: (2024)
Writing-RL: Advancing Long-form Writing via Adaptive Curriculum Reinforcement Learning
by: Lei, Xuanyu, et al.
Published: (2025)
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
by: Guo, Yiran, et al.
Published: (2025)
MICA: Multi-granularity Intertemporal Credit Assignment for Long-Horizon Emotional Support Dialogue
by: Zhang, Naifan, et al.
Published: (2026)
Reasoning Path Divergence: A New Metric and Curation Strategy to Unlock LLM Diverse Thinking
by: Ju, Feng, et al.
Published: (2025)
Data Mixing Agent: Learning to Re-weight Domains for Continual Pre-training
by: Yang, Kailai, et al.
Published: (2025)
Rewarding Beliefs, Not Actions: Consistency-Guided Credit Assignment for Long-Horizon Agents
by: Tang, Wenjie, et al.
Published: (2026)
DACA-GRPO: Denoising-Aware Credit Assignment for Reinforcement Learning in Diffusion Language Models
by: Monsefi, Amin Karimi, et al.
Published: (2026)
Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning
by: Li, Ziheng, et al.
Published: (2026)
Self-Induced Outcome Potential: Turn-Level Credit Assignment for Agents without Verifiers
by: Hu, Senkang, et al.
Published: (2026)
From Reasoning Chains to Verifiable Subproblems: Curriculum Reinforcement Learning Enables Credit Assignment for LLM Reasoning
by: Jiang, Xitai, et al.
Published: (2026)
CAPO: Towards Enhancing LLM Reasoning through Generative Credit Assignment
by: Xie, Guofu, et al.
Published: (2025)
RICE-PO: Turning Retrieval Interactions into Credit Signals for Reasoning Agents
by: Li, Mingchen, et al.
Published: (2026)
APEX-Searcher: Refining Credit Assignment with Subgoaling for Agentic Retrieval-Augmented Generation
by: Chen, Kun, et al.
Published: (2026)
InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning
by: Yang, Matthew Y. R., et al.
Published: (2026)
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
by: Wang, Junyang, et al.
Published: (2024)
Exploiting Tree Structure for Credit Assignment in RL Training of LLMs
by: Tran, Hieu, et al.
Published: (2025)
Reducing Distraction in Long-Context Language Models by Focused Learning
by: Wu, Zijun, et al.
Published: (2024)