| Main Authors: | Zhang, Yuxiang, Yang, Yuqi, Shu, Jiangming, Wen, Xinyan, Sang, Jitao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.06580 |
Similar Items
Evaluate-as-Action: Self-Evaluated Process Rewards for Retrieval-Augmented Agents
by: Shu, Jiangming, et al.
Published: (2026)
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
by: Zhang, Yuxiang, et al.
Published: (2024)
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks
by: Zhang, Yuxiang, et al.
Published: (2025)
o1-Coder: an o1 Replication for Coding
by: Zhang, Yuxiang, et al.
Published: (2024)
Exploring the Privacy Protection Capabilities of Chinese Large Language Models
by: Yang, Yuqi, et al.
Published: (2024)
Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms
by: Wang, Yuhang, et al.
Published: (2025)
A Disguised Wolf Is More Harmful Than a Toothless Tiger: Adaptive Malicious Code Injection Backdoor Attack Leveraging User Behavior as Triggers
by: Wu, Shangxi, et al.
Published: (2024)
CSPO: Alleviating Reward Ambiguity for Structured Table-to-LaTeX Generation
by: Yang, Yunfan, et al.
Published: (2026)
Self-Guided Defense: Adaptive Safety Alignment for Reasoning Models via Synthesized Guidelines
by: Wang, Yuhang, et al.
Published: (2025)
Positional Failures in Long-Context LLMs: A Blind Spot in Reasoning Benchmarks
by: Zhang, Chuyifei, et al.
Published: (2026)
How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation
by: Zhu, Lixi, et al.
Published: (2024)
ReInAgent: A Context-Aware GUI Agent Enabling Human-in-the-Loop Mobile Task Navigation
by: Jia, Haitao, et al.
Published: (2025)
Named Entity Recognition in COVID-19 tweets with Entity Knowledge Augmentation
by: Zhang, Xuankang, et al.
Published: (2025)
GUITester: Enabling GUI Agents for Exploratory Defect Discovery
by: Gao, Yifei, et al.
Published: (2026)
Inference-Time Rule Eraser: Fair Recognition via Distilling and Removing Biased Rules
by: Zhang, Yi, et al.
Published: (2024)
ITDR: An Instruction Tuning Dataset for Enhancing Large Language Models in Recommendations
by: Liu, Zekun, et al.
Published: (2025)
Unifying Perplexing Behaviors in Modified BP Attributions through Alignment Perspective
by: Zheng, Guanhua, et al.
Published: (2025)
A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems
by: Zhu, Lixi, et al.
Published: (2024)
Humanoid Agent via Embodied Chain-of-Action Reasoning with Multimodal Foundation Models for Zero-Shot Loco-Manipulation
by: Wen, Congcong, et al.
Published: (2025)
WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis
by: Gao, Yifei, et al.
Published: (2025)
Adaptive Federated Distillation for Multi-Domain Non-IID Textual Data
by: Xiao, Jiahao, et al.
Published: (2025)
Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation
by: Wang, Junyang, et al.
Published: (2025)
Internalizing LLM Reasoning via Discovery and Replay of Latent Actions
by: Shi, Zhenning, et al.
Published: (2026)
Don't Command, Cultivate: An Exploratory Study of System-2 Alignment
by: Wang, Yuhang, et al.
Published: (2024)
Joint Reward Modeling: Internalizing Chain-of-Thought for Efficient Visual Reward Models
by: Yang, Yankai, et al.
Published: (2026)
Membership Inference Attack against Large Language Model-based Recommendation Systems: A New Distillation-based Paradigm
by: Cuihong, Li, et al.
Published: (2025)
KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions
by: Zhu, Yanxu, et al.
Published: (2024)
GUITestScape: Towards Open-set Evaluation on Exploratory GUI Testing
by: Chen, Xiaoyi, et al.
Published: (2026)
Markov Chain of Thought for Efficient Mathematical Reasoning
by: Yang, Wen, et al.
Published: (2024)
Real-Time Reasoning Agents in Evolving Environments
by: Wen, Yule, et al.
Published: (2025)
Language-assisted Vision Model Debugger: A Sample-Free Approach to Finding and Fixing Bugs
by: Jiang, Chaoquan, et al.
Published: (2023)
Backdoor for Debias: Mitigating Model Bias with Backdoor Attack-based Artificial Bias
by: Wu, Shangxi, et al.
Published: (2023)
You Only Look at Screens: Multimodal Chain-of-Action Agents
by: Zhang, Zhuosheng, et al.
Published: (2023)
Does Chain-of-Thought Reasoning Help Mobile GUI Agent? An Empirical Study
by: Zhang, Li, et al.
Published: (2025)
Chain-of-Evidence Multimodal Reasoning for Few-shot Temporal Action Localization
by: Qi, Mengshi, et al.
Published: (2025)
PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning
by: Hu, Kun, et al.
Published: (2025)
Reinforcing Language Agents via Policy Optimization with Action Decomposition
by: Wen, Muning, et al.
Published: (2024)
Spatial Reasoning and Planning for Deep Embodied Agents
by: Ishida, Shu
Published: (2024)
Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning
by: Hong, Jialiang, et al.
Published: (2025)
Action-Free Reasoning for Policy Generalization
by: Clark, Jaden, et al.
Published: (2025)