Library Catalog

Bibliographic Details
Main Authors:	Guo, Weiyang; Shi, Zesheng; Zhao, Liye; Ma, Jiayuan; Zhu, Zeen; He, Junxian; Zhang, Min; Li, Jing
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.09455

Similar Items

Backdoors in RLVR: Jailbreak Backdoors in LLMs From Verifiable Reward
by: Guo, Weiyang, et al.
Published: (2026)

PruneTIR: Inference-Time Tool Call Pruning for Effective yet Efficient Tool-Integrated Reasoning
by: Zhang, Luan, et al.
Published: (2026)

Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools
by: Wu, Junde, et al.
Published: (2025)

MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
by: Qu, Changle, et al.
Published: (2026)

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
by: Zeng, Weihao, et al.
Published: (2024)

Jailbreak-R1: Exploring the Jailbreak Capabilities of LLMs via Reinforcement Learning
by: Guo, Weiyang, et al.
Published: (2025)

Skill Weaving: Efficient LLM Improvement via Modular Skillpacks
by: Li, Zhuo, et al.
Published: (2026)

DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management
by: Su, Xuerui, et al.
Published: (2025)

CharTool: Tool-Integrated Visual Reasoning for Chart Understanding
by: Zhang, Situo, et al.
Published: (2026)

Non-myopic Generation of Language Models for Reasoning and Planning
by: Ma, Chang, et al.
Published: (2024)

Safety Alignment via Constrained Knowledge Unlearning
by: Shi, Zesheng, et al.
Published: (2025)

DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
by: Chen, Aili, et al.
Published: (2026)

Dr. RTL: Autonomous Agentic RTL Optimization through Tool-Grounded Self-Improvement
by: Fang, Wenji, et al.
Published: (2026)

JT-DA: Enhancing Data Analysis with Tool-Integrated Table Reasoning Large Language Models
by: Chi, Ce, et al.
Published: (2025)

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
by: Huang, Kexin, et al.
Published: (2026)

Guided by Trajectories: Repairing and Rewarding Tool-Use Trajectories for Tool-Integrated Reasoning
by: Gong, Siyu, et al.
Published: (2026)

Dissecting Tool-Integrated Reasoning: An Empirical Study and Analysis
by: Zhao, Yufeng, et al.
Published: (2025)

Team-Based Self-Play With Dual Adaptive Weighting for Fine-Tuning LLMs
by: Li, Wu, et al.
Published: (2026)

S$^2$-MLLM: Boosting Spatial Reasoning Capability of MLLMs for 3D Visual Grounding with Structural Guidance
by: Xu, Beining, et al.
Published: (2025)

MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming
by: Guo, Weiyang, et al.
Published: (2025)

AnomalyClaw: A Universal Visual Anomaly Detection Agent via Tool-Grounded Refutation
by: Jiang, Xi, et al.
Published: (2026)

Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework
by: Ma, Xilai, et al.
Published: (2026)

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning
by: Chen, Jiawei, et al.
Published: (2025)

MedCoAct: Confidence-Aware Multi-Agent Collaboration for Complete Clinical Decision
by: Zheng, Hongjie, et al.
Published: (2025)

Faithful-First Reasoning, Planning, and Acting for Multimodal LLMs
by: Li, Junxian, et al.
Published: (2025)

From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning
by: Huang, Yuzhen, et al.
Published: (2025)

Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models
by: Hong, Zesheng, et al.
Published: (2026)

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
by: Li, Junlong, et al.
Published: (2025)

Evolutionary Discovery of Heuristic Policies for Traffic Signal Control
by: Wang, Ruibing, et al.
Published: (2025)

Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
by: Chen, Zhipeng, et al.
Published: (2025)

EigentSearch-Q+: Enhancing Deep Research Agents with Structured Reasoning Tools
by: Zhang, Boer, et al.
Published: (2026)

Understanding Tool-Integrated Reasoning
by: Lin, Heng, et al.
Published: (2025)

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
by: Wu, Feijie, et al.
Published: (2025)

Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning
by: Shen, Xintian, et al.
Published: (2026)

DeepTool: Scaling Interleaved Deliberation in Tool-Integrated Reasoning via Process-Supervised Reinforcement Learning
by: He, Yang, et al.
Published: (2026)

CAREAgent: Clinical Agent with Structured Reasoning and Tool-Integrated for Order Generation
by: Hou, Ruihui, et al.
Published: (2026)

ToolMind Technical Report: A Large-Scale, Reasoning-Enhanced Tool-Use Dataset
by: Yang, Chen, et al.
Published: (2025)

JudgeSQL: Reasoning over SQL Candidates with Weighted Consensus Tournament
by: Bai, Jiayuan, et al.
Published: (2025)

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth
by: Zeng, Weihao, et al.
Published: (2026)

User-Aware Conditional Generative Total Correlation Learning for Multi-Modal Recommendation
by: Du, Jing, et al.
Published: (2026)