Library Catalog

Bibliographic Details
Main Authors:	Wang, Zehao; Jin, Shilong; Cao, Zhao; Wang, Lanjun
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2605.23414

Similar Items

Reasoning-targeted Jailbreak Attacks on Large Reasoning Models via Semantic Triggers and Psychological Framing
by: Wang, Zehao, et al.
Published: (2026)

Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents
by: Wang, Zehong, et al.
Published: (2026)

SE-GA: Memory-Augmented Self-Evolution for GUI Agents
by: Jin, Shilong, et al.
Published: (2026)

When Refusals Fail: Unstable Safety Mechanisms in Long-Context LLM Agents
by: Hadeliya, Tsimur, et al.
Published: (2025)

Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification
by: Wu, Zehao, et al.
Published: (2025)

When Correct Beliefs Collapse: Epistemic Resilience of LLMs under Clinical Pressure
by: Xiao, Boyu, et al.
Published: (2026)

Multi-Scenario Combination Based on Multi-Agent Reinforcement Learning to Optimize the Advertising Recommendation System
by: Zhao, Yang, et al.
Published: (2024)

When Softmax Fails at the Top: Extreme Value Corrections for InfoNCE
by: Erol, Melihcan, et al.
Published: (2026)

MemFail: Stress-Testing Failure Modes of LLM Memory Systems
by: Garg, Ishir, et al.
Published: (2026)

When LLM Reward Design Fails: Diagnostic-Driven Refinement for Sparse Structured RL
by: Wang, Youting, et al.
Published: (2026)

PIVOT: Bridging Planning and Execution in LLM Agents via Trajectory Refinement
by: Zhang, Tuo, et al.
Published: (2026)

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs
by: Zeng, Yifan, et al.
Published: (2026)

Frequency Matters: When Time Series Foundation Models Fail Under Spectral Shift
by: Wang, Tianze, et al.
Published: (2025)

Agentic Unlearning: When LLM Agent Meets Machine Unlearning
by: Wang, Bin, et al.
Published: (2026)

NK-GAD: Neighbor Knowledge-Enhanced Unsupervised Graph Anomaly Detection
by: Wang, Zehao, et al.
Published: (2026)

When LLM Judge Scores Look Good but Best-of-N Decisions Fail
by: Landesberg, Eddie
Published: (2026)

Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity
by: Yang, Yingxuan, et al.
Published: (2026)

Active Learning for Communication Structure Optimization in LLM-Based Multi-Agent Systems
by: Yang, Huchen, et al.
Published: (2026)

Restoring the Sweet Spot: Pass-Rate Weighted Self-Distillation for LLM Reasoning
by: Liu, Zehao, et al.
Published: (2026)

When Counterfactual Reasoning Fails: Chaos and Real-World Complexity
by: Aalaila, Yahya, et al.
Published: (2025)

On the Importance of Task Complexity in Evaluating LLM-Based Multi-Agent Systems
by: Tang, Bohan, et al.
Published: (2025)

Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
by: Wang, Zhao, et al.
Published: (2025)

CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution
by: Shao, Minghao, et al.
Published: (2025)

STeCa: Step-level Trajectory Calibration for LLM Agent Learning
by: Wang, Hanlin, et al.
Published: (2025)

Topological Structure Learning Should Be A Research Priority for LLM-Based Multi-Agent Systems
by: Yang, Jiaxi, et al.
Published: (2025)

MPO: Boosting LLM Agents with Meta Plan Optimization
by: Xiong, Weimin, et al.
Published: (2025)

Agent-Oriented Planning in Multi-Agent Systems
by: Li, Ao, et al.
Published: (2024)

When Continual Learning Moves to Memory: A Study of Experience Reuse in LLM Agents
by: Hu, Qisheng, et al.
Published: (2026)

When Will It Fail?: Anomaly to Prompt for Forecasting Future Anomalies in Time Series
by: Park, Min-Yeong, et al.
Published: (2025)

When Do Multi-Agent Systems Outperform? Analysing the Learning Efficiency of Agentic Systems
by: Su, Junwei, et al.
Published: (2026)

Intelligent Assistants for the Semiconductor Failure Analysis with LLM-Based Planning Agents
by: Dobrovsky, Aline, et al.
Published: (2025)

Universe Routing: Why Self-Evolving Agents Need Epistemic Control
by: Wang, Zhaohui Geoffrey
Published: (2026)

When Validation Fails: Cross-Institutional Blood Pressure Prediction and the Limits of Electronic Health Record-Based Models
by: Azam, Md Basit, et al.
Published: (2025)

Synthetic Error Injection Fails to Elicit Self-Correction In Language Models
by: Wu, David X., et al.
Published: (2025)

MAVEN: Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing
by: Yao, Yinsheng, et al.
Published: (2026)

DiFR: Inference Verification Despite Nondeterminism
by: Karvonen, Adam, et al.
Published: (2025)

TruthFlow: Truthful LLM Generation via Representation Flow Correction
by: Wang, Hanyu, et al.
Published: (2025)

When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift
by: Vogt-Lowell, Kevin, et al.
Published: (2026)

When Chain-of-Thought Fails, the Solution Hides in the Hidden States
by: Mehrafarin, Houman, et al.
Published: (2026)

Epistemic Deep Learning: Enabling Machine Learning Models to Know When They Do Not Know
by: Manchingal, Shireen Kudukkil
Published: (2025)