| Main Authors: | Shao, Jiaqi, Lin, Yuxiang, Lohani, Munish Prasad, Miao, Yufeng, Luo, Bing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.22391 |
Similar Items
FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents
by: Shao, Jiaqi, et al.
Published: (2025)
MorphAgent: Empowering Agents through Self-Evolving Profiles and Decentralized Collaboration
by: Lu, Siyuan, et al.
Published: (2024)
Cognitive Insights and Stable Coalition Matching for Fostering Multi-Agent Cooperation
by: Shao, Jiaqi, et al.
Published: (2024)
HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?
by: Trinh, Tu, et al.
Published: (2026)
Doing What They Say, Not What They Reason: Locating the Faithfulness Gap in LLM Agents
by: Wang, Yufeng
Published: (2026)
Do Agents Know What They Can't Do? Evaluating Feasibility Awareness in Tool-Using Agents
by: Cheng, Liang, et al.
Published: (2026)
From Knowing to Doing: A Memory-Controlled Benchmark for LLM Trading Agents on Stock Markets
by: Zhu, Taojie, et al.
Published: (2026)
When Stored Evidence Stops Being Usable: Scale-Conditioned Evaluation of Agent Memory
by: Shao, Jiaqi, et al.
Published: (2026)
ICA: Information-Aware Credit Assignment for Visually Grounded Long-Horizon Information-Seeking Agents
by: Pang, Cong, et al.
Published: (2026)
InfoAgent: Advancing Autonomous Information-Seeking Agents
by: Zhang, Gongrui, et al.
Published: (2025)
Epistemic Deep Learning: Enabling Machine Learning Models to Know When They Do Not Know
by: Manchingal, Shireen Kudukkil
Published: (2025)
Know When to Trust the Skill: Delayed Appraisal and Epistemic Vigilance for Single-Agent LLMs
by: Unlu, Eren
Published: (2026)
Epistemic Artificial Intelligence is Essential for Machine Learning Models to Truly 'Know When They Do Not Know'
by: Manchingal, Shireen Kudukkil, et al.
Published: (2025)
K^2-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control
by: Wu, Zhe, et al.
Published: (2026)
Do Large Language Models Know What They Don't Know? Kalshibench: A New Benchmark for Evaluating Epistemic Calibration via Prediction Markets
by: Nel, Lukas
Published: (2025)
InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking
by: Lee, Ka Yiu, et al.
Published: (2026)
Exploring Information Seeking Agent Consolidation
by: Yan, Guochen, et al.
Published: (2026)
Recovering Physical Dynamics from Discrete Observations via Intrinsic Differential Consistency
by: Luo, Yuxiang, et al.
Published: (2026)
CaRT: Teaching LLM Agents to Know When They Know Enough
by: Liu, Grace, et al.
Published: (2025)
GroundAct: Can LLM Agents Ground Actions in Environmental States?
by: Wang, Zixuan, et al.
Published: (2025)
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents
by: Zhu, Yuqi, et al.
Published: (2024)
Federated Unlearning: a Perspective of Stability and Fairness
by: Shao, Jiaqi, et al.
Published: (2024)
A Functionality-Grounded Benchmark for Evaluating Web Agents in E-commerce Domains
by: Zhang, Xianren, et al.
Published: (2025)
SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes
by: Li, Kuan, et al.
Published: (2026)
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
by: Lin, Jingyang, et al.
Published: (2026)
When Agents Overtrust Environmental Evidence: An Extensible Agentic Framework for Benchmarking Evidence-Grounding Defects in LLM Agents
by: Sheng, Strick, et al.
Published: (2026)
Know Your Intent: An Autonomous Multi-Perspective LLM Agent Framework for DeFi User Transaction Intent Mining
by: Mao, Qian'ang, et al.
Published: (2025)
Know the Ropes: A Heuristic Strategy for LLM-based Multi-Agent System Design
by: Li, Zhenkun, et al.
Published: (2025)
Do Large Language Models Know How Much They Know?
by: Prato, Gabriele, et al.
Published: (2025)
Architecting Trust in Artificial Epistemic Agents
by: Marchal, Nahema, et al.
Published: (2026)
What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding
by: Liu, Siyuan, et al.
Published: (2026)
MLLMs Know When Before Speaking: Revealing and Recovering Temporal Grounding via Attention Cues
by: Du, Dazhao, et al.
Published: (2026)
An Epistemic Perspective on Agent Awareness
by: Naumov, Pavel, et al.
Published: (2025)
IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review
by: Ma, Fengbo, et al.
Published: (2026)
AgentEscapeBench: Evaluating Out-of-Domain Tool-Grounded Reasoning in LLM Agents
by: Guo, Zhengkang, et al.
Published: (2026)
Voluntary Collusion with Secret Tools in Competing LLM Agents
by: Zeng, Xijie, et al.
Published: (2026)
Semantic Laundering in AI Agent Architectures: Why Tool Boundaries Do Not Confer Epistemic Warrant
by: Romanchuk, Oleg, et al.
Published: (2026)
Agents Need Not Know Their Purpose
by: Garcia, Paulo
Published: (2024)
ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents
by: Lai, Yuxiang, et al.
Published: (2026)
Benchmark Test-Time Scaling of General LLM Agents
by: Li, Xiaochuan, et al.
Published: (2026)