| Main Authors: | Cai, Xiaoran, Yang, Wang, Ren, Xiyu, Law, Chekun, Sharma, Rohit, Qi, Peng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.17106 |
Similar Items
Evaluating Human-AI Collaboration: A Review and Methodological Framework
by: Fragiadakis, George, et al.
Published: (2024)
Claw-Eval: Towards Trustworthy Evaluation of Autonomous Agents
by: Ye, Bowen, et al.
Published: (2026)
Towards Competent AI for Fundamental Analysis in Finance: A Benchmark Dataset and Evaluation
by: Wu, Zonghan, et al.
Published: (2025)
Which Changes Matter? Towards Trustworthy Legal AI via Relevance-Sensitive Evaluation and Solver-Grounded Reasoning
by: Linze, Chen, et al.
Published: (2026)
Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning
by: Chen, Linze, et al.
Published: (2025)
Improving Health Professionals' Onboarding with AI and XAI for Trustworthy Human-AI Collaborative Decision Making
by: Lee, Min Hun, et al.
Published: (2024)
Ethical AI: Towards Defining a Collective Evaluation Framework
by: Sharma, Aasish Kumar, et al.
Published: (2025)
A Study on the Framework for Evaluating the Ethics and Trustworthiness of Generative AI
by: Jeong, Cheonsu, et al.
Published: (2025)
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
by: Huang, Jinsheng, et al.
Published: (2024)
Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration
by: Shao, Yijia, et al.
Published: (2024)
Explainable AI for Maritime Autonomous Surface Ships (MASS): Adaptive Interfaces and Trustworthy Human-AI Collaboration
by: Zhang, Zhuoyue, et al.
Published: (2025)
Trustworthy Human-AI Collaboration: Reinforcement Learning with Human Feedback and Physics Knowledge for Safe Autonomous Driving
by: Huang, Zilin, et al.
Published: (2024)
Designing The Internet of Agents: A Framework for Trustworthy, Transparent, and Collaborative Human-Agent Interaction (HAX)
by: Scibelli, Marc, et al.
Published: (2025)
Trust the AI, Doubt Yourself: The Effect of Urgency on Self-Confidence in Human-AI Interaction
by: Shajari, Baran, et al.
Published: (2026)
From Correctness to Collaboration: Toward a Human-Centered Framework for Evaluating AI Agent Behavior in Software Engineering
by: Dong, Tao, et al.
Published: (2025)
AI Benchmarks and Datasets for LLM Evaluation
by: Ivanov, Todor, et al.
Published: (2024)
Towards Safer AI Moderation: Evaluating LLM Moderators Through a Unified Benchmark Dataset and Advocating a Human-First Approach
by: Machlovi, Naseem, et al.
Published: (2025)
Human-Centered Human-AI Collaboration (HCHAC)
by: Gao, Qi, et al.
Published: (2025)
Human-AI Collaborative Essay Scoring: A Dual-Process Framework with LLMs
by: Xiao, Changrong, et al.
Published: (2024)
Toward Trustworthy Agentic AI: A Multimodal Framework for Preventing Prompt Injection Attacks
by: Syed, Toqeer Ali, et al.
Published: (2025)
Decidable By Construction: Design-Time Verification for Trustworthy AI
by: Haynes, Houston
Published: (2026)
Causal Responsibility Attribution for Human-AI Collaboration
by: Qi, Yahang, et al.
Published: (2024)
Beyond Benchmark Islands: Toward Representative Trustworthiness Evaluation for Agentic AI
by: Qi, Jinhu, et al.
Published: (2026)
AI Simulation by Digital Twins: Systematic Survey, Reference Framework, and Mapping to a Standardized Architecture
by: Liu, Xiaoran, et al.
Published: (2025)
Context Engineering: A Practitioner Methodology for Structured Human-AI Collaboration
by: Calboreanu, Elias
Published: (2026)
NavTrust: Benchmarking Trustworthiness for Embodied Navigation
by: Jiang, Huaide, et al.
Published: (2026)
HAIM: Human-AI Music Datasets for AI Music Production Tracking Benchmark
by: Go, Seonghyeon, et al.
Published: (2026)
Towards a Comprehensive Human-Centred Evaluation Framework for Explainable AI
by: Donoso-Guzmán, Ivania, et al.
Published: (2023)
UniMIC: Token-Based Multimodal Interactive Coding for Human-AI Collaboration
by: Mao, Qi, et al.
Published: (2025)
Design Principles for the Construction of a Benchmark Evaluating Security Operation Capabilities of Multi-agent AI Systems
by: Cai, Yicheng, et al.
Published: (2026)
Advancing Trustworthy AI for Sustainable Development: Recommendations for Standardising AI Incident Reporting
by: Agarwal, Avinash, et al.
Published: (2025)
The Journey to Trustworthy AI: Pursuit of Pragmatic Frameworks
by: Nasr-Azadani, Mohamad M, et al.
Published: (2024)
Towards AI-$45^{\circ}$ Law: A Roadmap to Trustworthy AGI
by: Yang, Chao, et al.
Published: (2024)
Reliability, Resilience and Human Factors Engineering for Trustworthy AI Systems
by: Mishra, Saurabh, et al.
Published: (2024)
Trustworthy and Responsible AI for Human-Centric Autonomous Decision-Making Systems
by: Dehghani, Farzaneh, et al.
Published: (2024)
Bridging the Communication Gap: Evaluating AI Labeling Practices for Trustworthy AI Development
by: Fischer, Raphael, et al.
Published: (2025)
Towards Responsible AI Music: an Investigation of Trustworthy Features for Creative Systems
by: de Berardinis, Jacopo, et al.
Published: (2025)
A No Free Lunch Theorem for Human-AI Collaboration
by: Peng, Kenny, et al.
Published: (2024)
Agentic AI as Undercover Teammates: Argumentative Knowledge Construction in Hybrid Human-AI Collaborative Learning
by: Yan, Lixiang, et al.
Published: (2025)
A Knowledge-Component-Based Methodology for Evaluating AI Assistants
by: Qi, Laryn, et al.
Published: (2024)