Bibliographic Details
Main Authors: Kapoor, Sayash; Stroebl, Benedikt; Kirgis, Peter; Nadgir, Nitya; Siegel, Zachary S; Wei, Boyi; Xue, Tianci; Chen, Ziru; Chen, Felix; Utpala, Saiteja; Ndzomga, Franck; Oruganty, Dheeraj; Luskin, Sophie; Liu, Kangheng; Yu, Botao; Arora, Amit; Hahm, Dongyoon; Trivedi, Harsh; Sun, Huan; Lee, Juyong; Jin, Tengjun; Mai, Yifan; Zhou, Yifei; Zhu, Yuxuan; Bommasani, Rishi; Kang, Daniel; Song, Dawn; Henderson, Peter; Su, Yu; Liang, Percy; Narayanan, Arvind
Format: Preprint
Published: 2025
Subjects:
Artificial Intelligence
Computation and Language
Online Access: https://arxiv.org/abs/2510.11977

Similar Items

  • Towards a Science of AI Agent Reliability
    by: Rabanser, Stephan, et al.
    Published: (2026)
  • AI Agents That Matter
    by: Kapoor, Sayash, et al.
    Published: (2024)
  • CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark
    by: Siegel, Zachary S., et al.
    Published: (2024)
  • The Limits of Inference Scaling Through Resampling
    by: Stroebl, Benedikt, et al.
    Published: (2024)
  • Log analysis is necessary for credible evaluation of AI agents
    by: Kirgis, Peter, et al.
    Published: (2026)
