:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Shirley; Galley, Michel; Peng, Baolin; Cheng, Hao; Li, Gavin; Dou, Yao; Cai, Weixin; Zou, James; Leskovec, Jure; Gao, Jianfeng
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.00640

Similar Items

SimulatorArena: Are User Simulators Reliable Proxies for Multi-Turn Evaluation of AI Assistants?
by: Dou, Yao, et al.
Published: (2025)

Teaching Language Models to Self-Improve through Interactive Demonstrations
by: Yu, Xiao, et al.
Published: (2023)

Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models
by: Li, Miaoran, et al.
Published: (2023)

ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
by: Yu, Xiao, et al.
Published: (2024)

Found in Conversation: LLMs Teach Themselves to Close the Multi-Turn Gap
by: Chen, Tianlang, et al.
Published: (2026)

Dyna-Think: Synergizing Reasoning, Acting, and World Model Simulation in AI Agents
by: Yu, Xiao, et al.
Published: (2025)

GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts
by: Wu, Shirley, et al.
Published: (2023)

Dyna-Mind: Learning to Simulate from Experience for Better AI Agents
by: Yu, Xiao, et al.
Published: (2025)

Uncalibrated Reasoning: GRPO Induces Overconfidence for Stochastic Outcomes
by: Bereket, Michael, et al.
Published: (2025)

Synthetic Computers at Scale for Long-Horizon Productivity Simulation
by: Ge, Tao, et al.
Published: (2026)

ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration
by: Estornell, Andrew, et al.
Published: (2024)

RelGNN: Composite Message Passing for Relational Deep Learning
by: Chen, Tianlang, et al.
Published: (2025)

Reverse Image Retrieval Cues Parametric Memory in Multimodal LLMs
by: Xu, Jialiang, et al.
Published: (2024)

STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases
by: Wu, Shirley, et al.
Published: (2024)

AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
by: Wu, Shirley, et al.
Published: (2024)

Large Language Models are Good Relational Learners
by: Wu, Fang, et al.
Published: (2025)

CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis
by: Venkatraman, Saranya, et al.
Published: (2024)

Rethinking Interpretability in the Era of Large Language Models
by: Singh, Chandan, et al.
Published: (2024)

AgentCollab: A Self-Evaluation-Driven Collaboration Paradigm for Efficient LLM Agents
by: Gao, Wenbo, et al.
Published: (2026)

Collab-Solver: Collaborative Solving Policy Learning for Mixed-Integer Linear Programming
by: Li, Siyuan, et al.
Published: (2025)

RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance
by: Chen, Tianlang, et al.
Published: (2025)

MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
by: Huang, Qian, et al.
Published: (2023)

Collab-Overcooked: Benchmarking and Evaluating Large Language Models as Collaborative Agents
by: Sun, Haochen, et al.
Published: (2025)

CollabEval: Enhancing LLM-as-a-Judge via Multi-Agent Collaboration
by: Qian, Yiyue, et al.
Published: (2026)

PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents
by: Yang, Ke, et al.
Published: (2026)

The Tool Illusion: Rethinking Tool Use in Web Agents
by: Lou, Renze, et al.
Published: (2026)

Relational Deep Learning: Challenges, Foundations and Next-Generation Architectures
by: Dwivedi, Vijay Prakash, et al.
Published: (2025)

Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities
by: Sun, Chung-En, et al.
Published: (2024)

AutoSurfer -- Teaching Web Agents through Comprehensive Surfing, Learning, and Modeling
by: Faisal, Fazle Elahi, et al.
Published: (2026)

GLEAN: Active Generalized Category Discovery with Diverse LLM Feedback
by: Zou, Henry Peng, et al.
Published: (2025)

Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution
by: Wu, Tailin, et al.
Published: (2024)

TimeGraphs: Graph-based Temporal Reasoning
by: Maheshwari, Paridhi, et al.
Published: (2024)

Inferring Dynamic Networks from Marginals with Iterative Proportional Fitting
by: Chang, Serina, et al.
Published: (2024)

Learning over Positive and Negative Edges with Contrastive Message Passing
by: Pao-Huang, Peter, et al.
Published: (2026)

SigmaCollab: An Application-Driven Dataset for Physically Situated Collaboration
by: Bohus, Dan, et al.
Published: (2025)

CollabEdit: Towards Non-destructive Collaborative Knowledge Editing
by: Zheng, Jiamu, et al.
Published: (2024)

Test-Time Learning with an Evolving Library
by: Xu, Weijia, et al.
Published: (2026)

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents
by: Yang, Rui, et al.
Published: (2026)

HumanLM: Simulating Users with State Alignment Beats Response Imitation
by: Wu, Shirley, et al.
Published: (2026)

Learning Efficient Positional Encodings with Graph Neural Networks
by: Kanatsoulis, Charilaos I., et al.
Published: (2025)