:: Library Catalog

Bibliographic Details
Main Authors:	Jing, Liqiang; Huang, Zhehui; Wang, Xiaoyang; Yao, Wenlin; Yu, Wenhao; Ma, Kaixin; Zhang, Hongming; Du, Xinya; Yu, Dong
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2409.07703

Similar Items

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
by: He, Hongliang, et al.
Published: (2024)

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
by: He, Hongliang, et al.
Published: (2024)

WebRollback: Enhancing Web Agents with Explicit Rollback Mechanisms
by: Zhang, Zhisong, et al.
Published: (2025)

Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
by: Yu, Wenhao, et al.
Published: (2023)

Retrieval-augmented GUI Agents with Generative Guidelines
by: Xu, Ran, et al.
Published: (2025)

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
by: Cao, Ruisheng, et al.
Published: (2024)

R-Zero: Self-Evolving Reasoning LLM from Zero Data
by: Huang, Chengsong, et al.
Published: (2025)

LASER: LLM Agent with State-Space Exploration for Web Navigation
by: Ma, Kaixin, et al.
Published: (2023)

Dense X Retrieval: What Retrieval Granularity Should We Use?
by: Chen, Tong, et al.
Published: (2023)

LDC: Learning to Generate Research Idea with Dynamic Control
by: Li, Ruochen, et al.
Published: (2024)

FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
by: Jing, Liqiang, et al.
Published: (2024)

WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model
by: Fang, Tianqing, et al.
Published: (2025)

Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
by: Zhang, Hongming, et al.
Published: (2024)

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
by: Ouyang, Siru, et al.
Published: (2024)

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
by: Li, Ziming, et al.
Published: (2024)

HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows
by: Yao, Wenlin, et al.
Published: (2024)

Benchmarking Data Science Agents
by: Zhang, Yuge, et al.
Published: (2024)

$OneMillion-Bench: How Far are Language Agents from Human Experts?
by: Yang, Qianyu, et al.
Published: (2026)

DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems
by: Sun, Maojun, et al.
Published: (2026)

DataSciBench: An LLM Agent Benchmark for Data Science
by: Zhang, Dan, et al.
Published: (2025)

DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
by: You, Ziming, et al.
Published: (2025)

DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models
by: Huang, Yiming, et al.
Published: (2024)

AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research
by: Chen, Renqi, et al.
Published: (2025)

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
by: Wu, Xuansheng, et al.
Published: (2023)

MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning
by: Liu, Fuxiao, et al.
Published: (2023)

DeFine: Decision-Making with Analogical Reasoning over Factor Profiles
by: Hu, Yebowen, et al.
Published: (2024)

When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives
by: Hu, Yebowen, et al.
Published: (2024)

Understanding and Enhancing Mamba-Transformer Hybrids for Memory Recall and Language Modeling
by: Lee, Hyunji, et al.
Published: (2025)

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
by: Chen, Ziru, et al.
Published: (2024)

OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas
by: Wang, Xiaoyang, et al.
Published: (2025)

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
by: Fang, Tianqing, et al.
Published: (2025)

Do Retrieval-Augmented Language Models Adapt to Varying User Needs?
by: Wu, Peilin, et al.
Published: (2025)

SciDFM: A Large Language Model with Mixture-of-Experts for Science
by: Sun, Liangtai, et al.
Published: (2024)

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)

DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems
by: Zou, Anni, et al.
Published: (2024)

AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
by: Ou, Yixin, et al.
Published: (2025)

InFoBench: Evaluating Instruction Following Ability in Large Language Models
by: Qin, Yiwei, et al.
Published: (2024)

How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment
by: Huang, Heyan, et al.
Published: (2024)

BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research
by: Wang, Zifeng, et al.
Published: (2025)

DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
by: Zhang, Shaolei, et al.
Published: (2025)