| Main Authors: | Jing, Liqiang, Huang, Zhehui, Wang, Xiaoyang, Yao, Wenlin, Yu, Wenhao, Ma, Kaixin, Zhang, Hongming, Du, Xinya, Yu, Dong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.07703 |
Similar Items
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
by: He, Hongliang, et al.
Published: (2024)
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
by: He, Hongliang, et al.
Published: (2024)
WebRollback: Enhancing Web Agents with Explicit Rollback Mechanisms
by: Zhang, Zhisong, et al.
Published: (2025)
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
by: Yu, Wenhao, et al.
Published: (2023)
Retrieval-augmented GUI Agents with Generative Guidelines
by: Xu, Ran, et al.
Published: (2025)
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
by: Cao, Ruisheng, et al.
Published: (2024)
R-Zero: Self-Evolving Reasoning LLM from Zero Data
by: Huang, Chengsong, et al.
Published: (2025)
LASER: LLM Agent with State-Space Exploration for Web Navigation
by: Ma, Kaixin, et al.
Published: (2023)
Dense X Retrieval: What Retrieval Granularity Should We Use?
by: Chen, Tong, et al.
Published: (2023)
LDC: Learning to Generate Research Idea with Dynamic Control
by: Li, Ruochen, et al.
Published: (2024)
FGAIF: Aligning Large Vision-Language Models with Fine-grained AI Feedback
by: Jing, Liqiang, et al.
Published: (2024)
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model
by: Fang, Tianqing, et al.
Published: (2025)
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
by: Zhang, Hongming, et al.
Published: (2024)
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
by: Ouyang, Siru, et al.
Published: (2024)
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
by: Li, Ziming, et al.
Published: (2024)
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows
by: Yao, Wenlin, et al.
Published: (2024)
Benchmarking Data Science Agents
by: Zhang, Yuge, et al.
Published: (2024)
\$OneMillion-Bench: How Far are Language Agents from Human Experts?
by: Yang, Qianyu, et al.
Published: (2026)
DSAEval: Evaluating Data Science Agents on a Wide Range of Real-World Data Science Problems
by: Sun, Maojun, et al.
Published: (2026)
DataSciBench: An LLM Agent Benchmark for Data Science
by: Zhang, Dan, et al.
Published: (2025)
DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation
by: You, Ziming, et al.
Published: (2025)
DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models
by: Huang, Yiming, et al.
Published: (2024)
AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research
by: Chen, Renqi, et al.
Published: (2025)
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
by: Wu, Xuansheng, et al.
Published: (2023)
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning
by: Liu, Fuxiao, et al.
Published: (2023)
DeFine: Decision-Making with Analogical Reasoning over Factor Profiles
by: Hu, Yebowen, et al.
Published: (2024)
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives
by: Hu, Yebowen, et al.
Published: (2024)
Understanding and Enhancing Mamba-Transformer Hybrids for Memory Recall and Language Modeling
by: Lee, Hyunji, et al.
Published: (2025)
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
by: Chen, Ziru, et al.
Published: (2024)
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas
by: Wang, Xiaoyang, et al.
Published: (2025)
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
by: Fang, Tianqing, et al.
Published: (2025)
Do Retrieval-Augmented Language Models Adapt to Varying User Needs?
by: Wu, Peilin, et al.
Published: (2025)
SciDFM: A Large Language Model with Mixture-of-Experts for Science
by: Sun, Liangtai, et al.
Published: (2024)
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems
by: Zou, Anni, et al.
Published: (2024)
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science
by: Ou, Yixin, et al.
Published: (2025)
InFoBench: Evaluating Instruction Following Ability in Large Language Models
by: Qin, Yiwei, et al.
Published: (2024)
How Far Can In-Context Alignment Go? Exploring the State of In-Context Alignment
by: Huang, Heyan, et al.
Published: (2024)
BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research
by: Wang, Zifeng, et al.
Published: (2025)
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
by: Zhang, Shaolei, et al.
Published: (2025)