:: Library Catalog

Bibliographic Details
Main Authors:	Yan, Shuo; Li, Ruochen; Luo, Ziming; Wang, Zimu; Li, Daoyang; Jing, Liqiang; He, Kaiyu; Wu, Peilin; Michalopoulos, George; Zhang, Yue; Zhang, Ziyang; Zhang, Mian; Chen, Zhiyu; Du, Xinya
Format:	Preprint
Published:	2025
Subjects:	Software Engineering; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.17335

Similar Items

Is Grokking Worthwhile? Functional Analysis and Transferability of Generalization Circuits in Transformers
by: He, Kaiyu, et al.
Published: (2026)

IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction
by: He, Kaiyu, et al.
Published: (2024)

GEAR: A General Evaluation Framework for Abductive Reasoning
by: He, Kaiyu, et al.
Published: (2025)

HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
by: Wu, Peilin, et al.
Published: (2025)

Search Wisely: Mitigating Sub-optimal Agentic Searches By Reducing Uncertainty
by: Wu, Peilin, et al.
Published: (2025)

DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale
by: Zhang, Linghao, et al.
Published: (2025)

RepDL: Bit-level Reproducible Deep Learning Training and Inference
by: Xie, Peichen, et al.
Published: (2025)

KOCO-BENCH: Can Large Language Models Leverage Domain Knowledge in Software Development?
by: Jiang, Xue, et al.
Published: (2026)

Managing Uncertainty in LLM-based Multi-Agent System Operation
by: Zhang, Man, et al.
Published: (2026)

EffiSkill: Agent Skill Based Automated Code Efficiency Optimization
by: Wang, Zimu, et al.
Published: (2026)

FLOW-BENCH: Towards Conversational Generation of Enterprise Workflows
by: Duesterwald, Evelyn, et al.
Published: (2025)

LogiAgent: Automated Logical Testing for REST Systems with LLM-Based Multi-Agents
by: Zhang, Ke, et al.
Published: (2025)

Vulseye: Detect Smart Contract Vulnerabilities via Stateful Directed Graybox Fuzzing
by: Liang, Ruichao, et al.
Published: (2024)

GraphCodeAgent: Dual Graph-Guided LLM Agent for Retrieval-Augmented Repo-Level Code Generation
by: Li, Jia, et al.
Published: (2025)

AgentHub: A Registry for Discoverable, Verifiable, and Reproducible AI Agents
by: Pautsch, Erik, et al.
Published: (2025)

SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?
by: Chen, Shiqi, et al.
Published: (2026)

COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis
by: Yang, Weiqing, et al.
Published: (2024)

MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences
by: Wang, Qihao, et al.
Published: (2026)

Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach
by: Mao, Zhenyu, et al.
Published: (2025)

GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
by: Ni, Ziyi, et al.
Published: (2025)

AgentFM: Role-Aware Failure Management for Distributed Databases with LLM-Driven Multi-Agents
by: Zhang, Lingzhe, et al.
Published: (2025)

Temac: Multi-Agent Collaboration for Automated Web GUI Testing
by: Liu, Chenxu, et al.
Published: (2025)

Environment-in-the-Loop: Rethinking Code Migration with LLM-based Agents
by: Li, Xiang, et al.
Published: (2026)

Benchmarking and Studying the LLM-based Agent System in End-to-End Software Development
by: Zeng, Zhengran, et al.
Published: (2025)

Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements
by: Li, Yue, et al.
Published: (2026)

Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation
by: Zhang, Binquan, et al.
Published: (2025)

SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
by: Guo, Zihan, et al.
Published: (2026)

Can Coding Agents Reproduce Findings in Computational Materials Science?
by: Huang, Ziyang, et al.
Published: (2026)

CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System
by: Hu, Li, et al.
Published: (2025)

QLPro: Automated Code Vulnerability Discovery via LLM and Static Code Analysis Integration
by: Hu, Junze, et al.
Published: (2025)

An Empirical Study of Bugs in Modern LLM Agent Frameworks
by: Zhu, Xinxue, et al.
Published: (2026)

IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
by: Li, Ziyang, et al.
Published: (2024)

Crystal: Illuminating LLM Abilities on Language and Code
by: Tao, Tianhua, et al.
Published: (2024)

Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications
by: Ma, Wei, et al.
Published: (2024)

LLM Collaboration With Multi-Agent Reinforcement Learning
by: Liu, Shuo, et al.
Published: (2025)

VulnResolver: A Hybrid Agent Framework for LLM-Based Automated Vulnerability Issue Resolution
by: Zhang, Mingming, et al.
Published: (2026)

PathFuzzing: Worst Case Analysis by Fuzzing Symbolic-Execution Paths
by: Chen, Zimu, et al.
Published: (2025)

What Makes a Good LLM Agent for Real-world Penetration Testing?
by: Deng, Gelei, et al.
Published: (2026)

InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration
by: Wang, Yunkun, et al.
Published: (2025)

Sphinx: Benchmarking and Modeling for LLM-Driven Pull Request Review
by: Zhang, Daoan, et al.
Published: (2026)