| Main Authors: | Yan, Shuo, Li, Ruochen, Luo, Ziming, Wang, Zimu, Li, Daoyang, Jing, Liqiang, He, Kaiyu, Wu, Peilin, Michalopoulos, George, Zhang, Yue, Zhang, Ziyang, Zhang, Mian, Chen, Zhiyu, Du, Xinya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.17335 |
Similar Items
Is Grokking Worthwhile? Functional Analysis and Transferability of Generalization Circuits in Transformers
by: He, Kaiyu, et al.
Published: (2026)
IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction
by: He, Kaiyu, et al.
Published: (2024)
GEAR: A General Evaluation Framework for Abductive Reasoning
by: He, Kaiyu, et al.
Published: (2025)
HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
by: Wu, Peilin, et al.
Published: (2025)
Search Wisely: Mitigating Sub-optimal Agentic Searches By Reducing Uncertainty
by: Wu, Peilin, et al.
Published: (2025)
DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale
by: Zhang, Linghao, et al.
Published: (2025)
RepDL: Bit-level Reproducible Deep Learning Training and Inference
by: Xie, Peichen, et al.
Published: (2025)
KOCO-BENCH: Can Large Language Models Leverage Domain Knowledge in Software Development?
by: Jiang, Xue, et al.
Published: (2026)
Managing Uncertainty in LLM-based Multi-Agent System Operation
by: Zhang, Man, et al.
Published: (2026)
EffiSkill: Agent Skill Based Automated Code Efficiency Optimization
by: Wang, Zimu, et al.
Published: (2026)
FLOW-BENCH: Towards Conversational Generation of Enterprise Workflows
by: Duesterwald, Evelyn, et al.
Published: (2025)
LogiAgent: Automated Logical Testing for REST Systems with LLM-Based Multi-Agents
by: Zhang, Ke, et al.
Published: (2025)
Vulseye: Detect Smart Contract Vulnerabilities via Stateful Directed Graybox Fuzzing
by: Liang, Ruichao, et al.
Published: (2024)
GraphCodeAgent: Dual Graph-Guided LLM Agent for Retrieval-Augmented Repo-Level Code Generation
by: Li, Jia, et al.
Published: (2025)
AgentHub: A Registry for Discoverable, Verifiable, and Reproducible AI Agents
by: Pautsch, Erik, et al.
Published: (2025)
SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?
by: Chen, Shiqi, et al.
Published: (2026)
COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis
by: Yang, Weiqing, et al.
Published: (2024)
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences
by: Wang, Qihao, et al.
Published: (2026)
Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach
by: Mao, Zhenyu, et al.
Published: (2025)
GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
by: Ni, Ziyi, et al.
Published: (2025)
AgentFM: Role-Aware Failure Management for Distributed Databases with LLM-Driven Multi-Agents
by: Zhang, Lingzhe, et al.
Published: (2025)
Temac: Multi-Agent Collaboration for Automated Web GUI Testing
by: Liu, Chenxu, et al.
Published: (2025)
Environment-in-the-Loop: Rethinking Code Migration with LLM-based Agents
by: Li, Xiang, et al.
Published: (2026)
Benchmarking and Studying the LLM-based Agent System in End-to-End Software Development
by: Zeng, Zhengran, et al.
Published: (2025)
Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements
by: Li, Yue, et al.
Published: (2026)
Are They All Good? Evaluating the Quality of CoTs in LLM-based Code Generation
by: Zhang, Binquan, et al.
Published: (2025)
SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
by: Guo, Zihan, et al.
Published: (2026)
Can Coding Agents Reproduce Findings in Computational Materials Science?
by: Huang, Ziyang, et al.
Published: (2026)
CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System
by: Hu, Li, et al.
Published: (2025)
QLPro: Automated Code Vulnerability Discovery via LLM and Static Code Analysis Integration
by: Hu, Junze, et al.
Published: (2025)
An Empirical Study of Bugs in Modern LLM Agent Frameworks
by: Zhu, Xinxue, et al.
Published: (2026)
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
by: Li, Ziyang, et al.
Published: (2024)
Crystal: Illuminating LLM Abilities on Language and Code
by: Tao, Tianhua, et al.
Published: (2024)
Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications
by: Ma, Wei, et al.
Published: (2024)
LLM Collaboration With Multi-Agent Reinforcement Learning
by: Liu, Shuo, et al.
Published: (2025)
VulnResolver: A Hybrid Agent Framework for LLM-Based Automated Vulnerability Issue Resolution
by: Zhang, Mingming, et al.
Published: (2026)
PathFuzzing: Worst Case Analysis by Fuzzing Symbolic-Execution Paths
by: Chen, Zimu, et al.
Published: (2025)
What Makes a Good LLM Agent for Real-world Penetration Testing?
by: Deng, Gelei, et al.
Published: (2026)
InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration
by: Wang, Yunkun, et al.
Published: (2025)
Sphinx: Benchmarking and Modeling for LLM-Driven Pull Request Review
by: Zhang, Daoan, et al.
Published: (2026)