| Main Authors: | Trivedi, Priyansh, Schmitt, Olivier |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.20049 |
Similar Items
SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?
by: Tarshish, Noam, et al.
Published: (2026)
RedCode: Risky Code Execution and Generation Benchmark for Code Agents
by: Guo, Chengquan, et al.
Published: (2024)
Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code
by: Della Porta, Antonio, et al.
Published: (2025)
Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach
by: Wan, Yao, et al.
Published: (2024)
AIDev: Studying AI Coding Agents on GitHub
by: Li, Hao, et al.
Published: (2026)
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
by: Trivedi, Harsh, et al.
Published: (2024)
How Do Agents Perform Code Optimization? An Empirical Study
by: Peng, Huiyun, et al.
Published: (2025)
Code Review Agent Benchmark
by: Zhang, Yuntong, et al.
Published: (2026)
Theory of Code Space: Do Code Agents Understand Software Architecture?
by: Sapunov, Grigory
Published: (2026)
A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama
by: Cursaru, Vlad-Andrei, et al.
Published: (2024)
Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation
by: Chen, Le, et al.
Published: (2025)
Analyzing Message-Code Inconsistency in AI Coding Agent-Authored Pull Requests
by: Gong, Jingzhi, et al.
Published: (2026)
Code Researcher: Deep Research Agent for Large Systems Code and Commit History
by: Singh, Ramneet, et al.
Published: (2025)
Understanding Code Agent Behaviour: An Empirical Study of Success and Failure Trajectories
by: Majgaonkar, Oorja, et al.
Published: (2025)
Reflection-Driven Control for Trustworthy Code Agents
by: Wang, Bin, et al.
Published: (2025)
Workflows vs Agents for Code Translation
by: Gray, Henry, et al.
Published: (2025)
miniCodeProps: a Minimal Benchmark for Proving Code Properties
by: Lohn, Evan, et al.
Published: (2024)
Coherence Collapse: Diagnosing Why Code Agents Fail After Reaching the Right Code
by: Kim, Myeongsoo, et al.
Published: (2026)
More with Less: An Empirical Study of Turn-Control Strategies for Efficient Coding Agents
by: Gao, Pengfei, et al.
Published: (2025)
SpecAgent: A Speculative Retrieval and Forecasting Agent for Code Completion
by: Ma, George, et al.
Published: (2025)
Scaling Coding Agents via Atomic Skills
by: Ma, Yingwei, et al.
Published: (2026)
LLM Code Customization with Visual Results: A Benchmark on TikZ
by: Reux, Charly, et al.
Published: (2025)
A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement
by: Zhang, Huan, et al.
Published: (2024)
Effective LLM Code Refinement via Property-Oriented and Structurally Minimal Feedback
by: He, Lehan, et al.
Published: (2025)
Learn to Code Sustainably: An Empirical Study on LLM-based Green Code Generation
by: Vartziotis, Tina, et al.
Published: (2024)
ProcCtrlBench: Evaluating Process-Level Defects and Control Preservation in LLM Coding Agents
by: He, Jiawei, et al.
Published: (2026)
GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging
by: Ni, Ziyi, et al.
Published: (2025)
AuPair: Golden Example Pairs for Code Repair
by: Mavalankar, Aditi, et al.
Published: (2025)
RA-Gen: A Controllable Code Generation Framework Using ReAct for Multi-Agent Task Execution
by: Liu, Aofan, et al.
Published: (2025)
SEMAG: Self-Evolutionary Multi-Agent Code Generation
by: Peng, Yulin, et al.
Published: (2026)
CODESTRUCT: Code Agents over Structured Action Spaces
by: Kim, Myeongsoo, et al.
Published: (2026)
Studying Vulnerable Code Entities in R
by: Zhao, Zixiao, et al.
Published: (2024)
Devstral: Fine-tuning Language Models for Coding Agent Applications
by: Rastogi, Abhinav, et al.
Published: (2025)
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
by: d'Aloisio, Giordano, et al.
Published: (2024)
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
by: Phan, Huy Nhat, et al.
Published: (2024)
DialogAgent: An Auto-engagement Agent for Code Question Answering Data Production
by: Liang, Xiaoyun, et al.
Published: (2024)
Can Coding Agents Be General Agents?
by: Ivanov, Maksim, et al.
Published: (2026)
AI-Generated Code Is Not Reproducible (Yet): An Empirical Study of Dependency Gaps in LLM-Based Coding Agents
by: Vangala, Bhanu Prakash, et al.
Published: (2025)
A Performance Study of LLM-Generated Code on Leetcode
by: Coignion, Tristan, et al.
Published: (2024)
Automatic Building Code Review: A Case Study
by: Wan, Hanlong, et al.
Published: (2025)