| Main Authors: | Bjarnason, Bjarni Haukur, Silva, André, Monperrus, Martin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.07150 |
Similar Items
Supersonic: Learning to Generate Source Code Optimizations in C/C++
by: Chen, Zimin, et al.
Published: (2023)
RepairBench: Leaderboard of Frontier Models for Program Repair
by: Silva, André, et al.
Published: (2024)
Generative AI to Generate Test Data Generators
by: Baudry, Benoit, et al.
Published: (2024)
RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair
by: Silva, André, et al.
Published: (2023)
Bootstrapping Coding Agents: The Specification Is the Program
by: Monperrus, Martin
Published: (2026)
Gradient-Based Program Repair: Fixing Bugs in Continuous Program Spaces
by: Silva, André, et al.
Published: (2025)
LiCoEval: Evaluating LLMs on License Compliance in Code Generation
by: Xu, Weiwei, et al.
Published: (2024)
PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts
by: Andersson, Vivi, et al.
Published: (2025)
CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
by: Zheng, Qinkai, et al.
Published: (2023)
ITER: Iterative Neural Repair for Multi-Location Patches
by: Ye, He, et al.
Published: (2023)
Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
by: Lange, Robert Tjarko, et al.
Published: (2025)
GeoSQL-Eval: First Evaluation of LLMs on PostGIS-Based NL2GeoSQL Queries
by: Hou, Shuyang, et al.
Published: (2025)
Scaling Test-Time Compute for Agentic Coding
by: Kim, Joongwon, et al.
Published: (2026)
AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detection
by: Charoenwet, Wachiraphan, et al.
Published: (2026)
Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study
by: Storhaug, André, et al.
Published: (2024)
mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
by: Skurla, Adam, et al.
Published: (2026)
NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness
by: Singhal, Manav, et al.
Published: (2024)
CP-Agent: Agentic Constraint Programming
by: Szeider, Stefan
Published: (2025)
AFlow: Automating Agentic Workflow Generation
by: Zhang, Jiayi, et al.
Published: (2024)
Leveraging XP and CRISP-DM for Agile Data Science Projects
by: Shimaoka, Andre Massahiro, et al.
Published: (2025)
Learning to Compose for Cross-domain Agentic Workflow Generation
by: Wang, Jialiang, et al.
Published: (2026)
Assessing Large Language Models for Automated Feedback Generation in Learning Programming Problem Solving
by: Silva, Priscylla, et al.
Published: (2025)
Are Sparse Autoencoders Useful for Java Function Bug Detection?
by: Melo, Rui, et al.
Published: (2025)
CigaR: Cost-efficient Program Repair with LLMs
by: Hidvégi, Dávid, et al.
Published: (2024)
Adaptive Detection of Software Aging under Workload Shift
by: Silva, Rafael Jose Moura, et al.
Published: (2025)
CodeTaste: Can LLMs Generate Human-Level Code Refactorings?
by: Thillen, Alex, et al.
Published: (2026)
David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design?
by: Shankar, Shashwat, et al.
Published: (2025)
SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
by: Mündler, Niels, et al.
Published: (2024)
The Unreasonable Effectiveness of Open Science in AI: A Replication Study
by: Gundersen, Odd Erik, et al.
Published: (2024)
GPU Kernel Scientist: An LLM-Driven Framework for Iterative Kernel Optimization
by: Andrews, Martin, et al.
Published: (2025)
RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation
by: Kozyrev, Andrei, et al.
Published: (2025)
Adaptation of XAI to Auto-tuning for Numerical Libraries
by: Aoki, Shota, et al.
Published: (2024)
Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
by: Dai, Hankun, et al.
Published: (2025)
From Legacy Fortran to Portable Kokkos: An Autonomous Agentic AI Workflow
by: Gupta, Sparsh, et al.
Published: (2025)
Enhancing LLM-Based Test Generation by Eliminating Covered Code
by: Xu, WeiZhe, et al.
Published: (2026)
Can LLMs Reason Like Automated Theorem Provers for Rust Verification? VCoT-Bench: Evaluating via Verification Chain of Thought
by: Xie, Zichen, et al.
Published: (2026)
R-LAM: Reproducibility-Constrained Large Action Models for Scientific Workflow Automation
by: Sureshkumar, Suriya
Published: (2026)
Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis
by: Deshpande, Darshan, et al.
Published: (2026)
The Causal Impact of Tool Affordance on Safety Alignment in LLM Agents
by: Yu, Shasha, et al.
Published: (2026)
An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code
by: Vulićević, Jelena Ilić
Published: (2026)