:: Library Catalog

Bibliographic Details
Main Authors:	Bjarnason, Bjarni Haukur; Silva, André; Monperrus, Martin
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence; Software Engineering
Online Access:	https://arxiv.org/abs/2602.07150

Similar Items

Supersonic: Learning to Generate Source Code Optimizations in C/C++
by: Chen, Zimin, et al.
Published: (2023)

RepairBench: Leaderboard of Frontier Models for Program Repair
by: Silva, André, et al.
Published: (2024)

Generative AI to Generate Test Data Generators
by: Baudry, Benoit, et al.
Published: (2024)

RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair
by: Silva, André, et al.
Published: (2023)

Bootstrapping Coding Agents: The Specification Is the Program
by: Monperrus, Martin
Published: (2026)

Gradient-Based Program Repair: Fixing Bugs in Continuous Program Spaces
by: Silva, André, et al.
Published: (2025)

LiCoEval: Evaluating LLMs on License Compliance in Code Generation
by: Xu, Weiwei, et al.
Published: (2024)

PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts
by: Andersson, Vivi, et al.
Published: (2025)

CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X
by: Zheng, Qinkai, et al.
Published: (2023)

ITER: Iterative Neural Repair for Multi-Location Patches
by: Ye, He, et al.
Published: (2023)

Towards Robust Agentic CUDA Kernel Benchmarking, Verification, and Optimization
by: Lange, Robert Tjarko, et al.
Published: (2025)

GeoSQL-Eval: First Evaluation of LLMs on PostGIS-Based NL2GeoSQL Queries
by: Hou, Shuyang, et al.
Published: (2025)

Scaling Test-Time Compute for Agentic Coding
by: Kim, Joongwon, et al.
Published: (2026)

AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detection
by: Charoenwet, Wachiraphan, et al.
Published: (2026)

Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study
by: Storhaug, André, et al.
Published: (2024)

mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
by: Skurla, Adam, et al.
Published: (2026)

NoFunEval: Funny How Code LMs Falter on Requirements Beyond Functional Correctness
by: Singhal, Manav, et al.
Published: (2024)

CP-Agent: Agentic Constraint Programming
by: Szeider, Stefan
Published: (2025)

AFlow: Automating Agentic Workflow Generation
by: Zhang, Jiayi, et al.
Published: (2024)

Leveraging XP and CRISP-DM for Agile Data Science Projects
by: Shimaoka, Andre Massahiro, et al.
Published: (2025)

Learning to Compose for Cross-domain Agentic Workflow Generation
by: Wang, Jialiang, et al.
Published: (2026)

Assessing Large Language Models for Automated Feedback Generation in Learning Programming Problem Solving
by: Silva, Priscylla, et al.
Published: (2025)

Are Sparse Autoencoders Useful for Java Function Bug Detection?
by: Melo, Rui, et al.
Published: (2025)

CigaR: Cost-efficient Program Repair with LLMs
by: Hidvégi, Dávid, et al.
Published: (2024)

Adaptive Detection of Software Aging under Workload Shift
by: Silva, Rafael Jose Moura, et al.
Published: (2025)

CodeTaste: Can LLMs Generate Human-Level Code Refactorings?
by: Thillen, Alex, et al.
Published: (2026)

David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design?
by: Shankar, Shashwat, et al.
Published: (2025)

SWT-Bench: Testing and Validating Real-World Bug-Fixes with Code Agents
by: Mündler, Niels, et al.
Published: (2024)

The Unreasonable Effectiveness of Open Science in AI: A Replication Study
by: Gundersen, Odd Erik, et al.
Published: (2024)

GPU Kernel Scientist: An LLM-Driven Framework for Iterative Kernel Optimization
by: Andrews, Martin, et al.
Published: (2025)

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation
by: Kozyrev, Andrei, et al.
Published: (2025)

Adaptation of XAI to Auto-tuning for Numerical Libraries
by: Aoki, Shota, et al.
Published: (2024)

Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
by: Dai, Hankun, et al.
Published: (2025)

From Legacy Fortran to Portable Kokkos: An Autonomous Agentic AI Workflow
by: Gupta, Sparsh, et al.
Published: (2025)

Enhancing LLM-Based Test Generation by Eliminating Covered Code
by: Xu, WeiZhe, et al.
Published: (2026)

Can LLMs Reason Like Automated Theorem Provers for Rust Verification? VCoT-Bench: Evaluating via Verification Chain of Thought
by: Xie, Zichen, et al.
Published: (2026)

R-LAM: Reproducibility-Constrained Large Action Models for Scientific Workflow Automation
by: Sureshkumar, Suriya
Published: (2026)

Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis
by: Deshpande, Darshan, et al.
Published: (2026)

The Causal Impact of Tool Affordance on Safety Alignment in LLM Agents
by: Yu, Shasha, et al.
Published: (2026)

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code
by: Vulićević, Jelena Ilić
Published: (2026)