| Main Authors: | Ding, Peng, Stevens, Rick |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.21405 |
Similar Items
ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs
by: Ding, Peng, et al.
Published: (2025)
TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories
by: Dong, Honghua, et al.
Published: (2025)
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
by: Misra, Diganta, et al.
Published: (2025)
Phyelds: A Pythonic Framework for Aggregate Computing
by: Aguzzi, Gianluca, et al.
Published: (2026)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback
by: Peng, Yun, et al.
Published: (2024)
SACTOR: LLM-Driven Correct and Idiomatic C to Rust Translation with Static Analysis and FFI-Based Verification
by: Zhou, Tianyang, et al.
Published: (2025)
Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation
by: Chen, Le, et al.
Published: (2025)
dafny-annotator: AI-Assisted Verification of Dafny Programs
by: Poesia, Gabriel, et al.
Published: (2024)
AI-Assisted Fixes to Code Review Comments at Scale
by: Maddila, Chandra, et al.
Published: (2025)
Towards Neural Synthesis for SMT-Assisted Proof-Oriented Programming
by: Chakraborty, Saikat, et al.
Published: (2024)
Fine-Tuning Multilingual Language Models for Code Review: An Empirical Study on Industrial C# Projects
by: Begolli, Igli, et al.
Published: (2025)
Benchmarking Large Language Models for ABAP Code Generation: An Empirical Study on Iterative Improvement by Compiler Feedback
by: Wallraven, Stephan, et al.
Published: (2026)
LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface
by: Hind, Michael, et al.
Published: (2026)
Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support
by: Du, Alexander, et al.
Published: (2026)
Dynamic Stability of LLM-Generated Code
by: Rajput, Prateek, et al.
Published: (2025)
Is Functional Correctness Enough to Evaluate Code Language Models? Exploring Diversity of Generated Codes
by: Chon, Heejae, et al.
Published: (2024)
LecPrompt: A Prompt-based Approach for Logical Error Correction with CodeBERT
by: Xu, Zhenyu, et al.
Published: (2024)
Learning to Guarantee Type Correctness in Code Generation through Type-Guided Program Synthesis
by: Huang, Zhechong, et al.
Published: (2025)
Effective LLM-Driven Code Generation with Pythoness
by: Levin, Kyla H., et al.
Published: (2025)
OBsmith: LLM-Powered JavaScript Obfuscator Testing
by: Jiang, Shan, et al.
Published: (2025)
Agentic Interpretation: Lattice-Structured Evidence for LLM-Based Program Analysis
by: Mitchell, Jacqueline L., et al.
Published: (2026)
Executing as You Generate: Hiding Execution Latency in LLM Code Generation
by: Sun, Zhensu, et al.
Published: (2026)
A Declarative Language for Building And Orchestrating LLM-Powered Agent Workflows
by: Daunis, Ivan
Published: (2025)
Blueprint First, Model Second: A Framework for Deterministic LLM Workflow
by: Qiu, Libin, et al.
Published: (2025)
Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation
by: Fang, Sen, et al.
Published: (2025)
Once4All: Skeleton-Guided SMT Solver Fuzzing with LLM-Synthesized Generators
by: Sun, Maolin, et al.
Published: (2025)
BuildBench: Benchmarking LLM Agents on Compiling Real-World Open-Source Software
by: Zhang, Zehua, et al.
Published: (2025)
An Empirical Study on the Performance and Energy Usage of Compiled Python Code
by: Stoico, Vincenzo, et al.
Published: (2025)
ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs
by: Kim, Su-Hyeon, et al.
Published: (2025)
Herb.jl: A Unifying Program Synthesis Library
by: Hinnerichs, Tilman, et al.
Published: (2025)
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
by: Shi, Yuling, et al.
Published: (2024)
What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects
by: Robredo, Mikel, et al.
Published: (2025)
A Taxonomy of Prompt Defects in LLM Systems
by: Tian, Haoye, et al.
Published: (2025)
EnvTrace: Simulation-Based Semantic Evaluation of LLM Code via Execution Trace Alignment -- Demonstrated at Synchrotron Beamlines
by: van der Vleuten, Noah, et al.
Published: (2025)
Evaluating the Performance of Large Language Models in Competitive Programming: A Multi-Year, Multi-Grade Analysis
by: Dumitran, Adrian Marius, et al.
Published: (2024)
AbstractBeam: Enhancing Bottom-Up Program Synthesis using Library Learning
by: Zenkner, Janis, et al.
Published: (2024)
Ranking LLM-Generated Loop Invariants for Program Verification
by: Chakraborty, Saikat, et al.
Published: (2023)
Benchmarking LLM Code Generation for Audio Programming with Visual Dataflow Languages
by: Zhang, William, et al.
Published: (2024)
On the Effectiveness of Machine Learning-based Call Graph Pruning: An Empirical Study
by: Mir, Amir M., et al.
Published: (2024)
Functional Programming Paradigm of Python for Scientific Computation Pipeline Integration
by: Zhang, Chen, et al.
Published: (2024)