| Main Authors: | Levin, Kyla H., Gwilt, Kyle, Berger, Emery D., Freund, Stephen N. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.02138 |
Similar Items
ChatDBG: Augmenting Debugging with Large Language Models
by: Levin, Kyla H., et al.
Published: (2024)
CoverUp: Effective High Coverage Test Generation for Python
by: Pizzorno, Juan Altmayer, et al.
Published: (2024)
Dynamic Stability of LLM-Generated Code
by: Rajput, Prateek, et al.
Published: (2025)
Beyond Code Pairs: Dialogue-Based Data Generation for LLM Code Translation
by: Chen, Le, et al.
Published: (2025)
Executing as You Generate: Hiding Execution Latency in LLM Code Generation
by: Sun, Zhensu, et al.
Published: (2026)
PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback
by: Peng, Yun, et al.
Published: (2024)
Benchmarking LLM Code Generation for Audio Programming with Visual Dataflow Languages
by: Zhang, William, et al.
Published: (2024)
Is Functional Correctness Enough to Evaluate Code Language Models? Exploring Diversity of Generated Codes
by: Chon, Heejae, et al.
Published: (2024)
EnvTrace: Simulation-Based Semantic Evaluation of LLM Code via Execution Trace Alignment -- Demonstrated at Synchrotron Beamlines
by: van der Vleuten, Noah, et al.
Published: (2025)
CodeIF-Bench: Evaluating Instruction-Following Capabilities of Large Language Models in Interactive Code Generation
by: Wang, Peiding, et al.
Published: (2025)
Perish or Flourish? A Holistic Evaluation of Large Language Models for Code Generation in Functional Programming
by: Lang, Nguyet-Anh H., et al.
Published: (2026)
A Preliminary Study of Multilingual Code Language Models for Code Generation Task Using Translated Benchmarks
by: Dandamudi, Rohit, et al.
Published: (2024)
Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation
by: Fang, Sen, et al.
Published: (2025)
Incoherence as Oracle-less Measure of Error in LLM-Based Code Generation
by: Valentin, Thomas, et al.
Published: (2025)
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
by: Shi, Yuling, et al.
Published: (2024)
Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support
by: Du, Alexander, et al.
Published: (2026)
Assessing GPT-4-Vision's Capabilities in UML-Based Code Generation
by: Antal, Gábor, et al.
Published: (2024)
Self-Improving Code Generation via Semantic Entropy and Behavioral Consensus
by: Zhang, Huan, et al.
Published: (2026)
SACTOR: LLM-Driven Correct and Idiomatic C to Rust Translation with Static Analysis and FFI-Based Verification
by: Zhou, Tianyang, et al.
Published: (2025)
From Code Generation to Software Testing: AI Copilot with Context-Based RAG
by: Wang, Yuchen, et al.
Published: (2025)
AutoMCQ -- Automatically Generate Code Comprehension Questions using GenAI
by: Goodfellow, Martin, et al.
Published: (2025)
What Were You Thinking? An LLM-Driven Large-Scale Study of Refactoring Motivations in Open-Source Projects
by: Robredo, Mikel, et al.
Published: (2025)
Learning to Guarantee Type Correctness in Code Generation through Type-Guided Program Synthesis
by: Huang, Zhechong, et al.
Published: (2025)
Getting Python Types Right with RightTyper
by: Pizzorno, Juan Altmayer, et al.
Published: (2025)
Reconsidering "Reconsidering Custom Memory Allocation"
by: van Kempen, Nicolas, et al.
Published: (2026)
Is Self-Repair a Silver Bullet for Code Generation?
by: Olausson, Theo X., et al.
Published: (2023)
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models
by: Chen, Simin, et al.
Published: (2024)
AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation
by: Sun, Zhensu, et al.
Published: (2024)
GitChameleon 2.0: Evaluating AI Code Generation Against Python Library Version Incompatibilities
by: Misra, Diganta, et al.
Published: (2025)
Benchmarking Large Language Models for ABAP Code Generation: An Empirical Study on Iterative Improvement by Compiler Feedback
by: Wallraven, Stephan, et al.
Published: (2026)
Agentic Code Reasoning
by: Ugare, Shubham, et al.
Published: (2026)
ECO: Enhanced Code Optimization via Performance-Aware Prompting for Code-LLMs
by: Kim, Su-Hyeon, et al.
Published: (2025)
REINFOREST: Reinforcing Semantic Code Similarity for Cross-Lingual Code Search Models
by: Saieva, Anthony, et al.
Published: (2023)
Once4All: Skeleton-Guided SMT Solver Fuzzing with LLM-Synthesized Generators
by: Sun, Maolin, et al.
Published: (2025)
Assessing Code Understanding in LLMs
by: Laneve, Cosimo, et al.
Published: (2025)
Ranking LLM-Generated Loop Invariants for Program Verification
by: Chakraborty, Saikat, et al.
Published: (2023)
AI-Mediated Code Comment Improvement
by: Dhakal, Maria, et al.
Published: (2025)
CodeMind: Evaluating Large Language Models for Code Reasoning
by: Liu, Changshu, et al.
Published: (2024)
LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface
by: Hind, Michael, et al.
Published: (2026)
AInsteinBench: Benchmarking Coding Agents on Scientific Repositories
by: Duston, Titouan, et al.
Published: (2025)