| Main Author: | Sharma, Asankhaya |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.18521 |
Similar Items
Patched RTC: evaluating LLMs for diverse software development tasks
by: Sharma, Asankhaya
Published: (2024)
PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software
by: Li, Kaixuan, et al.
Published: (2024)
Agentic AI-assisted coding offers a unique opportunity to instill epistemic grounding during software development
by: Palmblad, Magnus, et al.
Published: (2026)
Private GPTs for LLM-driven testing in software development and machine learning
by: Jagielski, Jakub, et al.
Published: (2025)
The importance of visual modelling languages in generative software engineering
by: Rossi, Roberto
Published: (2024)
RiskBridge: Turning CVEs into Business-Aligned Patch Priorities
by: Sheikh, Yelena Mujibur, et al.
Published: (2026)
PELLI: Framework to effectively integrate LLMs for quality software generation
by: Krebs, Rasmus, et al.
Published: (2026)
From Patches to Trajectories: Privileged Process Supervision for Software-Engineering Agents
by: Ma, Murong, et al.
Published: (2026)
Learning From Developers: Towards Reliable Patch Validation at Scale for Linux
by: Lin, Chih-En, et al.
Published: (2026)
Assessing LLM code generation quality through path planning tasks
by: Chen, Wanyi, et al.
Published: (2025)
Exploring the extent of similarities in software failures across industries using LLMs
by: Detloff, Martin
Published: (2024)
Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles
by: Vinh, Nguyen Phu, et al.
Published: (2025)
PyBench: Evaluating LLM Agent on various real-world coding tasks
by: Zhang, Yaolun, et al.
Published: (2024)
Artificial intelligence for context-aware visual change detection in software test automation
by: Moradi, Milad, et al.
Published: (2024)
Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments
by: Bolton, Regan, et al.
Published: (2025)
Memory-Efficient Large Language Models for Program Repair with Semantic-Guided Patch Generation
by: Le-Cong, Thanh, et al.
Published: (2024)
InfCode: Adversarial Iterative Refinement of Tests and Patches for Reliable Software Issue Resolution
by: Li, KeFan, et al.
Published: (2025)
Co-PatcheR: Collaborative Software Patching with Component(s)-specific Small Reasoning Models
by: Tang, Yuheng, et al.
Published: (2025)
VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching
by: Zibaeirad, Arastoo, et al.
Published: (2024)
Automating Patch Set Generation from Code Review Comments Using Large Language Models
by: Rahman, Tajmilur, et al.
Published: (2024)
CodeXEmbed: A Generalist Embedding Model Family for Multiligual and Multi-task Code Retrieval
by: Liu, Ye, et al.
Published: (2024)
CMSA algorithm for solving the prioritized pairwise test data generation problem in software product lines
by: Ferrer, Javier, et al.
Published: (2024)
Sentiment analysis for software engineering: How far can zero-shot learning (ZSL) go?
by: Alfayez, Reem, et al.
Published: (2026)
RePaCA: Leveraging Reasoning Large Language Models for Static Automated Patch Correctness Assessment
by: Fuster-Pena, Marcos, et al.
Published: (2025)
CoCo-Bench: A Comprehensive Code Benchmark For Multi-task Large Language Model Evaluation
by: Yin, Wenjing, et al.
Published: (2025)
Evaluating Software Development Agents: Patch Patterns, Code Quality, and Issue Complexity in Real-World GitHub Scenarios
by: Chen, Zhi, et al.
Published: (2024)
An explainable hybrid deep learning-enabled intelligent fault detection and diagnosis approach for automotive software systems validation
by: Abboush, Mohammad, et al.
Published: (2026)
Machine Learning Experiences: A story of learning AI for use in enterprise software testing that can be used by anyone
by: Cohoon, Michael, et al.
Published: (2025)
ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files
by: Sharma, Reshabh K
Published: (2026)
Why you shouldn't fully trust ChatGPT: A synthesis of this AI tool's error rates across disciplines and the software engineering lifecycle
by: Garousi, Vahid
Published: (2025)
SpaceX: Exploring metrics with the SPACE model for developer productivity
by: Kaul, Sanchit, et al.
Published: (2025)
Evaluating LLMs for One-Shot Patching of Real and Artificial Vulnerabilities
by: Garg, Aayush, et al.
Published: (2025)
Tu(r)ning AI Green: Exploring Energy Efficiency Cascading with Orthogonal Optimizations
by: Rajput, Saurabhsingh, et al.
Published: (2025)
DevLicOps: A Framework for Mitigating Licensing Risks in AI-Generated Code
by: Sharma, Pratyush Nidhi, et al.
Published: (2025)
Willful Disobedience: Automatically Detecting Failures in Agentic Traces
by: Sharma, Reshabh K, et al.
Published: (2026)
Learning Correct Behavior from Examples: Validating Sequential Execution in Autonomous Agents
by: Sharma, Reshabh K, et al.
Published: (2026)
Repository-Level Graph Representation Learning for Enhanced Security Patch Detection
by: Wen, Xin-Cheng, et al.
Published: (2024)
Canonical Intermediate Representation for LLM-based optimization problem formulation and code generation
by: Lyu, Zhongyuan, et al.
Published: (2026)
ML-Dev-Bench: Comparative Analysis of AI Agents on ML development workflows
by: Padigela, Harshith, et al.
Published: (2025)
Empirical Study of Code Large Language Models for Binary Security Patch Detection
by: Li, Qingyuan, et al.
Published: (2025)