:: Library Catalog

Bibliographic Details
Main Authors:	Sahoo, Devanshu; Majhi, Vasudev; Neekhra, Arjun; Sinha, Yash; Mandal, Murari; Kumar, Dhruv
Format:	Preprint
Published:	2025
Subjects:	Software Engineering; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.10415

Similar Items

The Compliance Paradox: Semantic-Instruction Decoupling in Automated Academic Code Evaluation
by: Sahoo, Devanshu, et al.
Published: (2026)

When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection
by: Sahoo, Devanshu, et al.
Published: (2025)

Your Build Scripts Stink: The State of Code Smells in Build Scripts
by: Tamanna, Mahzabin, et al.
Published: (2025)

Smoke and Mirrors: Jailbreaking LLM-based Code Generation via Implicit Malicious Prompts
by: Ouyang, Sheng, et al.
Published: (2025)

Fuzzing with Agents? Generators Are All You Need
by: Vikram, Vasudev, et al.
Published: (2026)

The Hidden Cost of Readability: How Code Formatting Silently Consumes Your LLM Budget
by: Pan, Dangfeng, et al.
Published: (2025)

A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How
by: Wang, Chaozheng, et al.
Published: (2024)

Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval
by: Wang, Jiexin, et al.
Published: (2024)

Leveraging Large Language Models to Improve REST API Testing
by: Kim, Myeongsoo, et al.
Published: (2023)

Grounded AI for Code Review: Resource-Efficient Large-Model Serving in Enterprise Pipelines
by: Mandal, Sayan, et al.
Published: (2025)

On the Freshness of Pinned Dependencies in Maven
by: Vikram, Vasudev, et al.
Published: (2025)

Modeling and Recovering Hierarchical Structural Architectures of ROS 2 Systems from Code and Launch Configurations using LLM-based Agents
by: Benchat, Mohamed, et al.
Published: (2026)

Context-Aware CodeLLM Eviction for AI-assisted Coding
by: Thangarajah, Kishanthan, et al.
Published: (2025)

CR-Bench: Evaluating the Real-World Utility of AI Code Review Agents
by: Pereira, Kristen, et al.
Published: (2026)

Rubric Is All You Need: Enhancing LLM-based Code Evaluation With Question-Specific Rubrics
by: Pathak, Aditya, et al.
Published: (2025)

Evaluating Large Language Models for Functional and Maintainable Code in Industrial Settings: A Case Study at ASML
by: Mundhra, Yash, et al.
Published: (2025)

CodeArena: A Collective Evaluation Platform for LLM Code Generation
by: Du, Mingzhe, et al.
Published: (2025)

Intention is All You Need: Refining Your Code from Your Intention
by: Guo, Qi, et al.
Published: (2025)

LLM-as-a-Judge for Human-AI Co-Creation: A Reliability-Aware Evaluation Framework for Coding
by: Amin, Md Faizul Ibne, et al.
Published: (2026)

Can Old Tests Do New Tricks for Resolving SWE Issues?
by: Chen, Yang, et al.
Published: (2025)

WIP: Leveraging LLMs for Enforcing Design Principles in Student Code: Analysis of Prompting Strategies and RAG
by: Kolhatkar, Dhruv, et al.
Published: (2025)

These Aren't the Reviews You're Looking For: How Humans Review AI-Generated Pull Requests
by: Duma, Kacper, et al.
Published: (2026)

Enhancing LLM Code Generation: A Systematic Evaluation of Multi-Agent Collaboration and Runtime Debugging for Improved Accuracy, Reliability, and Latency
by: Ashrafi, Nazmus, et al.
Published: (2025)

Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing
by: Dhruv, Akash, et al.
Published: (2024)

Can Large Language Models Write Good Property-Based Tests?
by: Vikram, Vasudev, et al.
Published: (2023)

Unit Test Generation using Generative AI: A Comparative Performance Analysis of Autogeneration Tools
by: Bhatia, Shreya, et al.
Published: (2023)

Adding New Capability in Existing Scientific Application with LLM Assistance
by: Dubey, Anshu, et al.
Published: (2025)

"Your AI, My Shell": Demystifying Prompt Injection Attacks on Agentic AI Coding Editors
by: Liu, Yue, et al.
Published: (2025)

Comment Traps: How Defective Commented-out Code Augment Defects in AI-Assisted Code Generation
by: Huang, Yuan, et al.
Published: (2025)

"I Would Have Written My Code Differently": Beginners Struggle to Understand LLM-Generated Code
by: Zi, Yangtian, et al.
Published: (2025)

Evaluating LLM-Generated Code: A Benchmark and Developer Study
by: Szych, Joanna, et al.
Published: (2026)

Copilot Arena: A Platform for Code LLM Evaluation in the Wild
by: Chi, Wayne, et al.
Published: (2025)

Evaluating Efficiency and Novelty of LLM-Generated Code for Graph Analysis
by: Nia, Atieh Barati, et al.
Published: (2025)

TRACE: Evaluating Execution Efficiency of LLM-Based Code Translation
by: Gong, Zhihao, et al.
Published: (2026)

TRACE: Evaluating Execution Efficiency of LLM-Based Code Translation
by: Gong, Zhihao, et al.
Published: (2025)

A Survey of Code Review Benchmarks and Evaluation Practices in Pre-LLM and LLM Era
by: Khan, Taufiqul Islam, et al.
Published: (2026)

Gendered Prompting and LLM Code Review: How Gender Cues in the Prompt Shape Code Quality and Evaluation
by: Janzen, Lynn, et al.
Published: (2026)

How to Compare the Security of Code Written by Humans to LLM-generated Code
by: Balebako, Rebecca, et al.
Published: (2026)

Code Roulette: How Prompt Variability Affects LLM Code Generation
by: Paleyes, Andrei, et al.
Published: (2025)

Code Review Automation Via Multi-task Federated LLM -- An Empirical Study
by: Kumar, Jahnavi, et al.
Published: (2024)