:: Library Catalog

Bibliographic Details
Main Authors:	Rodriguez-Cardenas, Daniel; Velasco, Alejandro; Poshyvanyk, Denys
Format:	Preprint
Published:	2025
Subjects:	Software Engineering; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2502.07046

Similar Items

Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations
by: Palacio, David N., et al.
Published: (2024)

Toward a Theory of Causation for Interpreting Neural Code Models
by: Palacio, David N., et al.
Published: (2023)

Enabling Global, Human-Centered Explanations for LLMs: From Tokens to Interpretable Code and Test Generation
by: Khati, Dipin, et al.
Published: (2025)

How Propense Are Large Language Models at Producing Code Smells? A Benchmarking Study
by: Velasco, Alejandro, et al.
Published: (2024)

Detecting and Correcting Hallucinations in LLM-Generated Code via Deterministic AST Analysis
by: Khati, Dipin, et al.
Published: (2026)

Tricky$^2$: Towards a Benchmark for Evaluating Human and LLM Error Interactions
by: Granger, Cole, et al.
Published: (2026)

Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach
by: Yang, Hua, et al.
Published: (2025)

How Do Semantically Equivalent Code Transformations Impact Membership Inference on LLMs for Code?
by: Yang, Hua, et al.
Published: (2025)

On Interpreting the Effectiveness of Unsupervised Software Traceability with Information Theory
by: Palacio, David N., et al.
Published: (2024)

Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code?
by: Velasco, Alejandro, et al.
Published: (2024)

Toward Explaining Large Language Models in Software Engineering Tasks
by: Vitale, Antonio, et al.
Published: (2025)

Toward Neurosymbolic Program Comprehension
by: Velasco, Alejandro, et al.
Published: (2025)

Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering
by: Rodriguez-Cardenas, Daniel, et al.
Published: (2026)

On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing
by: Kovrigin, Alexander, et al.
Published: (2024)

Measuring Emergent Capabilities of LLMs for Software Engineering: How Far Are We?
by: O'Brien, Conor, et al.
Published: (2024)

Evaluating the Use of LLMs for Documentation to Code Traceability
by: Alor, Ebube, et al.
Published: (2025)

Mapping the Trust Terrain: LLMs in Software Engineering -- Insights and Perspectives
by: Khati, Dipin, et al.
Published: (2025)

Relative Positioning Based Code Chunking Method For Rich Context Retrieval In Repository Level Code Completion Task With Code Language Model
by: Rahman, Imranur, et al.
Published: (2025)

A Causal Perspective on Measuring, Explaining and Mitigating Smells in LLM-Generated Code
by: Velasco, Alejandro, et al.
Published: (2025)

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code
by: Vulićević, Jelena Ilić
Published: (2026)

LiCoEval: Evaluating LLMs on License Compliance in Code Generation
by: Xu, Weiwei, et al.
Published: (2024)

A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models
by: Wu, Yixi, et al.
Published: (2024)

On LLMs' Internal Representation of Code Correctness
by: Ribeiro, Francisco, et al.
Published: (2025)

Operational Robustness of LLMs on Code Generation
by: Paul, Debalina Ghosh, et al.
Published: (2026)

CodeTaste: Can LLMs Generate Human-Level Code Refactorings?
by: Thillen, Alex, et al.
Published: (2026)

Automating Code Adaptation for MLOps -- A Benchmarking Study on LLMs
by: Patel, Harsh, et al.
Published: (2024)

Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code
by: Galimzyanov, Timur, et al.
Published: (2024)

"Don't Be Afraid, Just Learn": Insights from Industry Practitioners to Prepare Software Engineers in the Age of Generative AI
by: Otten, Daniel, et al.
Published: (2026)

Repo2Run: Automated Building Executable Environment for Code Repository at Scale
by: Hu, Ruida, et al.
Published: (2025)

GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization
by: Wang, Juntong, et al.
Published: (2026)

DRAGON: Robust Classification for Very Large Collections of Software Repositories
by: Balla, Stefano, et al.
Published: (2026)

Synergizing LLMs and Knowledge Graphs: A Novel Approach to Software Repository-Related Question Answering
by: Abedu, Samuel, et al.
Published: (2024)

Free and Customizable Code Documentation with LLMs: A Fine-Tuning Approach
by: Chakrabarty, Sayak, et al.
Published: (2024)

LLMs in Coding and their Impact on the Commercial Software Engineering Landscape
by: Belozerov, Vladislav, et al.
Published: (2025)

Protocode: Prototype-Driven Interpretability for Code Generation in LLMs
by: Bodla, Krishna Vamshi, et al.
Published: (2025)

The Struggles of LLMs in Cross-lingual Code Clone Detection
by: Moumoula, Micheline Bénédicte, et al.
Published: (2024)

Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
by: Gong, Linyuan, et al.
Published: (2024)

Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve
by: Liu, Yuanzhe, et al.
Published: (2025)

LLM-based Content Classification Approach for GitHub Repositories by the README Files
by: Mehmood, Malik Uzair, et al.
Published: (2025)

CSR-Bench: Benchmarking LLM Agents in Deployment of Computer Science Research Repositories
by: Xiao, Yijia, et al.
Published: (2025)