:: Library Catalog

Bibliographic Details
Main Authors:	Haque, Md Nazmul; Lin, Elizabeth; Arkoh, Lawrence; Tadesse, Biruk; Xu, Bowen
Format:	Preprint
Published:	2025
Subjects:	Software Engineering
Online Access:	https://arxiv.org/abs/2512.08213

Similar Items

How Quantization Impacts Privacy Risk on LLMs for Code?
by: Haque, Md Nazmul, et al.
Published: (2025)

Where are the Hidden Gems? Applying Transformer Models for Design Discussion Detection
by: Arkoh, Lawrence, et al.
Published: (2026)

How Do Semantically Equivalent Code Transformations Impact Membership Inference on LLMs for Code?
by: Yang, Hua, et al.
Published: (2025)

Your ATs to Ts: MITRE ATT&CK Attack Technique to P-SSCRM Task Mapping
by: Hamer, Sivana, et al.
Published: (2025)

Capturing the Effects of Quantization on Trojans in Code LLMs
by: Hussain, Aftab, et al.
Published: (2025)

Closing the Chain: How to reduce your risk of being SolarWinds, Log4j, or XZ Utils
by: Hamer, Sivana, et al.
Published: (2025)

Beyond Single Reports: Evaluating Automated ATT&CK Technique Extraction in Multi-Report Campaign Settings
by: Haque, Md Nazmul, et al.
Published: (2026)

LLMs: A Game-Changer for Software Engineers?
by: Haque, Md Asraful
Published: (2024)

Assessing the Bug-Proneness of Refactored Code: A Longitudinal Multi-Project Study
by: Ferreira, Isabella, et al.
Published: (2025)

Relating Complexity, Explicitness, Effectiveness of Refactorings and Non-Functional Requirements: A Replication Study
by: Soares, Vinícius, et al.
Published: (2025)

Code Quality Analysis of Translations from C to Rust
by: Tadesse, Biruk, et al.
Published: (2026)

CIgrate: Automating CI Service Migration with Large Language Models
by: Hossain, Md Nazmul, et al.
Published: (2025)

SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations
by: Tonon, Andrea, et al.
Published: (2024)

Smaller = Weaker? Benchmarking Robustness of Quantized LLMs in Code Generation
by: Fang, Sen, et al.
Published: (2025)

SOK: Exploring Hallucinations and Security Risks in AI-Assisted Software Development with Insights for LLM Deployment
by: Haque, Ariful, et al.
Published: (2025)

Is Quantization a Deal-breaker? Empirical Insights from Large Code Models
by: Afrin, Saima, et al.
Published: (2025)

We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
by: Spracklen, Joseph, et al.
Published: (2024)

CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification
by: Tian, Yuchen, et al.
Published: (2024)

Hallucination to Consensus: Multi-Agent LLMs for End-to-End JUnit Test Generation
by: Xu, Qinghua, et al.
Published: (2025)

A Systematic Literature Review of Parameter-Efficient Fine-Tuning for Large Code Models
by: Afrin, Saima, et al.
Published: (2025)

Parameter-Efficient Multi-Task Fine-Tuning in Code-Related Tasks
by: Haque, Md Zahidul, et al.
Published: (2026)

Assertion Messages with Large Language Models (LLMs) for Code
by: Aljohani, Ahmed, et al.
Published: (2025)

RACONTEUR: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer
by: Deng, Jiangyi, et al.
Published: (2024)

Understanding npm Developers' Practices, Challenges, and Recommendations for Secure Package Development
by: Peruma, Anthony, et al.
Published: (2026)

Mind the Gap: Evaluating LLMs for High-Level Malicious Package Detection vs. Fine-Grained Indicator Identification
by: Ryan, Ahmed, et al.
Published: (2026)

Using LLMs for Security Advisory Investigations: How Far Are We?
by: Abdullah, Bayu Fedra, et al.
Published: (2025)

PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring
by: Liu, Xiting, et al.
Published: (2026)

Mapping of the system of software-related emissions and shared responsibilities
by: Partanen, Laura, et al.
Published: (2025)

HFuzzer: Testing Large Language Models for Package Hallucinations via Phrase-based Fuzzing
by: Zhao, Yukai, et al.
Published: (2025)

Securing the Software Package Supply Chain for Critical Systems
by: Murali, Ritwik, et al.
Published: (2025)

Code Comprehension with GitHub Copilot: Performance Gains, Comprehension Trade-offs, and Behavioral Predictors in Brownfield Programming
by: Qiao, Yunhan, et al.
Published: (2025)

The Devil Is in the Command Line: Associating the Compiler Flags With the Binary and Build Metadata
by: Kudrjavets, Gunnar, et al.
Published: (2023)

Towards Mitigating API Hallucination in Code Generated by LLMs with Hierarchical Dependency Aware
by: Chen, Yujia, et al.
Published: (2025)

Detecting Malicious Source Code in PyPI Packages with LLMs: Does RAG Come in Handy?
by: Ibiyo, Motunrayo, et al.
Published: (2025)

Benchmarking ChatGPT, Codeium, and GitHub Copilot: A Comparative Study of AI-Driven Programming and Debugging Assistants
by: Ovi, Md Sultanul Islam, et al.
Published: (2024)

PyGen: A Collaborative Human-AI Approach to Python Package Creation
by: Barua, Saikat, et al.
Published: (2024)

ShellFuzzer: Grammar-based Fuzzing of Shell Interpreters
by: Felici, Riccardo, et al.
Published: (2024)

The Popularity Hypothesis in Software Security: A Large-Scale Replication with PHP Packages
by: Ruohonen, Jukka, et al.
Published: (2025)

Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs
by: Haque, Mirazul, et al.
Published: (2025)

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
by: Merrill, Mike A., et al.
Published: (2026)