| Main Authors: | Fan, Guodong, Gao, Cuiyun, Chong, Chun Yong, Zhang, Lu, Li, Jing, Zhang, Jinglin, Chen, Shizhan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.26686 |
Similar Items
A Systematic Literature Review of Code Hallucinations in LLMs: Characterization, Mitigation Methods, Challenges, and Future Directions for Reliable AI
by: Gao, Cuiyun, et al.
Published: (2025)
Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering
by: Wang, Ruiqi, et al.
Published: (2025)
A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How
by: Wang, Chaozheng, et al.
Published: (2024)
ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code
by: Feng, Jia, et al.
Published: (2024)
APIGen: Generative API Method Recommendation
by: Chen, Yujia, et al.
Published: (2024)
AXIOM: Benchmarking LLM-as-a-Judge for Code via Rule-Based Perturbation and Multisource Quality Calibration
by: Wang, Ruiqi, et al.
Published: (2025)
When Large Language Models Meet UAV Projects: An Empirical Study from Developers' Perspective
by: Chen, Yihua, et al.
Published: (2025)
When Shared Worlds Break: Demystifying Defects in Multi-User Extended Reality Software Systems
by: Li, Shuqing, et al.
Published: (2025)
Cascaded Code Editing: Large-Small Model Collaboration for Effective and Efficient Code Editing
by: Wang, Chaozheng, et al.
Published: (2026)
Bridge and Hint: Extending Pre-trained Language Models for Long-Range Code
by: Chen, Yujia, et al.
Published: (2024)
Smaller but Better: Self-Paced Knowledge Distillation for Lightweight yet Effective LCMs
by: Chen, Yujia, et al.
Published: (2024)
The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation
by: Gao, Shuzheng, et al.
Published: (2025)
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
by: Liu, Zeyu Leo, et al.
Published: (2024)
LLMs Meet Library Evolution: Evaluating Deprecated API Usage in LLM-based Code Completion
by: Wang, Chong, et al.
Published: (2024)
LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search
by: Chen, Yujia, et al.
Published: (2026)
When Automated Program Repair Meets Regression Testing -- An Extensive Study on 2 Million Patches
by: Lou, Yiling, et al.
Published: (2021)
The Current Challenges of Software Engineering in the Era of Large Language Models
by: Gao, Cuiyun, et al.
Published: (2024)
When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?
by: Chen, Chong, et al.
Published: (2023)
When Elo Lies: Hidden Biases in Codeforces-Based Evaluation of Large Language Models
by: Zheng, Shenyu, et al.
Published: (2026)
WITNESS: A lightweight and practical approach to fine-grained predictive mutation testing
by: Lu, Zeyu, et al.
Published: (2025)
An Empirical Study of Knowledge Distillation for Code Understanding Tasks
by: Wang, Ruiqi, et al.
Published: (2025)
SR-Eval: Evaluating LLMs on Code Generation under Stepwise Requirement Refinement
by: Zhan, Zexun, et al.
Published: (2025)
MulVul: Retrieval-augmented Multi-Agent Code Vulnerability Detection via Cross-Model Prompt Evolution
by: Wu, Zihan, et al.
Published: (2026)
SEER: Enhancing Chain-of-Thought Code Generation through Self-Exploring Deep Reasoning
by: Gao, Shuzheng, et al.
Published: (2025)
LLM-Driven Kernel Evolution: Automating Driver Updates in Linux
by: Kharlamova, Arina, et al.
Published: (2025)
What Makes Good In-context Demonstrations for Code Intelligence Tasks with LLMs?
by: Gao, Shuzheng, et al.
Published: (2023)
Towards Mitigating API Hallucination in Code Generated by LLMs with Hierarchical Dependency Aware
by: Chen, Yujia, et al.
Published: (2025)
Quantum Computing as a Service -- a Software Engineering Perspective
by: Ahmad, Aakash, et al.
Published: (2025)
Lightweight Model Editing for LLMs to Correct Deprecated API Recommendations
by: Lin, Guancheng, et al.
Published: (2025)
LibRec: Benchmarking Retrieval-Augmented LLMs for Library Migration Recommendations
by: Han, Junxiao, et al.
Published: (2025)
When Fuzzing Meets LLMs: Challenges and Opportunities
by: Jiang, Yu, et al.
Published: (2024)
SPENCER: Self-Adaptive Model Distillation for Efficient Code Retrieval
by: Gu, Wenchao, et al.
Published: (2025)
Deep Learning Based Code Generation Methods: Literature Review
by: Yang, Zezhou, et al.
Published: (2023)
Learning in the Wild: Towards Leveraging Unlabeled Data for Effectively Tuning Pre-trained Code Models
by: Gao, Shuzheng, et al.
Published: (2024)
Automatically Recommend Code Updates: Are We There Yet?
by: Liu, Yue, et al.
Published: (2022)
When AI Models Become Dependencies: Studying the Evolution of Pre-Trained Model Reuse in Downstream Software Systems
by: Banyongrakkul, Peerachai, et al.
Published: (2026)
Mining Service Behavior for Stateful Service Emulation
by: Hossain, Md Arafat, et al.
Published: (2025)
Weakly Supervised Vulnerability Localization via Multiple Instance Learning
by: Gu, Wenchao, et al.
Published: (2025)
When Retriever Meets Generator: A Joint Model for Code Comment Generation
by: Le, Tien P. T., et al.
Published: (2025)
SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations
by: Tonon, Andrea, et al.
Published: (2024)