| Main Author: | Oliveira, William |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.24636 |
Similar Items
LessLeak-Bench: A First Investigation of Data Leakage in LLMs Across 83 Software Engineering Benchmarks
by: Zhou, Xin, et al.
Published: (2025)
State-of-the-art Small Language Coder Model: Mify-Coder
by: Parmar, Abhinav, et al.
Published: (2025)
Does Model Size Matter? A Comparison of Small and Large Language Models for Requirements Classification
by: Zadenoori, Mohammad Amin, et al.
Published: (2025)
AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model
by: Ma, Lipeng, et al.
Published: (2025)
SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents
by: Kon, Patrick Tser Jern, et al.
Published: (2026)
Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code
by: Zhang, Ziyin, et al.
Published: (2023)
EHR-Based Mobile and Web Platform for Chronic Disease Risk Prediction Using Large Language Multimodal Models
by: Liao, Chun-Chieh, et al.
Published: (2024)
Testing the Effect of Code Documentation on Large Language Model Code Understanding
by: Macke, William, et al.
Published: (2024)
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
by: Jin, Haolin, et al.
Published: (2024)
SWE-MERA: A Dynamic Benchmark for Agenticly Evaluating Large Language Models on Software Engineering Tasks
by: Adamenko, Pavel, et al.
Published: (2025)
Prompt Less, Smile More: MTP with Semantic Engineering in Lieu of Prompt Engineering
by: Dantanarayana, Jayanaka L., et al.
Published: (2025)
FeatBench: Towards More Realistic Evaluation of Feature-level Code Generation
by: Chen, Haorui, et al.
Published: (2025)
AutoIOT: LLM-Driven Automated Natural Language Programming for AIoT Applications
by: Shen, Leming, et al.
Published: (2025)
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
by: Zhuo, Terry Yue, et al.
Published: (2025)
Introduction to Analytical Software Engineering Design Paradigm
by: Houichime, Tarik, et al.
Published: (2025)
MCP-Solver: Integrating Language Models with Constraint Programming Systems
by: Szeider, Stefan
Published: (2024)
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
by: Zhang, Kexun, et al.
Published: (2024)
Advancing Language Models for Code-related Tasks
by: Tian, Zhao
Published: (2026)
Revisiting the Reliability of Language Models in Instruction-Following
by: Dong, Jianshuo, et al.
Published: (2025)
Text2BIM: Generating Building Models Using a Large Language Model-based Multi-Agent Framework
by: Du, Changyu, et al.
Published: (2024)
Agents in Software Engineering: Survey, Landscape, and Vision
by: Wang, Yanlin, et al.
Published: (2024)
ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages
by: Kammakomati, Mehant, et al.
Published: (2024)
Mitigating Gender Bias in Code Large Language Models via Model Editing
by: Qin, Zhanyue, et al.
Published: (2024)
A Survey on Large Language Models for Code Generation
by: Jiang, Juyong, et al.
Published: (2024)
Analyzing the Performance of Large Language Models on Code Summarization
by: Haldar, Rajarshi, et al.
Published: (2024)
Effective Harness Engineering for Algorithm Discovery with Coding Agents
by: Ishibashi, Yoichi, et al.
Published: (2026)
Dialogue Systems Engineering: A Survey and Future Directions
by: Nakano, Mikio, et al.
Published: (2025)
SWE-smith: Scaling Data for Software Engineering Agents
by: Yang, John, et al.
Published: (2025)
Crystal: Illuminating LLM Abilities on Language and Code
by: Tao, Tianhua, et al.
Published: (2024)
ICE-Score: Instructing Large Language Models to Evaluate Code
by: Zhuo, Terry Yue
Published: (2023)
CodeMirage: Hallucinations in Code Generated by Large Language Models
by: Agarwal, Vibhor, et al.
Published: (2024)
A Code Comprehension Benchmark for Large Language Models for Code
by: Havare, Jayant, et al.
Published: (2025)
Exploring Language Model's Code Generation Ability with Auxiliary Functions
by: Lee, Seonghyeon, et al.
Published: (2024)
DebugBench: Evaluating Debugging Capability of Large Language Models
by: Tian, Runchu, et al.
Published: (2024)
OmniCode: A Benchmark for Evaluating Software Engineering Agents
by: Sonwane, Atharv, et al.
Published: (2026)
Jailbreaking ChatGPT via Prompt Engineering: An Empirical Study
by: Liu, Yi, et al.
Published: (2023)
Value-Conflict Diagnostics Reveal Widespread Alignment Faking in Language Models
by: Nair, Inderjeet, et al.
Published: (2026)
Exploring Data-Efficient Adaptation of Large Language Models for Code Generation
by: Jiang, Xue, et al.
Published: (2024)
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models
by: Zhuo, Terry Yue, et al.
Published: (2024)
How Effective are Generative Large Language Models in Performing Requirements Classification?
by: Alhoshan, Waad, et al.
Published: (2025)