Library Catalog

Bibliographic Details
Main Author:	Ashraf, Taniv
Format:	Preprint
Published:	2025
Subjects:	Software Engineering; Artificial Intelligence; I.2.7; J.1
Online Access:	https://arxiv.org/abs/2507.09583

Similar Items

SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation
by: Chen, Mu-Chi, et al.
Published: (2026)

Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation
by: Karpurapu, Shanthi, et al.
Published: (2024)

Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
by: Le, Nguyen-Khang, et al.
Published: (2025)

Leveraging Large Language Models for Use Case Model Generation from Software Requirements
by: Eisenreich, Tobias, et al.
Published: (2025)

A Systematic Approach for Assessing Large Language Models' Test Case Generation Capability
by: Chang, Hung-Fu, et al.
Published: (2025)

Benchmarking Energy Efficiency of Large Language Models Using vLLM
by: Pronk, K., et al.
Published: (2025)

The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot
by: Yeverechyahu, Doron, et al.
Published: (2024)

Inference-Time Intervention in Large Language Models for Reliable Requirement Verification
by: Darm, Paul, et al.
Published: (2025)

Learning Software Bug Reports: A Systematic Literature Review
by: Long, Guoming, et al.
Published: (2025)

Distilling Desired Comments for Enhanced Code Review with Large Language Models
by: Yu, Yongda, et al.
Published: (2024)

Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development
by: Tran, Hung, et al.
Published: (2026)

A Framework for Testing and Adapting REST APIs as LLM Tools
by: Bandlamudi, Jayachandu, et al.
Published: (2025)

Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework
by: Le, Nguyen-Khang, et al.
Published: (2025)

Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents
by: Cartagena, Arnold, et al.
Published: (2026)

Mechanistic Understanding of Language Models in Syntactic Code Completion
by: Miller, Samuel, et al.
Published: (2025)

CIDR: A Large-Scale Industrial Source Code Dataset for Software Engineering Research
by: Savenkov, Vladislav
Published: (2026)

Automated Bug Triaging using Instruction-Tuned Large Language Models
by: Kiashemshaki, Kiana, et al.
Published: (2025)

Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
by: Rai, Daking, et al.
Published: (2025)

TCProF: Time-Complexity Prediction SSL Framework
by: Hahn, Joonghyuk, et al.
Published: (2025)

MEC$^3$O: Multi-Expert Consensus for Code Time Complexity Prediction
by: Hahn, Joonghyuk, et al.
Published: (2025)

Generative AI Toolkit -- a framework for increasing the quality of LLM-based applications over their whole life cycle
by: Kohl, Jens, et al.
Published: (2024)

ContractBench: Can LLM Agents Preserve Observation Contracts?
by: Wang, Jicheng, et al.
Published: (2026)

FREYR: A Framework for Recognizing and Executing Your Requests
by: Gallotta, Roberto, et al.
Published: (2025)

Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards
by: Groeneveld, Jan Niklas, et al.
Published: (2025)

Source Code Summarization in the Era of Large Language Models
by: Sun, Weisong, et al.
Published: (2024)

AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling
by: Huang, Yuheng, et al.
Published: (2024)

An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification
by: More, Riddhi, et al.
Published: (2025)

Beyond Greenfield: The D3 Framework for AI-Driven Productivity in Brownfield Engineering
by: Sharma, Krishna Kumaar
Published: (2025)

ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation
by: Lim, Soohan, et al.
Published: (2025)

Tool-Schema Compression Enables Agentic RAG Under Constrained Context Budgets
by: Sakizli, Furkan
Published: (2026)

VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation
by: Miculicich, Lesly, et al.
Published: (2025)

Narrow Transformer: StarCoder-Based Java-LM For Desktop
by: Rathinasamy, Kamalkumar, et al.
Published: (2024)

A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation
by: Bassamzadeh, Nastaran, et al.
Published: (2024)

Prompt Engineering Strategies for LLM-based Qualitative Coding of Psychological Safety in Software Engineering Communities: A Controlled Empirical Study
by: Alshaikh, Moaath, et al.
Published: (2026)

Developer Challenges on Large Language Models: A Study of Stack Overflow and OpenAI Developer Forum Posts
by: Alam, Khairul, et al.
Published: (2024)

Assessing Data Augmentation-Induced Bias in Training and Testing of Machine Learning Models
by: More, Riddhi, et al.
Published: (2025)

RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code
by: Chen, Jiachi, et al.
Published: (2024)

Collaborative LLM Agents for C4 Software Architecture Design Automation
by: Szczepanik, Kamil, et al.
Published: (2025)

Toward Architecture-Aware Evaluation Metrics for LLM Agents
by: Souza, Débora, et al.
Published: (2026)

When Retrieval Hurts Code Completion: A Diagnostic Study of Stale Repository Context
by: Weng, Haojun, et al.
Published: (2026)