| Main author: | Ashraf, Taniv |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2507.09583 |
Similar items
SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation
by: Chen, Mu-Chi, et al.
Published: (2026)
Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation
by: Karpurapu, Shanthi, et al.
Published: (2024)
Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
by: Le, Nguyen-Khang, et al.
Published: (2025)
Leveraging Large Language Models for Use Case Model Generation from Software Requirements
by: Eisenreich, Tobias, et al.
Published: (2025)
A Systematic Approach for Assessing Large Language Models' Test Case Generation Capability
by: Chang, Hung-Fu, et al.
Published: (2025)
Benchmarking Energy Efficiency of Large Language Models Using vLLM
by: Pronk, K., et al.
Published: (2025)
The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot
by: Yeverechyahu, Doron, et al.
Published: (2024)
Inference-Time Intervention in Large Language Models for Reliable Requirement Verification
by: Darm, Paul, et al.
Published: (2025)
Learning Software Bug Reports: A Systematic Literature Review
by: Long, Guoming, et al.
Published: (2025)
Distilling Desired Comments for Enhanced Code Review with Large Language Models
by: Yu, Yongda, et al.
Published: (2024)
Vibe Code Bench: Evaluating AI Models on End-to-End Web Application Development
by: Tran, Hung, et al.
Published: (2026)
A Framework for Testing and Adapting REST APIs as LLM Tools
by: Bandlamudi, Jayachandu, et al.
Published: (2025)
Finetuning LLMs for Automatic Form Interaction on Web-Browser in Selenium Testing Framework
by: Le, Nguyen-Khang, et al.
Published: (2025)
Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents
by: Cartagena, Arnold, et al.
Published: (2026)
Mechanistic Understanding of Language Models in Syntactic Code Completion
by: Miller, Samuel, et al.
Published: (2025)
CIDR: A Large-Scale Industrial Source Code Dataset for Software Engineering Research
by: Savenkov, Vladislav
Published: (2026)
Automated Bug Triaging using Instruction-Tuned Large Language Models
by: Kiashemshaki, Kiana, et al.
Published: (2025)
Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones
by: Rai, Daking, et al.
Published: (2025)
TCProF: Time-Complexity Prediction SSL Framework
by: Hahn, Joonghyuk, et al.
Published: (2025)
MEC$^3$O: Multi-Expert Consensus for Code Time Complexity Prediction
by: Hahn, Joonghyuk, et al.
Published: (2025)
Generative AI Toolkit -- a framework for increasing the quality of LLM-based applications over their whole life cycle
by: Kohl, Jens, et al.
Published: (2024)
ContractBench: Can LLM Agents Preserve Observation Contracts?
by: Wang, Jicheng, et al.
Published: (2026)
FREYR: A Framework for Recognizing and Executing Your Requests
by: Gallotta, Roberto, et al.
Published: (2025)
Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards
by: Groeneveld, Jan Niklas, et al.
Published: (2025)
Source Code Summarization in the Era of Large Language Models
by: Sun, Weisong, et al.
Published: (2024)
AcTracer: Active Testing of Large Language Model via Multi-Stage Sampling
by: Huang, Yuheng, et al.
Published: (2024)
An Analysis of LLM Fine-Tuning and Few-Shot Learning for Flaky Test Detection and Classification
by: More, Riddhi, et al.
Published: (2025)
Beyond Greenfield: The D3 Framework for AI-Driven Productivity in Brownfield Engineering
by: Sharma, Krishna Kumaar
Published: (2025)
ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation
by: Lim, Soohan, et al.
Published: (2025)
Tool-Schema Compression Enables Agentic RAG Under Constrained Context Budgets
by: Sakizli, Furkan
Published: (2026)
VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation
by: Miculicich, Lesly, et al.
Published: (2025)
Narrow Transformer: StarCoder-Based Java-LM For Desktop
by: Rathinasamy, Kamalkumar, et al.
Published: (2024)
A Comparative Study of DSL Code Generation: Fine-Tuning vs. Optimized Retrieval Augmentation
by: Bassamzadeh, Nastaran, et al.
Published: (2024)
Prompt Engineering Strategies for LLM-based Qualitative Coding of Psychological Safety in Software Engineering Communities: A Controlled Empirical Study
by: Alshaikh, Moaath, et al.
Published: (2026)
Developer Challenges on Large Language Models: A Study of Stack Overflow and OpenAI Developer Forum Posts
by: Alam, Khairul, et al.
Published: (2024)
Assessing Data Augmentation-Induced Bias in Training and Testing of Machine Learning Models
by: More, Riddhi, et al.
Published: (2025)
RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code
by: Chen, Jiachi, et al.
Published: (2024)
Collaborative LLM Agents for C4 Software Architecture Design Automation
by: Szczepanik, Kamil, et al.
Published: (2025)
Toward Architecture-Aware Evaluation Metrics for LLM Agents
by: Souza, Débora, et al.
Published: (2026)
When Retrieval Hurts Code Completion: A Diagnostic Study of Stale Repository Context
by: Weng, Haojun, et al.
Published: (2026)