Library Catalog

Bibliographic Details
Main Authors:	Shah, Faiz Ali; Sabir, Ahmed; Sharma, Rajesh; Pfahl, Dietmar
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Software Engineering
Online Access:	https://arxiv.org/abs/2409.07162

Similar Items

An LSTM-based Test Selection Method for Self-Driving Cars
by: Güllü, Ali, et al.
Published: (2025)

Leveraging Encoder-only Large Language Models for Mobile App Review Feature Extraction
by: Motger, Quim, et al.
Published: (2024)

Evaluating the impact of code smell refactoring on the energy consumption of Android applications
by: Anwar, Hina, et al.
Published: (2025)

Zero-shot Bilingual App Reviews Mining with Large Language Models
by: Wei, Jialiang, et al.
Published: (2023)

Effective Black Box Testing of Sentiment Analysis Classification Networks
by: Karbasizadeh, Parsa, et al.
Published: (2024)

Model Editing for LLMs4Code: How Far are We?
by: Li, Xiaopeng, et al.
Published: (2024)

Exploring a Test Data-Driven Method for Selecting and Constraining Metamorphic Relations
by: Duque-Torres, Alejandra, et al.
Published: (2023)

Teaching Simulation as a Research Method in Empirical Software Engineering
by: de França, Breno Bernard Nicolau, et al.
Published: (2025)

Leveraging LLMs for Grammar Adaptation: A Study on Metamodel-Grammar Co-Evolution
by: Zhang, Weixing, et al.
Published: (2026)

Beyond Keywords: A Context-based Hybrid Approach to Mining Ethical Concern-related App Reviews
by: Sorathiya, Aakash, et al.
Published: (2024)

StackEval: Benchmarking LLMs in Coding Assistance
by: Shah, Nidhish, et al.
Published: (2024)

Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis
by: Huang, Junjie, et al.
Published: (2024)

A Critical Study of What Code-LLMs (Do Not) Learn
by: Anand, Abhinav, et al.
Published: (2024)

RePair: Automated Program Repair with Process-based Feedback
by: Zhao, Yuze, et al.
Published: (2024)

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
by: Zheng, Zihan, et al.
Published: (2025)

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data
by: Wang, Yejie, et al.
Published: (2024)

Towards Automatic Generation of Amplified Regression Test Oracles
by: Duque-Torres, Alejandra, et al.
Published: (2023)

How Effective are Generative Large Language Models in Performing Requirements Classification?
by: Alhoshan, Waad, et al.
Published: (2025)

A Case Study of Web App Coding with OpenAI Reasoning Models
by: Cui, Yi
Published: (2024)

Do You Understand How I Feel?: Towards Verified Empathy in Therapy Chatbots
by: Dettori, Francesco, et al.
Published: (2026)

Automating Computational Reproducibility in Social Science: Comparing Prompt-Based and Agent-Based Approaches
by: Shah, Syed Mehtab Hussain, et al.
Published: (2026)

Sentiment Analysis in Software Engineering: Evaluating Generative Pre-trained Transformers
by: Saifullah, KM Khalid, et al.
Published: (2025)

Benchmarking LLMs for Unit Test Generation from Real-World Functions
by: Huang, Dong, et al.
Published: (2025)

Multi-Programming Language Sandbox for LLMs
by: Dou, Shihan, et al.
Published: (2024)

SERA: Soft-Verified Efficient Repository Agents
by: Shen, Ethan, et al.
Published: (2026)

EM-Assist: Safe Automated ExtractMethod Refactoring with LLMs
by: Pomian, Dorin, et al.
Published: (2024)

Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval
by: Wu, Jiarong, et al.
Published: (2025)

Evaluation of Code LLMs on Geospatial Code Generation
by: Gramacki, Piotr, et al.
Published: (2024)

LLMs in Mobile Apps: Practices, Challenges, and Opportunities
by: Hau, Kimberly, et al.
Published: (2025)

Towards Extracting Ethical Concerns-related Software Requirements from App Reviews
by: Sorathiya, Aakash, et al.
Published: (2024)

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
by: Shi, Yuling, et al.
Published: (2026)

TDD-Bench Verified: Can LLMs Generate Tests for Issues Before They Get Resolved?
by: Ahmed, Toufique, et al.
Published: (2024)

Log Summarisation for Defect Evolution Analysis
by: Dolga, Rares, et al.
Published: (2024)

MathDuels: Evaluating LLMs as Problem Posers and Solvers
by: Xu, Zhiqiu, et al.
Published: (2026)

DependEval: Benchmarking LLMs for Repository Dependency Understanding
by: Du, Junjia, et al.
Published: (2025)

Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection
by: Li, Yuxi, et al.
Published: (2024)

CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models
by: Lin, Hong Yi, et al.
Published: (2025)

NLPerturbator: Studying the Robustness of Code LLMs to Natural Language Variations
by: Chen, Junkai, et al.
Published: (2024)

Showing LLM-Generated Code Selectively Based on Confidence of LLMs
by: Li, Jia, et al.
Published: (2024)

FairCoder: Evaluating Social Bias of LLMs in Code Generation
by: Du, Yongkang, et al.
Published: (2025)