| Main Authors: | Shah, Faiz Ali, Sabir, Ahmed, Sharma, Rajesh, Pfahl, Dietmar |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.07162 |
Similar Items
An LSTM-based Test Selection Method for Self-Driving Cars
by: Güllü, Ali, et al.
Published: (2025)
Leveraging Encoder-only Large Language Models for Mobile App Review Feature Extraction
by: Motger, Quim, et al.
Published: (2024)
Evaluating the impact of code smell refactoring on the energy consumption of Android applications
by: Anwar, Hina, et al.
Published: (2025)
Zero-shot Bilingual App Reviews Mining with Large Language Models
by: Wei, Jialiang, et al.
Published: (2023)
Effective Black Box Testing of Sentiment Analysis Classification Networks
by: Karbasizadeh, Parsa, et al.
Published: (2024)
Model Editing for LLMs4Code: How Far are We?
by: Li, Xiaopeng, et al.
Published: (2024)
Exploring a Test Data-Driven Method for Selecting and Constraining Metamorphic Relations
by: Duque-Torres, Alejandra, et al.
Published: (2023)
Teaching Simulation as a Research Method in Empirical Software Engineering
by: de França, Breno Bernard Nicolau, et al.
Published: (2025)
Leveraging LLMs for Grammar Adaptation: A Study on Metamodel-Grammar Co-Evolution
by: Zhang, Weixing, et al.
Published: (2026)
Beyond Keywords: A Context-based Hybrid Approach to Mining Ethical Concern-related App Reviews
by: Sorathiya, Aakash, et al.
Published: (2024)
StackEval: Benchmarking LLMs in Coding Assistance
by: Shah, Nidhish, et al.
Published: (2024)
Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis
by: Huang, Junjie, et al.
Published: (2024)
A Critical Study of What Code-LLMs (Do Not) Learn
by: Anand, Abhinav, et al.
Published: (2024)
RePair: Automated Program Repair with Process-based Feedback
by: Zhao, Yuze, et al.
Published: (2024)
LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?
by: Zheng, Zihan, et al.
Published: (2025)
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data
by: Wang, Yejie, et al.
Published: (2024)
Towards Automatic Generation of Amplified Regression Test Oracles
by: Duque-Torres, Alejandra, et al.
Published: (2023)
How Effective are Generative Large Language Models in Performing Requirements Classification?
by: Alhoshan, Waad, et al.
Published: (2025)
A Case Study of Web App Coding with OpenAI Reasoning Models
by: Cui, Yi
Published: (2024)
Do You Understand How I Feel?: Towards Verified Empathy in Therapy Chatbots
by: Dettori, Francesco, et al.
Published: (2026)
Automating Computational Reproducibility in Social Science: Comparing Prompt-Based and Agent-Based Approaches
by: Shah, Syed Mehtab Hussain, et al.
Published: (2026)
Sentiment Analysis in Software Engineering: Evaluating Generative Pre-trained Transformers
by: Saifullah, KM Khalid, et al.
Published: (2025)
Benchmarking LLMs for Unit Test Generation from Real-World Functions
by: Huang, Dong, et al.
Published: (2025)
Multi-Programming Language Sandbox for LLMs
by: Dou, Shihan, et al.
Published: (2024)
SERA: Soft-Verified Efficient Repository Agents
by: Shen, Ethan, et al.
Published: (2026)
EM-Assist: Safe Automated ExtractMethod Refactoring with LLMs
by: Pomian, Dorin, et al.
Published: (2024)
Isolating Language-Coding from Problem-Solving: Benchmarking LLMs with PseudoEval
by: Wu, Jiarong, et al.
Published: (2025)
Evaluation of Code LLMs on Geospatial Code Generation
by: Gramacki, Piotr, et al.
Published: (2024)
LLMs in Mobile Apps: Practices, Challenges, and Opportunities
by: Hau, Kimberly, et al.
Published: (2025)
Towards Extracting Ethical Concerns-related Software Requirements from App Reviews
by: Sorathiya, Aakash, et al.
Published: (2024)
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
by: Shi, Yuling, et al.
Published: (2026)
TDD-Bench Verified: Can LLMs Generate Tests for Issues Before They Get Resolved?
by: Ahmed, Toufique, et al.
Published: (2024)
Log Summarisation for Defect Evolution Analysis
by: Dolga, Rares, et al.
Published: (2024)
MathDuels: Evaluating LLMs as Problem Posers and Solvers
by: Xu, Zhiqiu, et al.
Published: (2026)
DependEval: Benchmarking LLMs for Repository Dependency Understanding
by: Du, Junjia, et al.
Published: (2025)
Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection
by: Li, Yuxi, et al.
Published: (2024)
CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models
by: Lin, Hong Yi, et al.
Published: (2025)
NLPerturbator: Studying the Robustness of Code LLMs to Natural Language Variations
by: Chen, Junkai, et al.
Published: (2024)
Showing LLM-Generated Code Selectively Based on Confidence of LLMs
by: Li, Jia, et al.
Published: (2024)
FairCoder: Evaluating Social Bias of LLMs in Code Generation
by: Du, Yongkang, et al.
Published: (2025)