| Main Authors: | Tian, Zhao; Shu, Honglin; Wang, Dong; Cao, Xuejie; Kamei, Yasutaka; Chen, Junjie |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.01760 |
Similar Items
Leveraging Language Models for Log Statement Generation in Multilingual Scenarios: How Far Are We?
by: Kusama, Kazuki, et al.
Published: (2026)
A Preliminary Study of Large Language Models for Multilingual Vulnerability Detection
by: Yu, Junji, et al.
Published: (2025)
Evaluating Large Language Models for Multilingual Vulnerability Detection at Dual Granularities
by: Shu, Honglin, et al.
Published: (2025)
On the Evaluation of Large Language Models in Multilingual Vulnerability Repair
by: Wang, Dong, et al.
Published: (2025)
How Small is Enough? Empirical Evidence of Quantized Small Language Models for Automated Program Repair
by: Kusama, Kazuki, et al.
Published: (2025)
How Far Have LLMs Come Toward Automated SATD Taxonomy Construction?
by: Nakashima, Sota, et al.
Published: (2025)
"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution
by: Tian, Zhao, et al.
Published: (2026)
Vulnerability Detection with Code Language Models: How Far Are We?
by: Ding, Yangruibo, et al.
Published: (2024)
Duplicate Bug Report Detection: How Far Are We?
by: Zhang, Ting, et al.
Published: (2022)
A Large-Scale Evaluation for Log Parsing Techniques: How Far Are We?
by: Jiang, Zhihan, et al.
Published: (2023)
Specification-Driven Code Translation Powered by Large Language Models: How Far Are We?
by: Saha, Soumit Kanti, et al.
Published: (2024)
Unraveling the Potential of Large Language Models in Code Translation: How Far Are We?
by: Tao, Qingxiao, et al.
Published: (2024)
Vulnerability-Affected Versions Identification: How Far Are We?
by: Chen, Xingchu, et al.
Published: (2025)
Aligning Requirement for Large Language Model's Code Generation
by: Tian, Zhao, et al.
Published: (2025)
When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?
by: Chen, Chong, et al.
Published: (2023)
Cross-Project Flakiness: A Case Study of the OpenStack Ecosystem
by: Xiao, Tao, et al.
Published: (2026)
Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot
by: Koyanagi, Kei, et al.
Published: (2024)
How Far Have We Gone in Binary Code Understanding Using Large Language Models
by: Shang, Xiuwei, et al.
Published: (2024)
Deep Learning Framework Testing via Model Mutation: How Far Are We?
by: Mu, Yanzhou, et al.
Published: (2025)
Fixing Large Language Models' Specification Misunderstanding for Better Code Generation
by: Tian, Zhao, et al.
Published: (2023)
An Empirical Evaluation of Manually Created Equivalent Mutants
by: Straubinger, Philipp, et al.
Published: (2024)
Large-Scale Empirical Analysis of Continuous Fuzzing: Insights from 1 Million Fuzzing Sessions
by: Shirai, Tatsuya, et al.
Published: (2025)
Model Editing for LLMs4Code: How Far are We?
by: Li, Xiaopeng, et al.
Published: (2024)
An Empirical Study on Automatically Detecting AI-Generated Source Code: How Far Are We?
by: Suh, Hyunjae, et al.
Published: (2024)
Why Agentic-PRs Get Rejected: A Comparative Study of Coding Agents
by: Nakashima, Sota, et al.
Published: (2026)
Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We?
by: Li, Kaixuan, et al.
Published: (2024)
Automatically Detecting Checked-In Secrets in Android Apps: How Far Are We?
by: Li, Kevin, et al.
Published: (2024)
Can AI Agents Generate Microservices? How Far are We?
by: Adnan, Bassam, et al.
Published: (2026)
Representation Learning for Stack Overflow Posts: How Far are We?
by: He, Junda, et al.
Published: (2023)
Automated Testing of Task-based Chatbots: How Far Are We?
by: Clerissi, Diego, et al.
Published: (2026)
Retrieval-Augmented Test Generation: How Far Are We?
by: Shin, Jiho, et al.
Published: (2024)
An Empirical Study of Token-based Micro Commits
by: Kondo, Masanari, et al.
Published: (2024)
An empirical study on declined proposals: why are these proposals declined?
by: Kondo, Masanari, et al.
Published: (2025)
Measuring Emergent Capabilities of LLMs for Software Engineering: How Far Are We?
by: O'Brien, Conor, et al.
Published: (2024)
OSS Myths and Facts
by: Iimura, Yukako, et al.
Published: (2024)
Myth: The loss of core developers is a critical issue for OSS communities
by: Nourry, Olivier, et al.
Published: (2024)
Exploring the Potential of Large Language Models in Simulink-Stateflow Mutant Generation
by: Valle, Pablo, et al.
Published: (2026)
How Far Can We Go with Practical Function-Level Program Repair?
by: Xiang, Jiahong, et al.
Published: (2024)
Reasoning Runtime Behavior of a Program with LLM: How Far Are We?
by: Chen, Junkai, et al.
Published: (2024)
Using LLMs for Security Advisory Investigations: How Far Are We?
by: Abdullah, Bayu Fedra, et al.
Published: (2025)