Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.17627 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866917221190598656 |
|---|---|
| author | Pham, Dung Ghaleb, Taher A. |
| author_facet | Pham, Dung Ghaleb, Taher A. |
| contents | AI coding agents can autonomously generate pull requests (PRs), yet little is known about how their contributions compare to those of humans. We analyze 33,596 agent-generated PRs (APRs) and 6,618 human PRs (HPRs) to compare code-change characteristics and message quality. We observe that APR-introduced symbols (functions and classes) are removed much sooner than those in HPRs (median time to removal 3 vs. 34 days) and are also removed more often (symbol churn 7.33% vs. 4.10%), reflecting a focus on other tasks like documentation and test updates. Agents generate stronger commit-level messages (semantic similarity 0.72 vs. 0.68) but lag humans at PR-level summarization (PR-commit similarity 0.86 vs. 0.88). Commit message length is the best predictor of description quality, indicating reliance on individual commits over full-PR reasoning. These findings highlight a gap between agents' micro-level precision and macro-level communication, suggesting opportunities to improve agent-driven development workflows. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2601_17627 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| spellingShingle | Code Change Characteristics and Description Alignment: A Comparative Study of Agentic versus Human Pull Requests Pham, Dung Ghaleb, Taher A. Software Engineering AI coding agents can autonomously generate pull requests (PRs), yet little is known about how their contributions compare to those of humans. We analyze 33,596 agent-generated PRs (APRs) and 6,618 human PRs (HPRs) to compare code-change characteristics and message quality. We observe that APR-introduced symbols (functions and classes) are removed much sooner than those in HPRs (median time to removal 3 vs. 34 days) and are also removed more often (symbol churn 7.33% vs. 4.10%), reflecting a focus on other tasks like documentation and test updates. Agents generate stronger commit-level messages (semantic similarity 0.72 vs. 0.68) but lag humans at PR-level summarization (PR-commit similarity 0.86 vs. 0.88). Commit message length is the best predictor of description quality, indicating reliance on individual commits over full-PR reasoning. These findings highlight a gap between agents' micro-level precision and macro-level communication, suggesting opportunities to improve agent-driven development workflows. |
| title | Code Change Characteristics and Description Alignment: A Comparative Study of Agentic versus Human Pull Requests |
| topic | Software Engineering |
| url | https://arxiv.org/abs/2601.17627 |