| Main Authors: | Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Software Engineering |
| Online Access: | https://arxiv.org/abs/2604.09805 |
| _version_ | 1866917400103878656 |
|---|---|
| author | Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle |
| author_facet | Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle |
| contents | Enterprise teams building internal coding agents face a gap between prototype performance and production readiness. The root cause is that technical model quality alone is insufficient -- tool design, safety enforcement, state management, and human trust calibration are equally decisive, yet underreported in the literature. We present CodeGen, an internal coding agent at Zup, and show that targeted tool design (e.g., string-replacement edits over full-file rewrites) and layered safety guardrails improved agent reliability more than prompt engineering, while progressive human oversight modes drove organic adoption without mandating trust. These findings suggest that the engineering decisions surrounding the model -- not the model itself -- determine whether a coding agent delivers real value in practice. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_09805 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Building an Internal Coding Agent at Zup: Lessons and Open Questions |
| topic | Software Engineering |
| url | https://arxiv.org/abs/2604.09805 |