Staff View: Building an Internal Coding Agent at Zup: Lessons and Open Questions :: Library Catalog

Bibliographic Details
Main Authors:	Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle
Format:	Preprint
Published:	2026
Subjects:	Software Engineering
Online Access:	https://arxiv.org/abs/2604.09805

_version_	1866917400103878656
author	Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle
author_facet	Pinto, Gustavo; Naves, Pedro Eduardo de Paula; Camargo, Ana Paula; Silva, Marselle
contents	Enterprise teams building internal coding agents face a gap between prototype performance and production readiness. The root cause is that technical model quality alone is insufficient -- tool design, safety enforcement, state management, and human trust calibration are equally decisive, yet underreported in the literature. We present CodeGen, an internal coding agent at Zup, and show that targeted tool design (e.g., string-replacement edits over full-file rewrites) and layered safety guardrails improved agent reliability more than prompt engineering, while progressive human oversight modes drove organic adoption without mandating trust. These findings suggest that the engineering decisions surrounding the model -- not the model itself -- determine whether a coding agent delivers real value in practice.
format	Preprint
id	arxiv_https___arxiv_org_abs_2604_09805
institution	arXiv
publishDate	2026
record_format	arxiv
title	Building an Internal Coding Agent at Zup: Lessons and Open Questions
topic	Software Engineering
url	https://arxiv.org/abs/2604.09805

Similar Items