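| Main Authors: | Kniazev, Evgenii; Kravchenko, Arseny; Rekun, Igor; Broadhead, James; Shamgunov, Nikita; Sah, Pranav; Nichite, Pratik; Yamshchikov, Ivan |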
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Artificial Intelligence; Software Engineering |
| Online Access: | https://arxiv.org/abs/2509.03310 |
| _version_ | 1866915719509180416 |
|---|---|
| author | Kniazev, Evgenii; Kravchenko, Arseny; Rekun, Igor; Broadhead, James; Shamgunov, Nikita; Sah, Pranav; Nichite, Pratik; Yamshchikov, Ivan |
| contents | We present app.build (https://github.com/neondatabase/appdotbuild-agent), an open-source framework that improves LLM-based application generation through systematic validation and structured environments. Our approach combines multi-layered validation pipelines, stack-specific orchestration, and a model-agnostic architecture, implemented across three reference stacks. Through evaluation on 30 generation tasks, we demonstrate that comprehensive validation achieves a 73.3% viability rate with 30% reaching perfect quality scores, while open-weights models achieve 80.8% of closed-model performance when provided with structured environments. The open-source framework has been adopted by the community, with over 3,000 applications generated to date. This work demonstrates that scaling reliable AI agents requires scaling environments, not just models -- providing empirical insights and complete reference implementations for production-oriented agent systems. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2509_03310 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | app.build: A Production Framework for Scaling Agentic Prompt-to-App Generation with Environment Scaffolding |
| topic | Artificial Intelligence; Software Engineering |
| url | https://arxiv.org/abs/2509.03310 |