| Main Authors: | Sanmartino, Gabriele; Urban, Matthias; Papotti, Paolo; Binnig, Carsten |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Databases |
| Online Access: | https://arxiv.org/abs/2602.04430 |
| _version_ | 1866915779586293760 |
|---|---|
| author | Sanmartino, Gabriele; Urban, Matthias; Papotti, Paolo; Binnig, Carsten |
| author_facet | Sanmartino, Gabriele; Urban, Matthias; Papotti, Paolo; Binnig, Carsten |
| contents | LLM-augmented data systems enable semantic querying over structured and unstructured data, but executing queries with LLM-powered operators introduces a fundamental runtime-accuracy trade-off. In this paper, we present Stretto, a new execution engine that provides end-to-end query guarantees while efficiently navigating this trade-off in a holistic manner. For this, Stretto formulates query planning as a constrained optimization problem and uses a gradient-based optimizer to jointly select operator implementations and allocate error budgets across pipelines. Moreover, to enable fine-grained execution choices, Stretto introduces a novel idea on how KV-caching can be used to realize a spectrum of different physical operators that transform a sparse design space into a dense continuum of runtime-accuracy trade-offs. Experiments show that Stretto outperforms state-of-the-art systems while consistently meeting quality guarantees. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_04430 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | The Stretto Execution Engine for LLM-Augmented Data Systems |
| topic | Databases |
| url | https://arxiv.org/abs/2602.04430 |