Bibliographic Details
Main Authors: Jivrajani, Madhav; Alagappan, Ramnatthan; Ganesan, Aishwarya
Format: Preprint
Published: 2026
Subjects: Databases; Artificial Intelligence
Online Access: https://arxiv.org/abs/2605.30862
author Jivrajani, Madhav
Alagappan, Ramnatthan
Ganesan, Aishwarya
author_facet Jivrajani, Madhav
Alagappan, Ramnatthan
Ganesan, Aishwarya
contents Text2SQL agents powered by LLMs translate natural language intent into SQL by exploring the data system through tool calls before formulating the query. However, to ensure secure and scoped access, data systems construct environments with explicit API surfaces. We study the APIs exposed today, categorize them as either coarse-grained or fine-grained, and posit that choosing between them presents a fundamental tradeoff between cost-efficient exploration and accurate SQL generation. Most data systems expose fine-grained APIs, but this inadvertently disadvantages agents: they over-explore, incorporate irrelevant schema elements into their query formulation, and produce inaccurate results. We argue that curbing over-exploration is key to the effective use of these API surfaces, and propose Sophrosyne, a data system environment that augments API responses with directives that guide the agent's exploration process. Initial results show that directives reduce over-exploration by 4.6x and boost accuracy by up to 12.4% (approx. 4 percentage points).
format Preprint
id arxiv_https___arxiv_org_abs_2605_30862
institution arXiv
publishDate 2026
record_format arxiv
title Sophrosyne: Agentic Exploration of Relational Data Systems Needs Moderation
topic Databases
Artificial Intelligence
url https://arxiv.org/abs/2605.30862