Inhaltsangabe: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Sarwar, Tabinda, Moghimifar, Farhad, Hoang, Cong Duy Vu, Ma, Xiaoxiao, Xu, Shawn Chang, Saleh, Fahimeh, Zaremoodi, Poorya, Sil, Avirup, Kirchhoff, Katrin
Format:	Preprint
Veröffentlicht:	2026
Schlagworte:	Computation and Language
Online-Zugang:	https://arxiv.org/abs/2604.22313
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Inhaltsangabe:

NL2SQL systems deployed in industry settings often encounter ambiguous or unanswerable queries, particularly in interactive scenarios with incomplete user clarification. Existing benchmarks typically assume a single source of ambiguity and rely on user interaction for resolution, overlooking realistic failure modes. We introduce Clarity, a framework for automatically generating an NL2SQL benchmark with multi-faceted ambiguities and diverse user behaviors across both single- and multi-turn settings. Using a constraint-driven pipeline, Clarity transforms executable SQL into ambiguous queries, augmented with grounded conversational continuations and schema-level metadata. Empirical evaluation on Spider and BIRD shows that leading NL2SQL systems, including those based on strong LLMs, suffer significant performance degradation under multi-faceted ambiguity. While these systems often detect ambiguity, they struggle to accurately localize and resolve the underlying schema-level sources. Our results highlight the need for more robust ambiguity detection and resolution in industry-grade NL2SQL systems.

Ähnliche Einträge