| Main Authors: | Caglayan, Bora; Wang, Mingxue; Kelleher, John D.; Fei, Shen; Tong, Gui; Ding, Jiandong; Zhang, Puchao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2410.22925 |
Similar Items
Evaluating NL2SQL via SQL2NL
by: Safarzadeh, Mohammadtaher, et al.
Published: (2025)
ROSE: An Intent-Centered Evaluation Metric for NL2SQL
by: Pei, Wenqi, et al.
Published: (2026)
Blar-SQL: Faster, Stronger, Smaller NL2SQL
by: Domínguez, José Manuel, et al.
Published: (2024)
NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions
by: Hou, Shizheng, et al.
Published: (2026)
SPENCE: A Syntactic Probe for Detecting Contamination in NL2SQL Benchmarks
by: Safarzadeh, Mohammadtaher, et al.
Published: (2026)
Agentic NL2SQL to Reduce Computational Costs
by: Jehle, Dominik, et al.
Published: (2025)
SHREC: a SRE Behaviour Knowledge Graph Model for Shell Command Recommendations
by: Tonon, Andrea, et al.
Published: (2024)
Memo-SQL: Structured Decomposition and Experience-Driven Self-Correction for Training-Free NL2SQL
by: Yang, Zerui, et al.
Published: (2026)
GeoSQL-Eval: First Evaluation of LLMs on PostGIS-Based NL2GeoSQL Queries
by: Hou, Shuyang, et al.
Published: (2025)
VeriMinder: Mitigating Analytical Vulnerabilities in NL2SQL
by: Mohole, Shubham, et al.
Published: (2025)
OraPlan-SQL: A Planning-Centric Framework for Complex Bilingual NL2SQL Reasoning
by: Liu, Marianne Menglin, et al.
Published: (2025)
SQLong: Enhanced NL2SQL for Longer Contexts with LLMs
by: Nguyen, Dai Quoc, et al.
Published: (2025)
Distill-C: Enhanced NL2SQL via Distilled Customization with LLMs
by: Hoang, Cong Duy Vu, et al.
Published: (2025)
MTIR-SQL: Multi-turn Tool-Integrated Reasoning Reinforcement Learning for Text-to-SQL
by: Xu, Zekun, et al.
Published: (2025)
Optimizing Small Language Models for NL2SQL via Chain-of-Thought Fine-Tuning
by: Solanki, Anshul, et al.
Published: (2026)
Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models
by: Pei, Wenqi, et al.
Published: (2025)
RubikSQL: Lifelong Learning Agentic Knowledge Base as an Industrial NL2SQL System
by: Chen, Zui, et al.
Published: (2025)
Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL
by: Chung, Yeounoh, et al.
Published: (2025)
Fact-Consistency Evaluation of Text-to-SQL Generation for Business Intelligence Using Exaone 3.5
by: Choi, Jeho
Published: (2025)
Dial: A Knowledge-Grounded Dialect-Specific NL2SQL System
by: Zhang, Xiang, et al.
Published: (2026)
LLM NL2SQL Robustness: Surface Noise vs. Linguistic Variation in Traditional and Agentic Settings
by: Tu, Lifu, et al.
Published: (2026)
LR-SQL: A Supervised Fine-Tuning Method for Text2SQL Tasks under Low-Resource Scenarios
by: Wuzhenghong, Wen, et al.
Published: (2024)
RedParrot: Accelerating NL-to-DSL for Business Analytics via Query Semantic Caching
by: Wang, Tong, et al.
Published: (2026)
A framework for measuring the training efficiency of a neural architecture
by: Cueto-Mendoza, Eduardo, et al.
Published: (2024)
Pre-Hoc Predictions in AutoML: Leveraging LLMs to Enhance Model Selection and Benchmarking for Tabular datasets
by: Belkhiter, Yannis, et al.
Published: (2025)
TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring
by: Lee, Gyubok, et al.
Published: (2024)
MermaidSeqBench: An Evaluation Benchmark for NL-to-Mermaid Sequence Diagram Generation
by: Shbita, Basel, et al.
Published: (2025)
Evaluating Tokenizer Performance of Large Language Models Across Official Indian Languages
by: Tamang, S., et al.
Published: (2024)
ScenicNL: Generating Probabilistic Scenario Programs from Natural Language
by: Elmaaroufi, Karim, et al.
Published: (2024)
NL2SQL-BUGs: A Benchmark for Detecting Semantic Errors in NL2SQL Translation
by: Liu, Xinyu, et al.
Published: (2025)
Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems
by: Arif, Taslim Jamal, et al.
Published: (2026)
Hybrid-NL2SVA: Integrating RAG and Finetuning for LLM-based NL2SVA
by: Xiao, Weihua, et al.
Published: (2025)
Evaluating LLMs for Text-to-SQL Generation With Complex SQL Workload
by: Ma, Limin, et al.
Published: (2024)
Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation
by: Kang, Jungkoo
Published: (2025)
SelECT-SQL: Self-correcting ensemble Chain-of-Thought for Text-to-SQL
by: Shen, Ke, et al.
Published: (2024)
Falcon: A Comprehensive Chinese Text-to-SQL Benchmark for Enterprise-Grade Evaluation
by: Luo, Wenzhen, et al.
Published: (2025)
Safety2Drive: Safety-Critical Scenario Benchmark for the Evaluation of Autonomous Driving
by: Li, Jingzheng, et al.
Published: (2025)
LocalSearchBench: Benchmarking Agentic Search in Real-World Local Life Services
by: He, Hang, et al.
Published: (2025)
BEAVER: An Enterprise Benchmark for Text-to-SQL
by: Chen, Peter Baile, et al.
Published: (2024)
RESTestBench: A Benchmark for Evaluating the Effectiveness of LLM-Generated REST API Test Cases from NL Requirements
by: Kogler, Leon, et al.
Published: (2026)