Bibliographic Details
Main Authors: Xu, Zekun, Xia, Siyu, Yue, Chuhuai, Chai, Jiajun, Tian, Mingxue, Wang, Xiaohan, Lin, Wei, Li, Haoxuan, Yin, Guojun
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2510.25510
_version_ 1866918177832697856
author Xu, Zekun
Xia, Siyu
Yue, Chuhuai
Chai, Jiajun
Tian, Mingxue
Wang, Xiaohan
Lin, Wei
Li, Haoxuan
Yin, Guojun
contents As large language models (LLMs) are increasingly used for Text-to-SQL tasks, Reinforcement Learning (RL) has become a common method for improving performance. Existing methods rely primarily on static execution feedback, which restricts real-time error correction. Integrating multi-turn tool invocation with dynamic feedback could significantly improve adaptability and robustness, and ultimately model performance. To address these issues, we propose MTIR-SQL, a Multi-turn Tool-Integrated Reasoning reinforcement learning framework for Text-to-SQL. Our approach introduces an execution-aware multi-turn reasoning paradigm that incorporates database execution feedback at each reasoning step, enabling context-sensitive query generation and progressive refinement throughout the reasoning process. The framework extends the GRPO algorithm to accommodate complex multi-turn interaction scenarios. Because MTIR training is unstable and the model distribution can deviate significantly from that of the initial model, we enhance the GRPO algorithm by adding a trajectory filtering mechanism and removing the KL loss constraint. Experimental results demonstrate that MTIR-SQL, with 4B parameters, achieves 64.4% accuracy on the BIRD Dev set and 84.6% execution accuracy on the SPIDER Dev set, significantly outperforming existing approaches.
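The two mechanisms the abstract names, per-turn execution feedback and group-relative advantages with trajectory filtering, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Policy` signature, the turn limit, the stop-on-first-success rule, the `valid` flags used for filtering, and the reward values are all assumptions made for illustration; the abstract does not specify them. SQLite stands in for the target database.

```python
import sqlite3
from statistics import mean, pstdev
from typing import Callable, List, Optional, Tuple

# Hypothetical policy signature (an assumption, not from the paper): takes the
# question and the feedback history, returns the next SQL string to try.
# A real system would call an LLM here.
Policy = Callable[[str, List[str]], str]


def execute_sql(conn: sqlite3.Connection, sql: str) -> Tuple[bool, str]:
    """Run one query and return (ok, feedback text) for the next turn."""
    try:
        rows = conn.execute(sql).fetchall()
        return True, f"rows: {rows}"
    except sqlite3.Error as exc:
        return False, f"error: {exc}"


def run_episode(
    conn: sqlite3.Connection,
    question: str,
    policy: Policy,
    max_turns: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Multi-turn loop: generate SQL, execute it, feed the result back.

    Stops at the first query that executes without error (an assumed rule).
    Returns the last successful SQL (or None) and the feedback history.
    """
    history: List[str] = []
    for _ in range(max_turns):
        sql = policy(question, history)
        ok, feedback = execute_sql(conn, sql)
        history.append(feedback)
        if ok:
            return sql, history
    return None, history


def group_advantages(
    rewards: List[float], valid: List[bool], eps: float = 1e-6
) -> List[Optional[float]]:
    """Group-relative advantages computed over valid trajectories only.

    Filtered trajectories get None so they can be excluded from the loss.
    No KL penalty term appears anywhere, mirroring the abstract's description.
    """
    kept = [r for r, v in zip(rewards, valid) if v]
    if not kept:
        return [None] * len(rewards)
    mu, sigma = mean(kept), pstdev(kept)
    return [(r - mu) / (sigma + eps) if v else None
            for r, v in zip(rewards, valid)]
```

A toy policy that first emits a misspelled column name and then corrects it after seeing the error message would take two turns in `run_episode`; the error text from the first turn is what makes the correction possible.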
format Preprint
id arxiv_https___arxiv_org_abs_2510_25510
institution arXiv
publishDate 2025
record_format arxiv
title MTIR-SQL: Multi-turn Tool-Integrated Reasoning Reinforcement Learning for Text-to-SQL
topic Artificial Intelligence
url https://arxiv.org/abs/2510.25510