Bibliographic Details
Main Author: Qiu, Zipeng
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2507.03018
author Qiu, Zipeng
contents Open-domain table question answering traditionally relies on a two-stage pipeline: static table retrieval followed by a closed-domain answer. In contrast, we propose an end-to-end agentic framework that embeds multi-turn tool calls (using a BM25+-based search API and a SQLite SQL executor) directly into a large language model. To further adapt a compact 4B-parameter model, we introduce a two-stage fine-tuning process: supervised cold-start on easy questions, then Async GRPO reinforcement learning on harder cases with LoRA adapters and a rollout buffer. This unified approach enables the model to jointly retrieve, reason, and execute queries, yielding a dramatic accuracy improvement from single-digit zero-shot performance to over 0.86 exact match on a held-out test set. Our results underscore the effectiveness of integrating structured tool calls with targeted RL fine-tuning for scalable, accurate table QA. The code is available at https://github.com/TabibitoQZP/OpenTableR1.
format Preprint
id arxiv_https___arxiv_org_abs_2507_03018
institution arXiv
publishDate 2025
record_format arxiv
title OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering
topic Computation and Language
url https://arxiv.org/abs/2507.03018