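Staff View: OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering :: Library Catalog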

Bibliographic Details
Main Author:	Qiu, Zipeng
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2507.03018

_version_	1866909675581079552
author	Qiu, Zipeng
author_facet	Qiu, Zipeng
contents	Open-domain table question answering traditionally relies on a two-stage pipeline: static table retrieval followed by a closed-domain answer. In contrast, we propose an end-to-end agentic framework that embeds multi-turn tool calls (using a BM25+-based search API and a SQLite SQL executor) directly into a large language model. To further adapt a compact 4B-parameter model, we introduce a two-stage fine-tuning process: supervised cold-start on easy questions, then Async GRPO reinforcement learning on harder cases with LoRA adapters and a rollout buffer. This unified approach enables the model to jointly retrieve, reason, and execute queries, yielding a dramatic accuracy improvement from single-digit zero-shot performance to over 0.86 exact match on a held-out test set. Our results underscore the effectiveness of integrating structured tool calls with targeted RL fine-tuning for scalable, accurate table QA. The code is available at https://github.com/TabibitoQZP/OpenTableR1.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_03018
institution	arXiv
publishDate	2025
record_format	arxiv
title	OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering
topic	Computation and Language
url	https://arxiv.org/abs/2507.03018

Similar Items