| Main Author: | Qiu, Zipeng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2507.03018 |
| _version_ | 1866909675581079552 |
|---|---|
| author | Qiu, Zipeng |
| author_facet | Qiu, Zipeng |
| contents | Open-domain table question answering traditionally relies on a two-stage pipeline: static table retrieval followed by a closed-domain answer. In contrast, we propose an end-to-end agentic framework that embeds multi-turn tool calls (using a BM25+-based search API and a SQLite SQL executor) directly into a large language model. To further adapt a compact 4B-parameter model, we introduce a two-stage fine-tuning process: supervised cold-start on easy questions, then Async GRPO reinforcement learning on harder cases with LoRA adapters and a rollout buffer. This unified approach enables the model to jointly retrieve, reason, and execute queries, yielding a dramatic accuracy improvement from single-digit zero-shot performance to over 0.86 exact match on a held-out test set. Our results underscore the effectiveness of integrating structured tool calls with targeted RL fine-tuning for scalable, accurate table QA. The code is available at https://github.com/TabibitoQZP/OpenTableR1. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2507_03018 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | OpenTable-R1: A Reinforcement Learning Augmented Tool Agent for Open-Domain Table Question Answering |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2507.03018 |
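
The abstract names two tools that the agent calls: a BM25+-based search API and a SQLite SQL executor. The sketch below illustrates what such a pair of tools might look like in plain Python. It is not taken from the OpenTableR1 repository: the class `BM25PlusIndex`, the function `run_sql`, the whitespace tokenizer, the parameter values (`k1`, `b`, `delta`), and the example tables are all hypothetical, and the scorer is a simplified BM25+ that only scores query terms present in a document.

```python
import math
import sqlite3
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the paper's tokenizer)."""
    return text.lower().split()


class BM25PlusIndex:
    """Simplified BM25+ index over short table descriptions (hypothetical helper)."""

    def __init__(self, docs, k1=1.5, b=0.75, delta=1.0):
        # docs maps a table name to a text description of that table.
        self.names = list(docs)
        self.tokens = [tokenize(docs[name]) for name in self.names]
        self.k1, self.b, self.delta = k1, b, delta
        self.avgdl = sum(len(t) for t in self.tokens) / len(self.tokens)
        # Document frequency: in how many descriptions each term appears.
        df = Counter()
        for toks in self.tokens:
            df.update(set(toks))
        n = len(self.tokens)
        self.idf = {term: math.log((n + 1) / count) for term, count in df.items()}

    def search(self, query, top_k=3):
        """Return up to top_k table names, best BM25+ score first."""
        scored = []
        for name, toks in zip(self.names, self.tokens):
            tf = Counter(toks)
            norm = self.k1 * (1 - self.b + self.b * len(toks) / self.avgdl)
            score = 0.0
            for term in tokenize(query):
                f = tf[term]
                if f == 0:
                    continue  # only terms present in the document contribute
                # BM25 term weight plus the BM25+ lower-bound constant delta.
                score += self.idf[term] * (f * (self.k1 + 1) / (f + norm) + self.delta)
            scored.append((score, name))
        scored.sort(key=lambda pair: -pair[0])
        return [name for _, name in scored[:top_k]]


def run_sql(conn, sql, max_rows=20):
    """Execute a query and return rows or an error message the model can read."""
    try:
        rows = conn.execute(sql).fetchmany(max_rows)
        return {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        return {"ok": False, "error": str(exc)}


# Hypothetical demo data: one real table plus descriptions for retrieval.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, gold INTEGER)")
conn.executemany("INSERT INTO medals VALUES (?, ?)", [("Norway", 16), ("Germany", 12)])

index = BM25PlusIndex({
    "medals": "olympic medals country gold",
    "cities": "city population area",
})

hits = index.search("which country won the most gold medals")
result = run_sql(conn, "SELECT country FROM medals ORDER BY gold DESC LIMIT 1")
failure = run_sql(conn, "SELECT * FROM missing_table")
```

In a multi-turn loop, the model would first call `search` to find candidate tables, then issue SQL through `run_sql`. Returning errors as data rather than raising lets the model see a failed query and retry in the next turn.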