Bibliographic Details
Main Authors: Chen, Ziru; White, Michael; Mooney, Raymond; Payani, Ali; Su, Yu; Sun, Huan
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence; Machine Learning
Online Access: https://arxiv.org/abs/2402.10890
author Chen, Ziru
White, Michael
Mooney, Raymond
Payani, Ali
Su, Yu
Sun, Huan
contents In this paper, we examine how large language models (LLMs) solve multi-step problems under a language agent framework with three components: a generator, a discriminator, and a planning method. We investigate the practical utility of two advanced planning methods, iterative correction and tree search. We present a comprehensive analysis of how discrimination accuracy affects the overall performance of agents when using these two methods or a simpler method, re-ranking. Experiments on two tasks, text-to-SQL parsing and mathematical reasoning, show that: (1) advanced planning methods demand discriminators with at least 90% accuracy to achieve significant improvements over re-ranking; (2) current LLMs' discrimination abilities have not met the needs of advanced planning methods to achieve such improvements; (3) with LLM-based discriminators, advanced planning methods may not adequately balance accuracy and efficiency. For example, compared to the other two methods, tree search is at least 10 to 20 times slower but leads to negligible performance gains, which hinders its real-world applications. Code and data are available at https://github.com/OSU-NLP-Group/llm-planning-eval.
format Preprint
id arxiv_https___arxiv_org_abs_2402_10890
institution arXiv
publishDate 2024
record_format arxiv
title When is Tree Search Useful for LLM Planning? It Depends on the Discriminator
topic Computation and Language
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2402.10890
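note The generator, discriminator, and re-ranking setup described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' implementation (their code is at the repository linked in the abstract): the functions rerank, noisy_discriminator, and simulate are hypothetical names, the "generator" just draws boolean candidates, and the default parameters are illustrative values, not figures from the paper.

```python
import random
from typing import Callable, List


def rerank(candidates: List, score: Callable) -> object:
    """Re-ranking: return the candidate the discriminator scores highest."""
    return max(candidates, key=score)


def noisy_discriminator(is_correct: Callable, accuracy: float,
                        rng: random.Random) -> Callable:
    """Hypothetical discriminator that gives the right verdict
    with probability `accuracy` and the wrong one otherwise."""
    def score(candidate) -> float:
        truth = is_correct(candidate)
        verdict = truth if rng.random() < accuracy else not truth
        return 1.0 if verdict else 0.0
    return score


def simulate(accuracy: float, trials: int = 2000, n_candidates: int = 5,
             p_correct: float = 0.4, seed: int = 0) -> float:
    """Fraction of trials in which re-ranking selects a correct candidate.

    Assumption: the toy generator emits booleans, True meaning
    "this candidate program/answer is correct".
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        candidates = [rng.random() < p_correct for _ in range(n_candidates)]
        score = noisy_discriminator(lambda c: c, accuracy, rng)
        hits += int(rerank(candidates, score))
    return hits / trials


if __name__ == "__main__":
    for acc in (0.5, 0.7, 0.9, 1.0):
        print(f"discriminator accuracy {acc:.1f}: "
              f"end-to-end accuracy {simulate(acc):.3f}")
```

Running the script prints how the end-to-end accuracy of the toy agent changes as the simulated discriminator accuracy is raised; iterative correction and tree search would replace the single rerank call with repeated generate-and-score rounds, which is where the extra cost mentioned in the abstract comes from.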