Guardado en:
Detalles Bibliográficos
Autores principales: Cheng, Pengyu, Hu, Tianhao, Xu, Han, Zhang, Zhisong, Yuan, Zheng, Dai, Yong, Han, Lei, Du, Nan, Li, Xiaolong
Formato: Preprint
Publicado: 2024
Materias:
Acceso en línea:https://arxiv.org/abs/2404.10642
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
_version_ 1866912201623732224
author Cheng, Pengyu
Hu, Tianhao
Xu, Han
Zhang, Zhisong
Yuan, Zheng
Dai, Yong
Han, Lei
Du, Nan
Li, Xiaolong
author_facet Cheng, Pengyu
Hu, Tianhao
Xu, Han
Zhang, Zhisong
Yuan, Zheng
Dai, Yong
Han, Lei
Du, Nan
Li, Xiaolong
contents We explore the potential of self-play training for large language models (LLMs) in a two-player adversarial language game called Adversarial Taboo. In this game, an attacker and a defender communicate around a target word only visible to the attacker. The attacker aims to induce the defender to speak the target word unconsciously, while the defender tries to infer the target word from the attacker's utterances. To win the game, both players must have sufficient knowledge about the target word and high-level reasoning ability to infer and express in this information-reserved conversation. Hence, we are curious about whether LLMs' reasoning ability can be further enhanced by Self-Playing this Adversarial language Game (SPAG). With this goal, we select several open-source LLMs and let each act as the attacker and play with a copy of itself as the defender on an extensive range of target words. Through reinforcement learning on the game outcomes, we observe that the LLMs' performances uniformly improve on a broad range of reasoning benchmarks. Furthermore, iteratively adopting this self-play process can continuously promote LLMs' reasoning abilities. The code is available at https://github.com/Linear95/SPAG.
format Preprint
id arxiv_https___arxiv_org_abs_2404_10642
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Self-playing Adversarial Language Game Enhances LLM Reasoning
Cheng, Pengyu
Hu, Tianhao
Xu, Han
Zhang, Zhisong
Yuan, Zheng
Dai, Yong
Han, Lei
Du, Nan
Li, Xiaolong
Computation and Language
Machine Learning
We explore the potential of self-play training for large language models (LLMs) in a two-player adversarial language game called Adversarial Taboo. In this game, an attacker and a defender communicate around a target word only visible to the attacker. The attacker aims to induce the defender to speak the target word unconsciously, while the defender tries to infer the target word from the attacker's utterances. To win the game, both players must have sufficient knowledge about the target word and high-level reasoning ability to infer and express in this information-reserved conversation. Hence, we are curious about whether LLMs' reasoning ability can be further enhanced by Self-Playing this Adversarial language Game (SPAG). With this goal, we select several open-source LLMs and let each act as the attacker and play with a copy of itself as the defender on an extensive range of target words. Through reinforcement learning on the game outcomes, we observe that the LLMs' performances uniformly improve on a broad range of reasoning benchmarks. Furthermore, iteratively adopting this self-play process can continuously promote LLMs' reasoning abilities. The code is available at https://github.com/Linear95/SPAG.
title Self-playing Adversarial Language Game Enhances LLM Reasoning
topic Computation and Language
Machine Learning
url https://arxiv.org/abs/2404.10642