Staff View: Sensory-Motor Control with Large Language Models via Iterative Policy Refinement :: Library Catalog

Bibliographic Details
Main Authors:	Carvalho, Jônata Tyska; Nolfi, Stefano
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Human-Computer Interaction; Machine Learning; Robotics
Online Access:	https://arxiv.org/abs/2506.04867

_version_	1866908849152196608
author	Carvalho, Jônata Tyska; Nolfi, Stefano
author_facet	Carvalho, Jônata Tyska; Nolfi, Stefano
contents	We propose a method that enables large language models (LLMs) to control embodied agents through the generation of control policies that directly map continuous observation vectors to continuous action vectors. At the outset, the LLMs generate a control strategy based on a textual description of the agent, its environment, and the intended goal. This strategy is then iteratively refined through a learning process in which the LLMs are repeatedly prompted to improve the current strategy, using performance feedback and sensory-motor data collected during its evaluation. The method is validated on classic control tasks from the Gymnasium library and the inverted pendulum task from the MuJoCo library. The approach proves effective with relatively compact models such as GPT-oss:120b and Qwen2.5:72b. In most cases, it successfully identifies optimal or near-optimal solutions by integrating symbolic knowledge derived through reasoning with sub-symbolic sensory-motor data gathered as the agent interacts with its environment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_04867
institution	arXiv
publishDate	2025
record_format	arxiv
title	Sensory-Motor Control with Large Language Models via Iterative Policy Refinement
topic	Artificial Intelligence; Human-Computer Interaction; Machine Learning; Robotics
url	https://arxiv.org/abs/2506.04867

Similar Items