Staff View :: Library Catalog

Bibliographic Details
Main Author:	Wang, Zhaoyue
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2401.12459

_version_	1866910464616693760
author	Wang, Zhaoyue
author_facet	Wang, Zhaoyue
contents	When we design and deploy a Reinforcement Learning (RL) agent, the reward function motivates the agent to achieve an objective. An incorrect or incomplete specification of the objective can result in behavior that does not align with human values: the agent may fail to adhere to social and moral norms, which are ambiguous and context dependent, and may cause undesired outcomes such as negative side effects and unsafe exploration. Previous work has manually defined reward functions to avoid negative side effects, used human oversight for safe exploration, or used foundation models as planning tools. This work studies the ability to leverage Large Language Models' (LLMs') understanding of morality and social norms in safe-exploration-augmented RL methods. It evaluates the language model's results against human feedback and demonstrates the language model's capability to serve as a direct reward signal.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_12459
institution	arXiv
publishDate	2024
record_format	arxiv
title	Towards Socially and Morally Aware RL agent: Reward Design With LLM
topic	Artificial Intelligence
url	https://arxiv.org/abs/2401.12459
