Bibliographic Details
Main Authors: Young, Rory; Pugeault, Nicolas
Format: Preprint
Published: 2026
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2604.23312
_version_ 1866915958450290688
author Young, Rory
Pugeault, Nicolas
author_facet Young, Rory
Pugeault, Nicolas
contents Deep reinforcement learning policies achieve strong performance in complex continuous control environments with nonlinear contact forces. However, these policies often produce chaotic state dynamics, in which trivially small changes to the initial conditions significantly alter the long-term behaviour of the control system. This high sensitivity to initial conditions limits the application of deep RL to real-world control systems, where performance and stability guarantees are often required. To address this issue, we propose Global stabilisation via Intrinsic Fine Tuning (GIFT), a general-purpose training framework which directly optimises the global stability of existing high-performing deep RL policies using a custom reward function. We demonstrate that GIFT increases the stability of the control interaction while maintaining comparable task performance, thereby improving the suitability of deep RL policies for real-world control systems.
format Preprint
id arxiv_https___arxiv_org_abs_2604_23312
institution arXiv
publishDate 2026
record_format arxiv
title GIFT: Global stabilisation via Intrinsic Fine Tuning
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2604.23312