Saved in:
Bibliographic Details
Main Authors: Gaidi, Safae, Slaoui, Abdallah, Falaki, Mohammed EL, Jaouadi, Amine
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2601.01252
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • Non-Markovian memory effects in open quantum systems provide valuable resources for preserving coherence and enhancing controllability. However, exploiting them requires strategies adapted to history-dependent dynamics. We introduce a reinforcement-learning framework that autonomously learns to amplify information backflow in a driven two-level system coupled to a structured reservoir. Using a reward based on the positive time derivative of the trace distance associated with the Breuer-Laine-Piilo measure, we train PPO and SAC agents and benchmark their performance against gradient-based optimal control theory (OCT). While OCT enhances a single dominant backflow peak, RL policies broaden this revival and activate additional contributions in later memory windows, producing sustained positive trace-distance growth over a longer duration. Consequently, the integrated non-Markovianity achieved by RL substantially exceeds that obtained with OCT. These results demonstrate how long-horizon, model-free learning naturally uncovers distributed-backflow strategies and highlight the potential of reinforcement learning for engineering memory effects in open quantum systems.