Bibliographic Details
Main Authors: Kooi, Jacob E., Yang, Zhao, Hoogendoorn, Mark, François-Lavet, Vincent
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2406.09079
contents Deep reinforcement learning agents progressively lose representational capacity during training: neurons become dormant, removing active capacity from the network, and effective rank collapses, leaving the surviving neurons redundant. Existing remedies, such as periodic resets and specialized network architectures, are largely algorithm- or domain-specific. We propose a simple architectural fix, the Hadamard Representation (HR), which replaces a standard hidden layer with the element-wise product of two independently parameterized layers. HR operates through two complementary mechanisms. First, it reduces the probability of a neuron becoming dormant, which is particularly valuable for continuously differentiable activations such as tanh: unlike dormant ReLU neurons, which are effectively pruned, saturated tanh neurons silently corrupt downstream layers by turning their outgoing weights into fixed biases. Second, independently of dormancy, the multiplicative structure captures richer feature interactions and increases effective rank without widening the layer. We evaluate HR across five algorithms and three domains: DQN, PPO, and PQN on pixel-based discrete-action Atari, SimbaV2 on state-based continuous control, and MR.Q on visual continuous control. HR consistently improves performance over strong baselines without any hyperparameter tuning, and the gains persist against parameter-matched wider variants, ruling out parameter count as an alternative explanation.
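The layer described in the abstract can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: it assumes both branches are tanh layers reading the same input, and the names `dense_tanh`, `init_layer`, and `hadamard_layer`, the initializer, and the layer sizes are all hypothetical choices made for the example.

```python
import math
import random


def dense_tanh(x, weights, biases):
    """One fully connected layer followed by tanh: tanh(W x + b)."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Hypothetical initializer: small uniform weights, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def hadamard_layer(x, params_a, params_b):
    """Element-wise product of two independently parameterized layers."""
    h_a = dense_tanh(x, *params_a)
    h_b = dense_tanh(x, *params_b)
    return [a * b for a, b in zip(h_a, h_b)]


rng = random.Random(0)
n_in, n_hidden = 4, 8  # illustrative sizes, not taken from the paper
params_a = init_layer(n_in, n_hidden, rng)
params_b = init_layer(n_in, n_hidden, rng)

x1 = [0.5, -1.0, 0.25, 2.0]
x2 = [-0.5, 1.0, 0.75, -2.0]

# Push unit 0 of branch A into saturation with a large bias, so that
# tanh(...) is approximately 1 for both inputs.
params_a[1][0] = 20.0
a1 = dense_tanh(x1, *params_a)[0]
a2 = dense_tanh(x2, *params_a)[0]

# The product for unit 0 still depends on the input through branch B.
z1 = hadamard_layer(x1, params_a, params_b)
z2 = hadamard_layer(x2, params_a, params_b)
print("branch A, unit 0:", a1, a2)
print("HR output, unit 0:", z1[0], z2[0])
```

In this sketch the saturated branch alone would feed a constant to the next layer, whereas the product keeps varying with the input as long as the other branch is not saturated as well. The layer width stays at `n_hidden`, while the parameter count of the layer doubles, which is why the abstract compares against parameter-matched wider baselines.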
institution arXiv
title Hadamard Representation: Scaffolding Performance Across Model-free RL
topic Machine Learning