Bibliographic Details
Main Author: Liu, Yizhi
Format: Preprint
Published: 2026
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2601.23039
author Liu, Yizhi
author_facet Liu, Yizhi
contents Differentiable matching layers and residual connection paradigms, often implemented via entropy-regularized Optimal Transport (OT), serve as critical mechanisms in structural prediction and architectural scaling. However, recovering discrete permutations or maintaining identity mappings via annealing $ε\to 0$ is notoriously unstable. In this work, we identify a fundamental mechanism for this failure: Premature Mode Collapse. By analyzing the non-normal dynamics of the Sinkhorn fixed-point map, we reveal a theoretical thermodynamic speed limit: standard exponential cooling outpaces the contraction rate of the inference operator, which degrades as $O(1/ε)$. To address this, we propose Efficient Piecewise Hybrid Adaptive Stability Control (EPH-ASC), an adaptive scheduling algorithm that monitors the stability of the inference process. We demonstrate that EPH-ASC is essential for stabilizing Manifold-Constrained Hyper-Connections (mHC) during large-scale training on the FineWeb-Edu dataset, effectively preventing late-stage gradient explosions by enforcing a linear stability law.
format Preprint
id arxiv_https___arxiv_org_abs_2601_23039
institution arXiv
publishDate 2026
record_format arxiv
title Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2601.23039
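
Illustrative note: the abstract refers to annealing $ε\to 0$ in entropy-regularized OT solved by Sinkhorn iterations. The sketch below shows that general setting only: a log-domain Sinkhorn solver for doubly stochastic marginals, wrapped in a cooling loop that warm-starts the dual potentials. It is not the EPH-ASC algorithm, whose details are not given in this record. The function names, the cost matrix, and the rule that slows cooling when the previous solve did not converge are all hypothetical choices made for illustration.

```python
import numpy as np


def logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_log(cost, eps, f=None, g=None, n_iters=200, tol=1e-9):
    """Log-domain Sinkhorn with unit row and column sums (square cost).

    Returns the plan P, the potentials f and g, the iterations used,
    and the final row-marginal error.
    """
    n, m = cost.shape
    f = np.zeros(n) if f is None else f
    g = np.zeros(m) if g is None else g
    for it in range(1, n_iters + 1):
        # Row update, then column update, of the dual potentials.
        f = -eps * logsumexp((g[None, :] - cost) / eps, axis=1)
        g = -eps * logsumexp((f[:, None] - cost) / eps, axis=0)
        P = np.exp((f[:, None] + g[None, :] - cost) / eps)
        # Columns are exact after the g update, so only rows are checked.
        err = np.abs(P.sum(axis=1) - 1.0).max()
        if err < tol:
            break
    return P, f, g, it, err


def anneal(cost, eps0=1.0, eps_min=0.01, decay=0.7, budget=200, tol=1e-9):
    """Cool eps from eps0 to eps_min, warm-starting the potentials.

    Hypothetical heuristic (NOT EPH-ASC): if the previous solve did not
    reach tol within the budget, cool more gently on the next step.
    """
    eps, f, g = eps0, None, None
    history = []
    while True:
        P, f, g, iters, err = sinkhorn_log(cost, eps, f, g, budget, tol)
        history.append((eps, iters, err))
        if eps <= eps_min:
            break
        step = decay if err < tol else decay ** 0.5
        eps = max(eps * step, eps_min)
    return P, history


# Toy problem (assumed data): one permutation is made clearly cheapest.
rng = np.random.default_rng(0)
perm = rng.permutation(5)
cost = rng.uniform(size=(5, 5))
cost[np.arange(5), perm] -= 10.0

P, history = anneal(cost)
for eps, iters, err in history:
    print(f"eps={eps:.4f}  iters={iters}  row_err={err:.2e}")
```

On this toy cost matrix the plan approaches a permutation matrix as eps shrinks. The per-step iteration counts in `history` give a rough view of how much work each temperature requires, which is the kind of signal an adaptive schedule could monitor.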