:: Library Catalog

Bibliographic Details
Main Author:	Alekseev, Sergey
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2604.11890

Similar Items

Exploring and Improving Initialization for Deep Graph Neural Networks: A Signal Propagation Perspective
by: Wang, Senmiao, et al.
Published: (2025)

Geometric Dynamics of Signal Propagation Predict Trainability of Transformers
by: Cowsik, Aditya, et al.
Published: (2024)

Stronger Normalization-Free Transformers
by: Chen, Mingzhi, et al.
Published: (2025)

DyTTP: Trajectory Prediction with Normalization-Free Transformers
by: Zhu, JianLin, et al.
Published: (2025)

Beyond Gaussian Initializations: Signal Preserving Weight Initialization for Odd-Sigmoid Activations
by: Lee, Hyunwoo, et al.
Published: (2025)

Beyond Oversquashing: Understanding Signal Propagation in GNNs Via Observables
by: Nagar, Eden, et al.
Published: (2026)

Convolutional Signal Propagation: A Simple Scalable Algorithm for Hypergraphs
by: Procházka, Pavel, et al.
Published: (2024)

Normalize Then Propagate: Efficient Homophilous Regularization for Few-shot Semi-Supervised Node Classification
by: Zhang, Baoming, et al.
Published: (2025)

FlashNorm: Fast Normalization for Transformers
by: Graef, Nils, et al.
Published: (2024)

No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
by: Kumar, Tanishq, et al.
Published: (2024)

Uncertainty Propagation in the Fast Fourier Transform
by: Schmid, Luca, et al.
Published: (2025)

Conditional Pseudo-Reversible Normalizing Flow for Surrogate Modeling in Quantifying Uncertainty Propagation
by: Yang, Minglei, et al.
Published: (2024)

ART: Artifact Removal Transformer for Reconstructing Noise-Free Multichannel Electroencephalographic Signals
by: Chuang, Chun-Hsiang, et al.
Published: (2024)

Normalized Matching Transformer
by: Pourhadi, Abtin, et al.
Published: (2025)

Nash Initialization for Recurrent Depth Transformers: Stable Signal Propagation at Initialization Without Layer Normalization
by: Bigeard, Nicolas
Published: (2026)

Learning in Compact Spaces with Approximately Normalized Transformer
by: Franke, Jörg K. H., et al.
Published: (2025)

Real-time Prediction of Urban Sound Propagation with Conditioned Normalizing Flows
by: Eckerle, Achim, et al.
Published: (2025)

Universal Learning of Stochastic Dynamics for Exact Belief Propagation using Bernstein Normalizing Flows
by: Amorese, Peter, et al.
Published: (2025)

Observable Propagation: Uncovering Feature Vectors in Transformers
by: Dunefsky, Jacob, et al.
Published: (2023)

Graph Propagation Transformer for Graph Representation Learning
by: Chen, Zhe, et al.
Published: (2023)

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation
by: Xu, Jie, et al.
Published: (2026)

The Free Transformer
by: Fleuret, François
Published: (2025)

Learning Rate Transfer in Normalized Transformers
by: Shigida, Boris, et al.
Published: (2026)

Mind the Gap: a Spectral Analysis of Rank Collapse and Signal Propagation in Attention Layers
by: Saada, Thiziri Nait, et al.
Published: (2024)

UnitNorm: Rethinking Normalization for Transformers in Time Series
by: Huang, Nan, et al.
Published: (2024)

Neural Click Models for Recommender Systems
by: Shirokikh, Mikhail, et al.
Published: (2024)

Scalable Back-Propagation-Free Training of Optical Physics-Informed Neural Networks
by: Zhao, Yequan, et al.
Published: (2025)

How Long Does Infinite Width Last? Signal Propagation in Long-Range Linear Recurrences
by: Seleznova, Mariia
Published: (2026)

Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
by: Kinderman, Edan, et al.
Published: (2024)

Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
by: Zhang, Zhongwang, et al.
Published: (2024)

Local to Global: Learning Dynamics and Effect of Initialization for Transformers
by: Makkuva, Ashok Vardhan, et al.
Published: (2024)

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation
by: Fang, Zhenhan, et al.
Published: (2026)

Looped Transformers with Layer Normalization Provably Learn the Power Method
by: Wu, Lyumin, et al.
Published: (2026)

Free-form Flows: Make Any Architecture a Normalizing Flow
by: Draxler, Felix, et al.
Published: (2023)

Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs
by: Lin, Jiaqi, et al.
Published: (2025)

The Radio-Frequency Transformer for Signal Separation
by: Lifar, Egor, et al.
Published: (2026)

Stability of Transformers under Layer Normalization
by: Kan, Kelvin, et al.
Published: (2025)

Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences
by: Li, Siquan, et al.
Published: (2026)

Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models
by: Kedia, Akhil, et al.
Published: (2024)

Dataset-Free Weight-Initialization on Restricted Boltzmann Machine
by: Yasuda, Muneki, et al.
Published: (2024)