Bibliographic Details
Main Authors:	Wang, Mingze; E, Weinan
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.00522

Similar Items

On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks
by: Wang, Mingze, et al.
Published: (2025)

How Transformers Get Rich: Approximation and Dynamics Analysis
by: Wang, Mingze, et al.
Published: (2024)

GradPower: Powering Gradients for Faster Language Model Pre-Training
by: Wang, Jinbo, et al.
Published: (2025)

The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
by: Wang, Jinbo, et al.
Published: (2025)

On the Expressive Power of Floating-Point Transformers
by: Park, Sejun, et al.
Published: (2026)

On the Expressive Power of Contextual Relations in Transformers
by: Fraiman, Demián
Published: (2026)

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models
by: Cooper, John, et al.
Published: (2026)

More Expressive Feedforward Layers: Part I. Token-Adaptive Mixing of Activations
by: Wang, Mingze, et al.
Published: (2026)

Transformers are Expressive, But Are They Expressive Enough for Regression?
by: Nath, Swaroop, et al.
Published: (2024)

Exact Expressive Power of Transformers with Padding
by: Merrill, William, et al.
Published: (2025)

The Expressive Power of Transformers with Chain of Thought
by: Merrill, William, et al.
Published: (2023)

Understanding and Enhancing Mask-Based Pretraining towards Universal Representations
by: Dong, Mingze, et al.
Published: (2025)

Towards Understanding the Expressive Power of GNNs with Global Readout
by: Funk, Maurice, et al.
Published: (2026)

Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
by: DiGiugno, Andrew, et al.
Published: (2025)

On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers
by: Zhou, Cai, et al.
Published: (2024)

On The Expressive Power of GNN Derivatives
by: Eitan, Yam, et al.
Published: (2025)

Lyra: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences
by: Ramesh, Krithik, et al.
Published: (2025)

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
by: Xu, Kevin, et al.
Published: (2024)

Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
by: Walker, Benjamin, et al.
Published: (2025)

On the Expressive Power of Transformers for Maxout Networks and Continuous Piecewise Linear Functions
by: Gu, Linyan, et al.
Published: (2026)

Expressive Power of Temporal Message Passing
by: Wałęga, Przemysław Andrzej, et al.
Published: (2024)

Rethinking the Expressive Power of GNNs via Graph Biconnectivity
by: Zhang, Bohang, et al.
Published: (2023)

Expanding Expressivity in Transformer Models with MöbiusAttention
by: Halacheva, Anna-Maria, et al.
Published: (2024)

The Expressive Power of Low Precision Softmax Transformers with (Summarized) Chain-of-Thought
by: Brösamle, Moritz, et al.
Published: (2026)

GNNs Meet Sequence Models Along the Shortest-Path: an Expressive Method for Link Prediction
by: Ferrini, Francesco, et al.
Published: (2025)

On the Expressive Power of GNNs to Solve Linear SDPs
by: Qian, Chendi, et al.
Published: (2026)

Understanding Expressivity of GNN in Rule Learning
by: Qiu, Haiquan, et al.
Published: (2023)

k-Maximum Inner Product Attention for Graph Transformers and the Expressive Power of GraphGPS
by: De Schouwer, Jonas, et al.
Published: (2026)

On the Expressive Power of Graph Neural Networks
by: Nalwade, Ashwin, et al.
Published: (2024)

On the Expressive Power of Sparse Geometric MPNNs
by: Sverdlov, Yonatan, et al.
Published: (2024)

On the Expressive Power of GNNs for Boolean Satisfiability
by: Peltonen, Saku, et al.
Published: (2026)

Improving Generalization and Convergence by Enhancing Implicit Regularization
by: Wang, Mingze, et al.
Published: (2024)

On the Expressive Power of Subgraph Graph Neural Networks for Graphs with Bounded Cycles
by: Chen, Ziang, et al.
Published: (2025)

Expressivity of Transformers: A Tropical Geometry Perspective
by: Su, Ye, et al.
Published: (2026)

On the Expressive Power and Limitations of Multi-Layer SSMs
by: Zubić, Nikola, et al.
Published: (2026)

On the Expressive Power of Permutation-Equivariant Weight-Space Networks
by: Dayan, Adir, et al.
Published: (2026)

Maximising Quantum-Computing Expressive Power through Randomised Circuits
by: Yang, Yingli, et al.
Published: (2023)

A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
by: Merrill, William, et al.
Published: (2025)

A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
by: Wang, Mingze, et al.
Published: (2023)

On the Expressive Power of Tree-Structured Probabilistic Circuits
by: Yin, Lang, et al.
Published: (2024)