| Main Authors: | He, Weiyi, Xing, Yue |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.09275 |
Similar Items
Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
by: Ma, Qian, et al.
Published: (2025)
On Rademacher Complexity-based Generalization Bounds for Deep Learning
by: Truong, Lan V.
Published: (2022)
Bridging the Gap: Rademacher Complexity in Robust and Standard Generalization
by: Xiao, Jiancong, et al.
Published: (2024)
Superiority of Multi-Head Attention in In-Context Linear Regression
by: Cui, Yingqian, et al.
Published: (2024)
On the Geometry of Positional Encodings in Transformers
by: Cirrincione, Giansalvo
Published: (2026)
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
by: Choromanski, Krzysztof Marcin, et al.
Published: (2023)
Improving Transformers using Faithful Positional Encoding
by: Idé, Tsuyoshi, et al.
Published: (2024)
Comparing Graph Transformers via Positional Encodings
by: Black, Mitchell, et al.
Published: (2024)
Transformers Learn Robust In-Context Regression under Distributional Uncertainty
by: Cao, Hoang T. H., et al.
Published: (2026)
Graph Transformers without Positional Encodings
by: Garg, Ayush
Published: (2024)
Rademacher Complexity of Neural ODEs via Chen-Fliess Series
by: Hanson, Joshua, et al.
Published: (2024)
Size Transferability of Graph Transformers with Convolutional Positional Encodings
by: Porras-Valenzuela, Javier, et al.
Published: (2026)
Benchmarking Positional Encodings for GNNs and Graph Transformers
by: Grötschla, Florian, et al.
Published: (2024)
A Gapped Scale-Sensitive Dimension and Lower Bounds for Offset Rademacher Complexity
by: Jia, Zeyu, et al.
Published: (2025)
DAM-GT: Dual Positional Encoding-Based Attention Masking Graph Transformer for Node Classification
by: Li, Chenyang, et al.
Published: (2025)
CoPE: A Lightweight Complex Positional Encoding
by: Amballa, Avinash
Published: (2025)
Tab-PET: Graph-Based Positional Encodings for Tabular Transformers
by: Leng, Yunze, et al.
Published: (2025)
Dynamic Graph Transformer with Correlated Spatial-Temporal Positional Encoding
by: Wang, Zhe, et al.
Published: (2024)
Provable In-Context Learning of Nonlinear Regression with Transformers
by: Li, Hongbo, et al.
Published: (2025)
Lean Formalization of Generalization Error Bound by Rademacher Complexity and Dudley's Entropy Integral
by: Sonoda, Sho, et al.
Published: (2025)
Revisiting the Relationship between Adversarial and Clean Training: Why Clean Training Can Make Adversarial Training Better
by: Zhou, MingWei, et al.
Published: (2025)
Graph Transformer with Disease Subgraph Positional Encoding for Improved Comorbidity Prediction
by: Qin, Xihan, et al.
Published: (2025)
Positional Encoding in Transformer-Based Time Series Models: A Survey
by: Irani, Habib, et al.
Published: (2025)
Position Encoding with Random Float Sampling Enhances Length Generalization of Transformers
by: Shimizu, Atsushi, et al.
Published: (2026)
Rademacher Meets Colors: More Expressivity, but at What Cost ?
by: Carrasco, Martin, et al.
Published: (2025)
One-Step Early Stopping Strategy using Neural Tangent Kernel Theory and Rademacher Complexity
by: Xavier, Daniel Martin, et al.
Published: (2024)
Dynamics of Transient Structure in In-Context Linear Regression Transformers
by: Carroll, Liam, et al.
Published: (2025)
Do we really need the Rademacher complexities?
by: Bartl, Daniel, et al.
Published: (2025)
Resilience of Rademacher chaos of low degree
by: Aigner-Horev, Elad, et al.
Published: (2024)
On the Clean Generalization and Robust Overfitting in Adversarial Training from Two Theoretical Views: Representation Complexity and Training Dynamics
by: Li, Binghui, et al.
Published: (2023)
Ensuring Calibration Robustness in Split Conformal Prediction Under Adversarial Attacks
by: Qian, Xunlei, et al.
Published: (2025)
Disentangled and Distilled Encoder for Out-of-Distribution Reasoning with Rademacher Guarantees
by: Rahiminasab, Zahra, et al.
Published: (2025)
Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings
by: Zuo, Chunsheng, et al.
Published: (2024)
A Simple Reduction Scheme for Constrained Contextual Bandits with Adversarial Contexts via Regression
by: Sarkar, Dhruv, et al.
Published: (2026)
Position: The Turing-Completeness of Autoregressive Transformers Relies Heavily on Context Management
by: Cui, Guanyu, et al.
Published: (2026)
PaTH Attention: Position Encoding via Accumulating Householder Transformations
by: Yang, Songlin, et al.
Published: (2025)
Fusion Matters: Length-Aware Analysis of Positional-Encoding Fusion in Transformers
by: Hallam, Mohamed Amine, et al.
Published: (2026)
Algebraic Positional Encodings
by: Kogkalidis, Konstantinos, et al.
Published: (2023)
Auto-Encoding Adversarial Imitation Learning
by: Zhang, Kaifeng, et al.
Published: (2022)
Transformers Don't In-Context Learn Least Squares Regression
by: Hill, Joshua, et al.
Published: (2025)