:: Library Catalog

Bibliographic Details
Main Authors:	Hayakawa, Daichi; Sato, Issei
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2410.12413

Similar Items

Length Generalization of Causal Transformers without Position Encoding
by: Wang, Jie, et al.
Published: (2024)

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
by: Xu, Kevin, et al.
Published: (2024)

Rethinking Associative Memory Mechanism in Induction Head
by: Wang, Shuo, et al.
Published: (2024)

Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection
by: Fujikawa, Shota, et al.
Published: (2026)

A Formal Comparison Between Chain of Thought and Latent Thought
by: Xu, Kevin, et al.
Published: (2025)

On the Geometry of Positional Encodings in Transformers
by: Cirrincione, Giansalvo
Published: (2026)

ExPe: Exact Positional Encodings for Generative Transformer Models with Extrapolating Capabilities
by: Datseris, Aleksis, et al.
Published: (2025)

Fusion Matters: Length-Aware Analysis of Positional-Encoding Fusion in Transformers
by: Hallam, Mohamed Amine, et al.
Published: (2026)

Theoretical Analysis of Byte-Pair Encoding
by: Kozma, László, et al.
Published: (2024)

SeqPE: Transformer with Sequential Position Encoding
by: Li, Huayang, et al.
Published: (2025)

Length Extrapolation of Transformers: A Survey from the Perspective of Positional Encoding
by: Zhao, Liang, et al.
Published: (2023)

Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings
by: Zuo, Chunsheng, et al.
Published: (2024)

PaTH Attention: Position Encoding via Accumulating Householder Transformations
by: Yang, Songlin, et al.
Published: (2025)

Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph
by: Irie, Kazuki
Published: (2024)

Beyond Sinusoids: A Morlet Wavelet Framework for Transformer Positional Encoding
by: Zeris, Athanasios
Published: (2026)

Understanding Transformer Optimization via Gradient Heterogeneity
by: Tomihari, Akiyoshi, et al.
Published: (2025)

How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
by: Huang, Ruiquan, et al.
Published: (2025)

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition
by: Wang, Yong, et al.
Published: (2024)

Hierarchical Bracketing Encodings Work for Dependency Graphs
by: Ezquerro, Ana, et al.
Published: (2025)

Hierarchical Bracketing Encodings for Dependency Parsing as Tagging
by: Ezquerro, Ana, et al.
Published: (2025)

A Morphology-Based Investigation of Positional Encodings
by: Ghosh, Poulami, et al.
Published: (2024)

Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis
by: Li, Hongkang, et al.
Published: (2024)

Group Representational Position Encoding
by: Zhang, Yifan, et al.
Published: (2025)

DAPE: Data-Adaptive Positional Encoding for Length Extrapolation
by: Zheng, Chuanyang, et al.
Published: (2024)

Energy-Gated Attention and Wavelet Positional Encoding: Complementary Inductive Biases for Transformer Attention
by: Zeris, Athanasios
Published: (2026)

2D-TPE: Two-Dimensional Positional Encoding Enhances Table Understanding for Large Language Models
by: Li, Jia-Nan, et al.
Published: (2024)

KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models
by: Bai, Yuyang, et al.
Published: (2023)

Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer
by: Zhu, Yongxin, et al.
Published: (2024)

On the Encoding of Gender in Transformer-based ASR Representations
by: Krishnan, Aravind, et al.
Published: (2024)

Encoding Hierarchical Schema via Concept Flow for Multifaceted Ideology Detection
by: Liu, Songtao, et al.
Published: (2024)

Contextual Position Encoding: Learning to Count What's Important
by: Golovneva, Olga, et al.
Published: (2024)

Positional Encoding via Token-Aware Phase Attention
by: Wang, Yu, et al.
Published: (2025)

On the Interplay between Positional Encodings, Morphological Complexity, and Word Order Flexibility
by: Tatariya, Kushal, et al.
Published: (2025)

Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling
by: Wang, Zhenghua, et al.
Published: (2025)

Theoretical Analysis of Weak-to-Strong Generalization
by: Lang, Hunter, et al.
Published: (2024)

CMATH: Cross-Modality Augmented Transformer with Hierarchical Variational Distillation for Multimodal Emotion Recognition in Conversation
by: Zhu, Xiaofei, et al.
Published: (2024)

Do Language Models Encode Semantic Relations? Probing and Sparse Feature Analysis
by: Diera, Andor, et al.
Published: (2026)

Text-Based Correlation Matrix in Multi-Asset Allocation
by: Nakayama, Yasuhiro, et al.
Published: (2024)

PoPE: Legendre Orthogonal Polynomials Based Position Encoding for Large Language Models
by: Aggarwal, Arpit
Published: (2024)

Advancing Neural Encoding of Portuguese with Transformer Albertina PT-*
by: Rodrigues, João, et al.
Published: (2023)