Library Catalog

Bibliographic Details
Main Authors:	Zhou, Zihan; Qin, Bo-Wei; Du, Kai; Lin, Wei
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.19816

Similar Items

Emergence of Episodic Memory in Transformers: Characterizing Changes in Temporal Structure of Attention Scores During Training
by: Mistry, Deven Mahesh, et al.
Published: (2025)

Attention Dispersion in Dynamic Graph Transformers: Diagnosis and a Transferable Fix
by: Zhang, Jinhao, et al.
Published: (2026)

Correlation-Attention Masked Temporal Transformer for User Identity Linkage Using Heterogeneous Mobility Data
by: Yan, Ziang, et al.
Published: (2025)

What Matters in Transformers? Not All Attention is Needed
by: He, Shwai, et al.
Published: (2024)

Siamese Multiple Attention Temporal Convolution Networks for Human Mobility Signature Identification
by: Zheng, Zhipeng, et al.
Published: (2024)

LiteAttention: A Temporal Sparse Attention for Diffusion Transformers
by: Shmilovich, Dor, et al.
Published: (2025)

DRL-TH: Jointly Utilizing Temporal Graph Attention and Hierarchical Fusion for UGV Navigation in Crowded Environments
by: Li, Ruitong, et al.
Published: (2025)

Physics-informed Attention-enhanced Fourier Neural Operator for Solar Magnetic Field Extrapolations
by: Cao, Jinghao, et al.
Published: (2025)

Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
by: Wang, Jiapu, et al.
Published: (2024)

Earthfarseer: Versatile Spatio-Temporal Dynamical Systems Modeling in One Model
by: Wu, Hao, et al.
Published: (2023)

Context and Diversity Matter: The Emergence of In-Context Learning in World Models
by: Wang, Fan, et al.
Published: (2025)

Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
by: Adhikari, Rabin
Published: (2025)

DST-GTN: Dynamic Spatio-Temporal Graph Transformer Network for Traffic Forecasting
by: Huang, Songtao, et al.
Published: (2024)

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
by: Tian, Yuandong, et al.
Published: (2023)

Learning from Complexity: Exploring Dynamic Sample Pruning of Spatio-Temporal Training
by: Chen, Wei, et al.
Published: (2026)

Probing Routing-Conditional Calibration in Attention-Residual Transformers
by: Liang, Wenhao, et al.
Published: (2026)

Game-Time: Evaluating Temporal Dynamics in Spoken Language Models
by: Chang, Kai-Wei, et al.
Published: (2025)

MSVIT: Improving Spiking Vision Transformer Using Multi-scale Attention Fusion
by: Hua, Wei, et al.
Published: (2025)

A Distributed Hierarchical Spatio-Temporal Edge-Enhanced Graph Neural Network for City-Scale Dynamic Logistics Routing
by: Han, Zihan, et al.
Published: (2025)

TimeMM: Time-as-Operator Spectral Filtering for Dynamic Multimodal Recommendation
by: Yang, Wei, et al.
Published: (2026)

TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
by: Xu, Yaoyao, et al.
Published: (2025)

A Framework for Analyzing Abnormal Emergence in Service Ecosystems Through LLM-based Agent Intention Mining
by: Shen, Yifan, et al.
Published: (2025)

A Hierarchical Framework with Spatio-Temporal Consistency Learning for Emergence Detection in Complex Adaptive Systems
by: Chen, Siyuan, et al.
Published: (2024)

Attention Basin: Why Contextual Position Matters in Large Language Models
by: Yi, Zihao, et al.
Published: (2025)

On the Emergence of Syntax by Means of Local Interaction
by: Wei, Zichao
Published: (2026)

Neural Dynamics Self-Attention for Spiking Transformers
by: Zhang, Dehao, et al.
Published: (2026)

Transformer for Object Re-Identification: A Survey
by: Ye, Mang, et al.
Published: (2024)

Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning
by: Huang, Rikui, et al.
Published: (2026)

Dynamic Topic Evolution with Temporal Decay and Attention in Large Language Models
by: Wu, Di, et al.
Published: (2025)

Patch Hierarchical Attention Transformer for Efficient Particle Jet Tagging
by: Wang, Aaron, et al.
Published: (2026)

Pay Attention to What Matters
by: Silva, Pedro Luiz, et al.
Published: (2024)

Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
by: Zheng, Haoyang, et al.
Published: (2024)

STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation
by: Wang, Yiming, et al.
Published: (2025)

Attn-QAT: 4-Bit Attention With Quantization-Aware Training
by: Zhang, Peiyuan, et al.
Published: (2026)

C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling
by: Qin, Jin, et al.
Published: (2025)

State Rank Dynamics in Linear Attention LLMs
by: Sun, Ao, et al.
Published: (2026)

Unveiling and Controlling Anomalous Attention Distribution in Transformers
by: Yan, Ruiqing, et al.
Published: (2024)

Temporal-Aware Graph Attention Network for Cryptocurrency Transaction Fraud Detection
by: Zheng, Zhi, et al.
Published: (2025)

On the Emergence of Cross-Task Linearity in the Pretraining-Finetuning Paradigm
by: Zhou, Zhanpeng, et al.
Published: (2024)

Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
by: Yu, Xinyao, et al.
Published: (2024)