Library Catalog

Bibliographic Details
Main Authors:	Wang, Yi; Fang, Ruoyi; Xie, Anzhuo; Feng, Hanrui; Lai, Jianlin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2511.12122

Similar Items

Deep Learning Approach for Clinical Risk Identification Using Transformer Modeling of Heterogeneous EHR Data
by: Xie, Anzhuo, et al.
Published: (2025)

Application of Deep Generative Models for Anomaly Detection in Complex Financial Transactions
by: Tang, Tengda, et al.
Published: (2025)

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
by: Wang, Hanrui, et al.
Published: (2020)

A Deep Learning Approach to Anomaly Detection in High-Frequency Trading Data
by: Bao, Qiuliuyang, et al.
Published: (2025)

Enhancing Transformer Training Efficiency with Dynamic Dropout
by: Yan, Hanrui, et al.
Published: (2024)

ATM-GAD: Adaptive Temporal Motif Graph Anomaly Detection for Financial Transaction Networks
by: Zhang, Zeyue, et al.
Published: (2025)

Improving Transformers with Dynamically Composable Multi-Head Attention
by: Xiao, Da, et al.
Published: (2024)

Self-Attention Mechanism in Multimodal Context for Banking Transaction Flow
by: Delestre, Cyrile, et al.
Published: (2024)

Adaptive Head Budgeting for Efficient Multi-Head Attention
by: Faye, Bilal, et al.
Published: (2026)

Multi-Head Low-Rank Attention
by: Liu, Songtao, et al.
Published: (2026)

Stability and Generalization of Hypergraph Collaborative Networks
by: Ng, Michael, et al.
Published: (2023)

Dynamic Rank Reinforcement Learning for Adaptive Low-Rank Multi-Head Self Attention in Large Language Models
by: Erden, Caner
Published: (2025)

Hyperspectral Anomaly Detection with Self-Supervised Anomaly Prior
by: Liu, Yidan, et al.
Published: (2024)

Interpretable Hierarchical Attention Network for Medical Condition Identification
by: Fang, Dongping, et al.
Published: (2024)

Quantum Graph Attention Network: A Novel Quantum Multi-Head Attention Mechanism for Graph Learning
by: Ning, An, et al.
Published: (2025)

Unveiling Simplicities of Attention: Adaptive Long-Context Head Identification
by: Donhauser, Konstantin, et al.
Published: (2025)

Memorization Capacity of Multi-Head Attention in Transformers
by: Mahdavi, Sadegh, et al.
Published: (2023)

An Empirical Study of Multi-Generation Sampling for Jailbreak Detection in Large Language Models
by: Luo, Hanrui, et al.
Published: (2026)

MoH: Multi-Head Attention as Mixture-of-Head Attention
by: Jin, Peng, et al.
Published: (2024)

Anomaly Detection in High-Dimensional Bank Account Balances via Robust Methods
by: Maddanu, Federico, et al.
Published: (2025)

Unsupervised Graph Modeling for Anomaly Detection in Accounting Subject Relationships
by: Wang, Yuhan, et al.
Published: (2026)

Temporal-Aware Graph Attention Network for Cryptocurrency Transaction Fraud Detection
by: Zheng, Zhi, et al.
Published: (2025)

Multi-Head Spectral-Adaptive Graph Anomaly Detection
by: Cao, Qingyue, et al.
Published: (2025)

Dynamic Adaptive Shared Experts with Grouped Multi-Head Attention Mixture of Experts
by: Li, Cheng, et al.
Published: (2025)

Multi-Head Attention as a Source of Catastrophic Forgetting in MoE Transformers
by: Chen, Anrui, et al.
Published: (2026)

In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention
by: He, Jianliang, et al.
Published: (2025)

Gradient Flow Structure and Quantitative Dynamics of Multi-Head Self-Attention
by: Pendharkar, Ayan
Published: (2026)

Interleaved Head Attention
by: Duvvuri, Sai Surya, et al.
Published: (2026)

Hybrid GCN-GRU Model for Anomaly Detection in Cryptocurrency Transactions
by: Na, Gyuyeon, et al.
Published: (2025)

Boosting House Price Estimations with Multi-Head Gated Attention
by: Sellam, Zakaria Abdellah, et al.
Published: (2024)

ProxyAttn: Guided Sparse Attention via Representative Heads
by: Wang, Yixuan, et al.
Published: (2025)

Molecular Odor Prediction Based on Multi-Feature Graph Attention Networks
by: Xie, HongXin, et al.
Published: (2025)

SIG: Efficient Self-Interpretable Graph Neural Network for Continuous-time Dynamic Graphs
by: Fang, Lanting, et al.
Published: (2024)

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality
by: Chen, Siyu, et al.
Published: (2024)

Geometric Analysis of Token Selection in Multi-Head Attention
by: Mudarisov, Timur, et al.
Published: (2026)

Superiority of Multi-Head Attention in In-Context Linear Regression
by: Cui, Yingqian, et al.
Published: (2024)

Quantum Mixed-State Self-Attention Network
by: Chen, Fu, et al.
Published: (2024)

DeepSTA: A Spatial-Temporal Attention Network for Logistics Delivery Timely Rate Prediction in Anomaly Conditions
by: Yi, Jinhui, et al.
Published: (2025)

Beyond Parallelism: Synergistic Computational Graph Effects in Multi-Head Attention
by: Borde, Haitz Sáez de Ocáriz
Published: (2025)

Multi-Head Self-Attending Neural Tucker Factorization
by: Hou, Yikai, et al.
Published: (2025)