Library Catalog

Bibliographic Details
Main Authors:	Chou, Yuhong; Liu, Zehao; Zhu, Ruijie; Wan, Xinyi; Li, Tianjian; Chu, Congying; Liu, Qian; Wu, Jibin; Ma, Zejun
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2507.01004

Similar Items

MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
by: Chou, Yuhong, et al.
Published: (2024)

AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
by: Chen, Qiaoling, et al.
Published: (2023)

Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)

Linear Attention Sequence Parallelism
by: Sun, Weigao, et al.
Published: (2024)

Zero Bubble Pipeline Parallelism
by: Qi, Penghui, et al.
Published: (2023)

ZePo: Zero-Shot Portrait Stylization with Faster Sampling
by: Liu, Jin, et al.
Published: (2024)

IML-Spikeformer: Input-aware Multi-Level Spiking Transformer for Speech Processing
by: Song, Zeyang, et al.
Published: (2025)

MDN: Parallelizing Stepwise Momentum for Delta Linear Attention
by: Huang, Yulong, et al.
Published: (2026)

Stochastic Attention: Connectome-Inspired Randomized Routing for Expressive Linear-Time Attention
by: Jin, Zehao, et al.
Published: (2026)

Gated Slot Attention for Efficient Linear-Time Sequence Modeling
by: Zhang, Yu, et al.
Published: (2024)

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
by: Liao, Bencheng, et al.
Published: (2024)

ZeQR: Zero-shot Query Reformulation for Conversational Search
by: Yang, Dayu, et al.
Published: (2023)

PMSN: A Parallel Multi-compartment Spiking Neuron for Multi-scale Temporal Processing
by: Chen, Xinyi, et al.
Published: (2024)

HEAR: An EEG Foundation Model with Heterogeneous Electrode Adaptive Representation
by: Chen, Zhige, et al.
Published: (2025)

LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid
by: Sun, Weigao, et al.
Published: (2025)

Low Overhead Beam Alignment for Mobile Millimeter Channel Based on Continuous-Time Prediction
by: Lin, Huang-Chou, et al.
Published: (2023)

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
by: Zeng, Weihao, et al.
Published: (2025)

A Systematic Analysis of Hybrid Linear Attention
by: Wang, Dustin, et al.
Published: (2025)

S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models
by: Young, Jack
Published: (2026)

ZeBROD: Zero-Retraining Based Recognition and Object Detection Framework
by: Hidayatullah, Priyanto, et al.
Published: (2025)

Zé Dirceu Memórias
by: Fernando Tadeu Germinatti
Published: (2020)

AsyncHZP: Hierarchical ZeRO Parallelism with Asynchronous Scheduling for Scalable LLM Training
by: Bai, Huawei, et al.
Published: (2025)

ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents
by: Liu, Tianjian, et al.
Published: (2025)

ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
by: Gummadi, Shreya, et al.
Published: (2025)

ZeST: Zero-Shot Material Transfer from a Single Image
by: Cheng, Ta-Ying, et al.
Published: (2024)

Bridging the Semantic Gap: An Ensemble Learning Framework With Textual Topic‐Raw Financial Feature Fusion to Enhance Fraud Detection in Chinese Markets
by: Congying Wei, et al.
Published: (2025)

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead
by: Tan, Tao, et al.
Published: (2024)

HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism
by: Zhang, Geng, et al.
Published: (2025)

Balancing Pipeline Parallelism with Vocabulary Parallelism
by: Yeung, Man Tsung, et al.
Published: (2024)

ZeFaV: Boosting Large Language Models for Zero-shot Fact Verification
by: Luu, Son T., et al.
Published: (2024)

RaZeR: Pushing the Limits of NVFP4 Quantization with Redundant Zero Remapping
by: Chen, Yuzong, et al.
Published: (2025)

Spatio-Temporal Decoupled Learning for Spiking Neural Networks
by: Ma, Chenxiang, et al.
Published: (2025)

GLU Attention Improve Transformer
by: Wang, Zehao
Published: (2025)

IMELL Cut Elimination with Linear Overhead
by: Accattoli, Beniamino, et al.
Published: (2024)

Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
by: Wang, Xinyu, et al.
Published: (2025)

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
by: He, Xinyi, et al.
Published: (2025)

HyLaT: Efficient Multi-Agent Communication via Hybrid Latent-Text Protocol
by: Mou, Xinyi, et al.
Published: (2026)

Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
by: Fan, Zehao, et al.
Published: (2025)

ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting
by: Jiang, Yankai, et al.
Published: (2023)

Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks
by: Qiu, Xuerui, et al.
Published: (2023)