| Main Authors: | Chou, Yuhong, Liu, Zehao, Zhu, Ruijie, Wan, Xinyi, Li, Tianjian, Chu, Congying, Liu, Qian, Wu, Jibin, Ma, Zejun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.01004 |
Similar Items
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
by: Chou, Yuhong, et al.
Published: (2024)
AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
by: Chen, Qiaoling, et al.
Published: (2023)
Scaling Linear Attention with Sparse State Expansion
by: Pan, Yuqi, et al.
Published: (2025)
Linear Attention Sequence Parallelism
by: Sun, Weigao, et al.
Published: (2024)
Zero Bubble Pipeline Parallelism
by: Qi, Penghui, et al.
Published: (2023)
ZePo: Zero-Shot Portrait Stylization with Faster Sampling
by: Liu, Jin, et al.
Published: (2024)
IML-Spikeformer: Input-aware Multi-Level Spiking Transformer for Speech Processing
by: Song, Zeyang, et al.
Published: (2025)
MDN: Parallelizing Stepwise Momentum for Delta Linear Attention
by: Huang, Yulong, et al.
Published: (2026)
Stochastic Attention: Connectome-Inspired Randomized Routing for Expressive Linear-Time Attention
by: Jin, Zehao, et al.
Published: (2026)
Gated Slot Attention for Efficient Linear-Time Sequence Modeling
by: Zhang, Yu, et al.
Published: (2024)
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
by: Liao, Bencheng, et al.
Published: (2024)
ZeQR: Zero-shot Query Reformulation for Conversational Search
by: Yang, Dayu, et al.
Published: (2023)
PMSN: A Parallel Multi-compartment Spiking Neuron for Multi-scale Temporal Processing
by: Chen, Xinyi, et al.
Published: (2024)
HEAR: An EEG Foundation Model with Heterogeneous Electrode Adaptive Representation
by: Chen, Zhige, et al.
Published: (2025)
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid
by: Sun, Weigao, et al.
Published: (2025)
Low Overhead Beam Alignment for Mobile Millimeter Channel Based on Continuous-Time Prediction
by: Lin, Huang-Chou, et al.
Published: (2023)
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
by: Zeng, Weihao, et al.
Published: (2025)
A Systematic Analysis of Hybrid Linear Attention
by: Wang, Dustin, et al.
Published: (2025)
S0 Tuning: Zero-Overhead Adaptation of Hybrid Recurrent-Attention Models
by: Young, Jack
Published: (2026)
ZeBROD: Zero-Retraining Based Recognition and Object Detection Framework
by: Hidayatullah, Priyanto, et al.
Published: (2025)
Zé Dirceu Memórias
by: Fernando Tadeu Germinatti
Published: (2020)
AsyncHZP: Hierarchical ZeRO Parallelism with Asynchronous Scheduling for Scalable LLM Training
by: Bai, Huawei, et al.
Published: (2025)
ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents
by: Liu, Tianjian, et al.
Published: (2025)
ZeST: an LLM-based Zero-Shot Traversability Navigation for Unknown Environments
by: Gummadi, Shreya, et al.
Published: (2025)
ZeST: Zero-Shot Material Transfer from a Single Image
by: Cheng, Ta-Ying, et al.
Published: (2024)
Bridging the Semantic Gap: An Ensemble Learning Framework With Textual Topic‐Raw Financial Feature Fusion to Enhance Fraud Detection in Chinese Markets
by: Congying Wei, et al.
Published: (2025)
PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead
by: Tan, Tao, et al.
Published: (2024)
HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism
by: Zhang, Geng, et al.
Published: (2025)
Balancing Pipeline Parallelism with Vocabulary Parallelism
by: Yeung, Man Tsung, et al.
Published: (2024)
ZeFaV: Boosting Large Language Models for Zero-shot Fact Verification
by: Luu, Son T., et al.
Published: (2024)
RaZeR: Pushing the Limits of NVFP4 Quantization with Redundant Zero Remapping
by: Chen, Yuzong, et al.
Published: (2025)
Spatio-Temporal Decoupled Learning for Spiking Neural Networks
by: Ma, Chenxiang, et al.
Published: (2025)
GLU Attention Improve Transformer
by: Wang, Zehao
Published: (2025)
IMELL Cut Elimination with Linear Overhead
by: Accattoli, Beniamino, et al.
Published: (2024)
Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
by: Wang, Xinyu, et al.
Published: (2025)
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?
by: He, Xinyi, et al.
Published: (2025)
HyLaT: Efficient Multi-Agent Communication via Hybrid Latent-Text Protocol
by: Mou, Xinyi, et al.
Published: (2026)
Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
by: Fan, Zehao, et al.
Published: (2025)
ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting
by: Jiang, Yankai, et al.
Published: (2023)
Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks
by: Qiu, Xuerui, et al.
Published: (2023)