| Main Authors: | Fang, Tongcheng, Zhang, Hanling, Xie, Ruiqi, Han, Zhuo, Tao, Xin, Zhao, Tianchen, Wan, Pengfei, Ding, Wenbo, Ouyang, Wanli, Ning, Xuefei, Wang, Yu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.16515 |
Similar Items
DiTFastAttn: Attention Compression for Diffusion Transformer Models
by: Yuan, Zhihang, et al.
Published: (2024)
VocabTailor: Dynamic Vocabulary Selection for Downstream Tasks in Small Language Models
by: Zhang, Hanling, et al.
Published: (2025)
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
by: Zhao, Tianchen, et al.
Published: (2024)
LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization
by: Xie, Rui, et al.
Published: (2024)
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization
by: Zhao, Tianchen, et al.
Published: (2024)
DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation
by: Yuan, Zhihang, et al.
Published: (2025)
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention
by: Zhang, Jintao, et al.
Published: (2025)
VMoBA: Mixture-of-Block Attention for Video Diffusion Models
by: Wu, Jianzong, et al.
Published: (2025)
Agent Attention: On the Integration of Softmax and Linear Attention
by: Han, Dongchen, et al.
Published: (2023)
Attention Surgery: An Efficient Recipe to Linearize Your Video Diffusion Transformer
by: Ghafoorian, Mohsen, et al.
Published: (2025)
VMonarch: Efficient Video Diffusion Transformers with Structured Attention
by: Liang, Cheng, et al.
Published: (2026)
db-SP: Accelerating Sparse Attention for Visual Generative Models with Dual-Balanced Sequence Parallelism
by: Chen, Siqi, et al.
Published: (2025)
Spark Transformer: Reactivating Sparsity in FFN and Attention
by: You, Chong, et al.
Published: (2025)
Analysis of Attention in Video Diffusion Transformers
by: Wen, Yuxin, et al.
Published: (2025)
DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
by: Zhang, Hanling, et al.
Published: (2025)
LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation
by: Yang, Lianwei, et al.
Published: (2025)
Scaling Attention via Feature Sparsity
by: Xie, Yan, et al.
Published: (2026)
Crisp Attention: Regularizing Transformers via Structured Sparsity
by: Gandhi, Sagar, et al.
Published: (2025)
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
by: Yuan, Zhihang, et al.
Published: (2024)
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
by: Ding, Hangliang, et al.
Published: (2025)
EDiT: Efficient Diffusion Transformers with Linear Compressed Attention
by: Becker, Philipp, et al.
Published: (2025)
RaaS: Reasoning-Aware Attention Sparsity for Efficient LLM Reasoning
by: Hu, Junhao, et al.
Published: (2025)
Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs
by: Tu, Chongjun, et al.
Published: (2025)
STS: Efficient Sparse Attention with Speculative Token Sparsity
by: Xu, Ceyu, et al.
Published: (2026)
VORTA: Efficient Video Diffusion via Routing Sparse Attention
by: Sun, Wenhao, et al.
Published: (2025)
Cache-to-Cache: Direct Semantic Communication Between Large Language Models
by: Fu, Tianyu, et al.
Published: (2025)
AttentionBender: Manipulating Cross-Attention in Video Diffusion Transformers as a Creative Probe
by: Cole, Adam, et al.
Published: (2026)
SATA: Sparsity-Aware Scheduling for Selective Token Attention
by: Fan, Zhenkun, et al.
Published: (2026)
Attention Sparsity is Input-Stable: Training-Free Sparse Attention for Video Generation via Offline Sparsity Profiling and Online QK Co-Clustering
by: Luo, Jiayi, et al.
Published: (2026)
Linear Attention is Enough in Spatial-Temporal Forecasting
by: Ning, Xinyu
Published: (2024)
Heat Diffusion Models -- Interpixel Attention Mechanism
by: Zhang, Pengfei, et al.
Published: (2025)
Rectified SpaAttn: Revisiting Attention Sparsity for Efficient Video Generation
by: Liu, Xuewen, et al.
Published: (2025)
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers
by: Tang, Zecheng, et al.
Published: (2026)
Empty SPACE: Cross-Attention Sparsity for Concept Erasure in Diffusion Models
by: Novello, Nicola, et al.
Published: (2026)
DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance
by: Shen, Xuan, et al.
Published: (2025)
ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers
by: Ghafoorian, Mohsen, et al.
Published: (2026)
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers
by: Chen, Pengtao, et al.
Published: (2025)
Transolver is a Linear Transformer: Revisiting Physics-Attention through the Lens of Linear Attention
by: Hu, Wenjie, et al.
Published: (2025)
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
by: Xie, Kevin, et al.
Published: (2025)
Cottention: Linear Transformers With Cosine Attention
by: Mongaras, Gabriel, et al.
Published: (2024)