| Main Authors: | Xu, Chao; Li, Maohua; Li, Qirui; Xu, Yixuan; Zhou, Yanke; Li, Yunhe; Shen, Cuifeng; Tang, Hanlin; Liu, Kan; Lan, Tao; Qu, Lin; Zhang, Shao-Qun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.20708 |
Similar Items
Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps
by: Zhou, Yanke, et al.
Published: (2026)
RT-Lynx: Putting the GEMM Sparsity In a Right Way for Diffusion Models
by: Cong, Xing, et al.
Published: (2026)
Linear-DPO: Linear Direct Preference Optimization for Diffusion and Flow-Matching Generative Models
by: Li, Kesong, et al.
Published: (2026)
MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers
by: Ding, Ning, et al.
Published: (2024)
Autoregressive Video Autoencoder with Decoupled Temporal and Spatial Context
by: Shen, Cuifeng, et al.
Published: (2025)
Shiva-DiT: Residual-Based Differentiable Top-$k$ Selection for Efficient Diffusion Transformers
by: Zhang, Jiaji, et al.
Published: (2026)
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
by: Yang, Yixuan, et al.
Published: (2025)
Association of Oral Microbiome Diversity With Depression Status: NHANES 2009–2012
by: Cuifeng Zhang, et al.
Published: (2025)
K-Field Routing: Cross-Layer Causal Signals for Multi-Hop Reasoning in Transformers
by: Li, Y.Y.N.
Published: (2026)
U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
by: Tian, Yuchuan, et al.
Published: (2024)
Are the Risk Factors for Developing Type 1 and Type 2 Cesarean Scar Pregnancy Different?
by: Yunhui Tang, et al.
Published: (2025)
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
by: Li, Bozhou, et al.
Published: (2026)
The dynamic of the positons for the reverse space-time nonlocal short pulse equation
by: Shan, Jiaqing, et al.
Published: (2024)
Rethinking Dimensional Rationale in Graph Contrastive Learning from Causal Perspective
by: Ji, Qirui, et al.
Published: (2023)
Rethinking Retrieval-Augmentation as Synthesis: A Query-Aware Context Merging Approach
by: Guo, Jiarui, et al.
Published: (2026)
Deferred Commitment Decoding for Diffusion Language Models
by: Shu, Yingte, et al.
Published: (2026)
StarPose: 3D Human Pose Estimation via Spatial-Temporal Autoregressive Diffusion
by: Yang, Haoxin, et al.
Published: (2025)
MaskSR: Masked Language Model for Full-band Speech Restoration
by: Li, Xu, et al.
Published: (2024)
A Proof of the Biquadratic Linear AFL for GL(4)
by: Li, Qirui
Published: (2025)
Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure
by: Li, Xiang, et al.
Published: (2024)
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
by: Xu, Yichu, et al.
Published: (2024)
Masked Latent Transformer with the Random Masking Ratio to Advance the Diagnosis of Dental Fluorosis
by: Wu, Yun, et al.
Published: (2024)
Precipitation Nowcasting Using Diffusion Transformer with Causal Attention
by: Li, ChaoRong, et al.
Published: (2024)
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
by: Lv, Zhengyao, et al.
Published: (2025)
Rethinking Video Deblurring with Wavelet-Aware Dynamic Transformer and Diffusion Model
by: Rao, Chen, et al.
Published: (2024)
Dynamic Routing in Space-Ground Integrated Quantum Networks
by: Hu, Tianjie, et al.
Published: (2025)
Rethinking Tokenized Graph Transformers for Node Classification
by: Chen, Jinsong, et al.
Published: (2025)
Advancements in Molecular Diagnosis and Pharmacotherapeutic Strategies for Invasive Pituitary Adenomas
by: Dingkai Xu, et al.
Published: (2024)
Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications
by: Bu, Tong, et al.
Published: (2024)
Soliton, breathers, positons and rogue waves for the vector complex modified Korteweg-de Vries equation
by: Liu, Yihang, et al.
Published: (2025)
EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing
by: Sun, Haotian, et al.
Published: (2024)
Teleporter Theory: A General and Simple Approach for Modeling Cross-World Counterfactual Causality
by: Li, Jiangmeng, et al.
Published: (2024)
DiffusionUavLoc: Visually Prompted Diffusion for Cross-View UAV Localization
by: Liu, Tao, et al.
Published: (2025)
Post-Hopf algebras, relative Rota-Baxter operators and solutions of the Yang-Baxter equation
by: Li, Yunnan, et al.
Published: (2022)
Relative Rota-Baxter operators of weight 0 on groups, pre-groups, braces, the Yang-Baxter equation and $T$-structures
by: Li, Yunnan, et al.
Published: (2023)
U-REPA: Aligning Diffusion U-Nets to ViTs
by: Tian, Yuchuan, et al.
Published: (2025)
Runtime Burden Allocation for Structured LLM Routing in Agentic Expert Systems: A Full-Factorial Cross-Backend Methodology
by: Hanlin, Zhou, et al.
Published: (2026)
DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion
by: Zhang, Hanlin, et al.
Published: (2026)
JPDS-NN: Reinforcement Learning-Based Dynamic Task Allocation for Agricultural Vehicle Routing Optimization
by: Fan, Yixuan, et al.
Published: (2025)
Rethinking the Potential of Layer Freezing for Efficient DNN Training
by: Yang, Chence, et al.
Published: (2025)