:: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Wangbo, Han, Yizeng, Tang, Zhiwei, Tang, Jiasheng, Zhou, Pengfei, Wang, Kai, Zhuang, Bohan, Wang, Zhangyang, Wang, Fan, You, Yang
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.22323

Similar Items

Dynamic Diffusion Transformer
by: Zhao, Wangbo, et al.
Published: (2024)

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
by: Zhao, Wangbo, et al.
Published: (2024)

DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation
by: Zhao, Wangbo, et al.
Published: (2025)

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
by: Zhao, Wangbo, et al.
Published: (2024)

Few-Step Distillation for Text-to-Image Generation: A Practical Guide
by: Pu, Yifan, et al.
Published: (2025)

BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation
by: Zhang, Zeyu, et al.
Published: (2025)

Recurrent Diffusion for Large-Scale Parameter Generation
by: Wang, Kai, et al.
Published: (2025)

Accelerating Parallel Sampling of Diffusion Models
by: Tang, Zhiwei, et al.
Published: (2024)

REPA Works Until It Doesn't: Early-Stopped, Holistic Alignment Supercharges Diffusion Training
by: Wang, Ziqiao, et al.
Published: (2025)

SparseDiT: Token Sparsification for Efficient Diffusion Transformer
by: Chang, Shuning, et al.
Published: (2024)

Conditional LoRA Parameter Generation
by: Jin, Xiaolong, et al.
Published: (2024)

FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion
by: Liu, Akide, et al.
Published: (2025)

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization
by: Tang, Zhiwei, et al.
Published: (2024)

Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
by: Inferix Team, et al.
Published: (2025)

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
by: Liang, Zhiyuan, et al.
Published: (2025)

HumanNOVA: Photorealistic, Universal and Rapid 3D Human Avatar Modeling from a Single Image
by: Hu, Hezhen, et al.
Published: (2026)

EA-ViT: Efficient Adaptation for Elastic Vision Transformer
by: Zhu, Chen, et al.
Published: (2025)

SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution
by: Wang, Chengcheng, et al.
Published: (2024)

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models
by: Xu, Zhaopan, et al.
Published: (2025)

ModaVerse: Efficiently Transforming Modalities with LLMs
by: Wang, Xinyu, et al.
Published: (2024)

TS-DP: Reinforcement Speculative Decoding For Temporal Adaptive Diffusion Policy Acceleration
by: Li, Ye, et al.
Published: (2025)

UDiTQC: U-Net-Style Diffusion Transformer for Quantum Circuit Synthesis
by: Chen, Zhiwei, et al.
Published: (2025)

RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models
by: Jiang, Tanqiu, et al.
Published: (2025)

Tri-Level Navigator: LLM-Empowered Tri-Level Learning for Time Series OOD Generalization
by: Jian, Chengtao, et al.
Published: (2024)

FlashAR: Efficient Post-Training Acceleration for Autoregressive Image Generation
by: Zhou, Junkang, et al.
Published: (2026)

Data Efficient Any Transformer-to-Mamba Distillation via Attention Bridge
by: Wang, Penghao, et al.
Published: (2025)

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model
by: Xu, Xingqian, et al.
Published: (2022)

Neural Network Diffusion
by: Wang, Kai, et al.
Published: (2024)

Position: Weight Space Should Be a First-Class Generative AI Modality
by: Wang, Zhangyang, et al.
Published: (2026)

SoftCap: Soft-Budget Control for Diffusion Transformer Acceleration
by: Zhang, Yuhang, et al.
Published: (2026)

NFPO: Stabilized Policy Optimization of Normalizing Flow for Robotic Policy Learning
by: Shi, Diyuan, et al.
Published: (2026)

PFDM: Parser-Free Virtual Try-on via Diffusion Model
by: Niu, Yunfang, et al.
Published: (2024)

CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model
by: Zeng, Jianhao, et al.
Published: (2023)

JCo-MVTON: Jointly Controllable Multi-Modal Diffusion Transformer for Mask-Free Virtual Try-on
by: Wang, Aowen, et al.
Published: (2025)

MoNE: Replacing Redundant Experts with Lightweight Novices for Structured Pruning of MoE
by: Zhang, Geng, et al.
Published: (2025)

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification
by: Xu, Zhaopan, et al.
Published: (2025)

Enhance-A-Video: Better Generated Video for Free
by: Luo, Yang, et al.
Published: (2025)

Towards 3D-Aware Video Diffusion Models: Render-Free Human Motion Control with Mesh Tokenization
by: Liang, Jingyun, et al.
Published: (2026)

Efficient Online Reinforcement Learning for Diffusion Policy
by: Ma, Haitong, et al.
Published: (2025)

Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering
by: Wang, Sixian, et al.
Published: (2025)