| Main Authors: | Cheng, Tianle, Zhang, Zeyan, Gao, Kaifeng, Xiao, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.12099 |
Similar Items
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
by: Gao, Kaifeng, et al.
Published: (2024)
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
by: Gao, Kaifeng, et al.
Published: (2024)
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation
by: Xiong, Tianwei, et al.
Published: (2026)
DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
by: Li, Yizhuo, et al.
Published: (2024)
VideoMAR: Autoregressive Video Generation with Continuous Tokens
by: Yu, Hu, et al.
Published: (2025)
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
by: Yin, Tianwei, et al.
Published: (2024)
Generalized Visual Relation Detection with Diffusion Models
by: Gao, Kaifeng, et al.
Published: (2025)
Progressive Autoregressive Video Diffusion Models
by: Xie, Desai, et al.
Published: (2024)
Q-ARVD: Quantizing Autoregressive Video Diffusion Models
by: Tang, Siao, et al.
Published: (2026)
Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)
Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models
by: Ji, Yicheng, et al.
Published: (2026)
Real-Time Motion-Controllable Autoregressive Video Diffusion
by: Zhao, Kesen, et al.
Published: (2025)
Efficient Autoregressive Video Diffusion with Dummy Head
by: Guo, Hang, et al.
Published: (2026)
Infinite Gaze Generation for Videos with Autoregressive Diffusion
by: Kang, Jenna, et al.
Published: (2026)
FastSTAR: Spatiotemporal Token Pruning for Efficient Autoregressive Video Synthesis
by: Yune, Sungwoong, et al.
Published: (2026)
ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
by: Wang, Lingfeng, et al.
Published: (2025)
DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
by: Zhao, Kaifeng, et al.
Published: (2024)
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
by: Li, Zongyi, et al.
Published: (2024)
TokenTrim: Inference-Time Token Pruning for Autoregressive Long Video Generation
by: Shaulov, Ariel, et al.
Published: (2026)
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
by: Wang, Hanyu, et al.
Published: (2024)
Autoregressive Universal Video Segmentation Model
by: Heo, Miran, et al.
Published: (2025)
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
by: Liu, Yunze, et al.
Published: (2025)
Fractal Autoregressive Depth Estimation with Continuous Token Diffusion
by: Zhang, Jinchang, et al.
Published: (2026)
Past- and Future-Informed KV Cache Policy with Salience Estimation in Autoregressive Video Diffusion
by: Chen, Hanmo, et al.
Published: (2026)
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
by: Wang, Bohan, et al.
Published: (2025)
Adaptive 1D Video Diffusion Autoencoder
by: Teng, Yao, et al.
Published: (2026)
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)
LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation
by: Song, Wenhui, et al.
Published: (2025)
SuperVoxelGPT: Adaptive and Ordered 3D Tokenization for Autoregressive Shape Generation
by: Li, Yuan, et al.
Published: (2026)
DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
by: Lyu, Hengye, et al.
Published: (2026)
Pathwise Test-Time Correction for Autoregressive Long Video Generation
by: Xiang, Xunzhi, et al.
Published: (2026)
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)
Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization
by: Deng, Kangle, et al.
Published: (2025)
Astraea: A Token-wise Acceleration Framework for Video Diffusion Transformers
by: Liu, Haosong, et al.
Published: (2025)
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling
by: Guo, Yuwei, et al.
Published: (2025)
Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)
Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
by: Chao, Brian, et al.
Published: (2026)
Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
by: Zhang, Wentao, et al.
Published: (2024)
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
by: Ge, Yuying, et al.
Published: (2024)