:: Library Catalog

Bibliographic Details
Main Authors:	Cheng, Tianle; Zhang, Zeyan; Gao, Kaifeng; Xiao, Jun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.12099

Similar Items

ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
by: Gao, Kaifeng, et al.
Published: (2024)

Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
by: Gao, Kaifeng, et al.
Published: (2024)

EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation
by: Xiong, Tianwei, et al.
Published: (2026)

DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models
by: Li, Yizhuo, et al.
Published: (2024)

VideoMAR: Autoregressive Video Generation with Continuous Tokens
by: Yu, Hu, et al.
Published: (2025)

From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
by: Yin, Tianwei, et al.
Published: (2024)

Generalized Visual Relation Detection with Diffusion Models
by: Gao, Kaifeng, et al.
Published: (2025)

Progressive Autoregressive Video Diffusion Models
by: Xie, Desai, et al.
Published: (2024)

Q-ARVD: Quantizing Autoregressive Video Diffusion Models
by: Tang, Siao, et al.
Published: (2026)

Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)

Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models
by: Ji, Yicheng, et al.
Published: (2026)

Real-Time Motion-Controllable Autoregressive Video Diffusion
by: Zhao, Kesen, et al.
Published: (2025)

Efficient Autoregressive Video Diffusion with Dummy Head
by: Guo, Hang, et al.
Published: (2026)

Infinite Gaze Generation for Videos with Autoregressive Diffusion
by: Kang, Jenna, et al.
Published: (2026)

FastSTAR: Spatiotemporal Token Pruning for Efficient Autoregressive Video Synthesis
by: Yune, Sungwoong, et al.
Published: (2026)

ALTo: Adaptive-Length Tokenizer for Autoregressive Mask Generation
by: Wang, Lingfeng, et al.
Published: (2025)

DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
by: Zhao, Kaifeng, et al.
Published: (2024)

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation
by: Li, Zongyi, et al.
Published: (2024)

TokenTrim: Inference-Time Token Pruning for Autoregressive Long Video Generation
by: Shaulov, Ariel, et al.
Published: (2026)

Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)

LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior
by: Wang, Hanyu, et al.
Published: (2024)

Autoregressive Universal Video Segmentation Model
by: Heo, Miran, et al.
Published: (2025)

VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
by: Liu, Yunze, et al.
Published: (2025)

Fractal Autoregressive Depth Estimation with Continuous Token Diffusion
by: Zhang, Jinchang, et al.
Published: (2026)

Past- and Future-Informed KV Cache Policy with Salience Estimation in Autoregressive Video Diffusion
by: Chen, Hanmo, et al.
Published: (2026)

Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
by: Wang, Bohan, et al.
Published: (2025)

Adaptive 1D Video Diffusion Autoencoder
by: Teng, Yao, et al.
Published: (2026)

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)

LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation
by: Song, Wenhui, et al.
Published: (2025)

SuperVoxelGPT: Adaptive and Ordered 3D Tokenization for Autoregressive Shape Generation
by: Li, Yuan, et al.
Published: (2026)

DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
by: Lyu, Hengye, et al.
Published: (2026)

Pathwise Test-Time Correction for Autoregressive Long Video Generation
by: Xiang, Xunzhi, et al.
Published: (2026)

MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)

Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization
by: Deng, Kangle, et al.
Published: (2025)

Astraea: A Token-wise Acceleration Framework for Video Diffusion Transformers
by: Liu, Haosong, et al.
Published: (2025)

End-to-End Training for Autoregressive Video Diffusion via Self-Resampling
by: Guo, Yuwei, et al.
Published: (2025)

Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)

Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
by: Chao, Brian, et al.
Published: (2026)

Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
by: Zhang, Wentao, et al.
Published: (2024)

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
by: Ge, Yuying, et al.
Published: (2024)