| Main Authors: | Chen, Jintao, Bai, Chengyu, Hu, Junjun, Xue, Xinda, Xu, Mu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.06939 |
Similar Items
AstraNav-World: World Model for Foresight Control and Consistency
by: Chen, Jintao, et al.
Published: (2025)
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)
Speculative Decoding for Autoregressive Video Generation
by: Hu, Yuezhou, et al.
Published: (2026)
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
by: Hu, Panwen, et al.
Published: (2025)
Reward-Forcing: Autoregressive Video Generation with Reward Feedback
by: Zhang, Jingran, et al.
Published: (2026)
Context Forcing: Consistent Autoregressive Video Generation with Long Context
by: Chen, Shuo, et al.
Published: (2026)
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
by: Huang, Xun, et al.
Published: (2025)
UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
by: Bai, Chengyu, et al.
Published: (2025)
Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
by: Xu, Peiran, et al.
Published: (2025)
Taming Teacher Forcing for Masked Autoregressive Video Generation
by: Zhou, Deyu, et al.
Published: (2025)
Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation
by: Luo, Jiayi, et al.
Published: (2026)
SGANet: Semantic and Geometric Alignment for Multimodal Multi-view Anomaly Detection
by: Bai, Letian, et al.
Published: (2026)
FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
by: Bai, Chengyu, et al.
Published: (2025)
Rethinking Autoregressive Models for Lossless Image Compression via Hierarchical Parallelism and Progressive Adaptation
by: Li, Daxin, et al.
Published: (2025)
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)
Semantic Context Matters: Improving Conditioning for Autoregressive Models
by: Jin, Dongyang, et al.
Published: (2025)
Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)
Bridging Time and Space: Decoupled Spatio-Temporal Alignment for Video Grounding
by: Tu, Xuezhen, et al.
Published: (2026)
Delta Forcing: Trust Region Steering for Interactive Autoregressive Video Generation
by: Wu, Yuheng, et al.
Published: (2026)
Controllable Longer Image Animation with Diffusion Models
by: Wang, Qiang, et al.
Published: (2024)
TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
by: Yang, Zuhao, et al.
Published: (2025)
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
by: Li, Jinghan, et al.
Published: (2025)
SnAG: Scalable and Accurate Video Grounding
by: Mu, Fangzhou, et al.
Published: (2024)
Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models
by: Ji, Yicheng, et al.
Published: (2026)
Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics
by: Xu, Tianshuo, et al.
Published: (2026)
Encapsulated Composition of Text-to-Image and Text-to-Video Models for High-Quality Video Synthesis
by: Su, Tongtong, et al.
Published: (2025)
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
by: Zhang, Ruicheng, et al.
Published: (2026)
SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs
by: Tong, Jintao, et al.
Published: (2026)
StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression
by: Chen, Yilong, et al.
Published: (2025)
ConceptWeaver: Weaving Disentangled Concepts with Flow
by: Chen, Jintao, et al.
Published: (2026)
One-Forcing: Towards Stable One-Step Autoregressive Video Generation
by: Feng, Jiaqi, et al.
Published: (2026)
Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity
by: Tian, Jiahao, et al.
Published: (2026)
Real-Time Motion-Controllable Autoregressive Video Diffusion
by: Zhao, Kesen, et al.
Published: (2025)
EventRR: Event Referential Reasoning for Referring Video Object Segmentation
by: Xu, Huihui, et al.
Published: (2025)
Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)
Sparse Forcing: Native Trainable Sparse Attention for Real-time Autoregressive Diffusion Video Generation
by: Xu, Boxun, et al.
Published: (2026)
FastSTAR: Spatiotemporal Token Pruning for Efficient Autoregressive Video Synthesis
by: Yune, Sungwoong, et al.
Published: (2026)
Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video Captioning
by: Liu, Caihua, et al.
Published: (2025)
Uncertainty-Aware Trajectory Prediction: A Unified Framework Harnessing Positional and Semantic Uncertainties
by: Sun, Jintao, et al.
Published: (2026)
Pathwise Test-Time Correction for Autoregressive Long Video Generation
by: Xiang, Xunzhi, et al.
Published: (2026)