:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Jintao; Bai, Chengyu; Hu, Junjun; Xue, Xinda; Xu, Mu
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.06939

Similar Items

AstraNav-World: World Model for Foresight Control and Consistency
by: Chen, Jintao, et al.
Published: (2025)

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)

Speculative Decoding for Autoregressive Video Generation
by: Hu, Yuezhou, et al.
Published: (2026)

BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation
by: Hu, Panwen, et al.
Published: (2025)

Reward-Forcing: Autoregressive Video Generation with Reward Feedback
by: Zhang, Jingran, et al.
Published: (2026)

Context Forcing: Consistent Autoregressive Video Generation with Long Context
by: Chen, Shuo, et al.
Published: (2026)

Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion
by: Huang, Xun, et al.
Published: (2025)

UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
by: Bai, Chengyu, et al.
Published: (2025)

Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors
by: Xu, Peiran, et al.
Published: (2025)

Taming Teacher Forcing for Masked Autoregressive Video Generation
by: Zhou, Deyu, et al.
Published: (2025)

Future Forcing: Future-aware Training-free KV Cache Policy for Autoregressive Video Generation
by: Luo, Jiayi, et al.
Published: (2026)

SGANet: Semantic and Geometric Alignment for Multimodal Multi-view Anomaly Detection
by: Bai, Letian, et al.
Published: (2026)

FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
by: Bai, Chengyu, et al.
Published: (2025)

Rethinking Autoregressive Models for Lossless Image Compression via Hierarchical Parallelism and Progressive Adaptation
by: Li, Daxin, et al.
Published: (2025)

Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)

Semantic Context Matters: Improving Conditioning for Autoregressive Models
by: Jin, Dongyang, et al.
Published: (2025)

Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)

Bridging Time and Space: Decoupled Spatio-Temporal Alignment for Video Grounding
by: Tu, Xuezhen, et al.
Published: (2026)

Delta Forcing: Trust Region Steering for Interactive Autoregressive Video Generation
by: Wu, Yuheng, et al.
Published: (2026)

Controllable Longer Image Animation with Diffusion Models
by: Wang, Qiang, et al.
Published: (2024)

TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
by: Yang, Zuhao, et al.
Published: (2025)

Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
by: Li, Jinghan, et al.
Published: (2025)

SnAG: Scalable and Accurate Video Grounding
by: Mu, Fangzhou, et al.
Published: (2024)

Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models
by: Ji, Yicheng, et al.
Published: (2026)

Motion Forcing: A Decoupled Framework for Robust Video Generation in Motion Dynamics
by: Xu, Tianshuo, et al.
Published: (2026)

Encapsulated Composition of Text-to-Image and Text-to-Video Models for High-Quality Video Synthesis
by: Su, Tongtong, et al.
Published: (2025)

KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration
by: Zhang, Ruicheng, et al.
Published: (2026)

SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs
by: Tong, Jintao, et al.
Published: (2026)

StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression
by: Chen, Yilong, et al.
Published: (2025)

ConceptWeaver: Weaving Disentangled Concepts with Flow
by: Chen, Jintao, et al.
Published: (2026)

One-Forcing: Towards Stable One-Step Autoregressive Video Generation
by: Feng, Jiaqi, et al.
Published: (2026)

Head Forcing: Long Autoregressive Video Generation via Head Heterogeneity
by: Tian, Jiahao, et al.
Published: (2026)

Real-Time Motion-Controllable Autoregressive Video Diffusion
by: Zhao, Kesen, et al.
Published: (2025)

EventRR: Event Referential Reasoning for Referring Video Object Segmentation
by: Xu, Huihui, et al.
Published: (2025)

Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)

Sparse Forcing: Native Trainable Sparse Attention for Real-time Autoregressive Diffusion Video Generation
by: Xu, Boxun, et al.
Published: (2026)

FastSTAR: Spatiotemporal Token Pruning for Efficient Autoregressive Video Synthesis
by: Yune, Sungwoong, et al.
Published: (2026)

Capturing Rich Behavior Representations: A Dynamic Action Semantic-Aware Graph Transformer for Video Captioning
by: Liu, Caihua, et al.
Published: (2025)

Uncertainty-Aware Trajectory Prediction: A Unified Framework Harnessing Positional and Semantic Uncertainties
by: Sun, Jintao, et al.
Published: (2026)

Pathwise Test-Time Correction for Autoregressive Long Video Generation
by: Xiang, Xunzhi, et al.
Published: (2026)