Library Catalog

Bibliographic Details
Main Authors:	Zheng, Zhiwei; Jin, Shibo; Liu, Lingjie; Zhao, Mingmin
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.03799

Similar Items

ScaleMoGen: Autoregressive Next-Scale Prediction for Human Motion Generation
by: Hwang, Inwoo, et al.
Published: (2026)

DressCode: Autoregressively Sewing and Generating Garments from Text Guidance
by: He, Kai, et al.
Published: (2024)

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale
by: NextStep Team, et al.
Published: (2025)

Next Patch Prediction for Autoregressive Visual Generation
by: Pang, Yatian, et al.
Published: (2024)

Causal Motion Diffusion Models for Autoregressive Motion Generation
by: Yu, Qing, et al.
Published: (2026)

MoSa: Motion Generation with Scalable Autoregressive Modeling
by: Liu, Mengyuan, et al.
Published: (2025)

Next-Scale Autoregressive Models are Zero-Shot Single-Image Object View Synthesizers
by: Yuan, Shiran, et al.
Published: (2025)

NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution
by: Kong, Xiangtao, et al.
Published: (2025)

GenAR: Next-Scale Autoregressive Generation for Spatial Gene Expression Prediction
by: Ouyang, Jiarui, et al.
Published: (2025)

LLaMo: Scaling Pretrained Language Models for Unified Motion Understanding and Generation with Continuous Autoregressive Tokens
by: Li, Zekun, et al.
Published: (2026)

Long-Context Autoregressive Video Modeling with Next-Frame Prediction
by: Gu, Yuchao, et al.
Published: (2025)

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
by: Tian, Keyu, et al.
Published: (2024)

DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
by: Zhao, Kaifeng, et al.
Published: (2024)

DIMO: Diverse 3D Motion Generation for Arbitrary Objects
by: Mou, Linzhan, et al.
Published: (2025)

ScaMo: Exploring the Scaling Law in Autoregressive Motion Generation Model
by: Lu, Shunlin, et al.
Published: (2024)

Motion-Aware Caching for Efficient Autoregressive Video Generation
by: Xu, Jing, et al.
Published: (2026)

DiverseVAR: Balancing Diversity and Quality of Next-Scale Visual Autoregressive Models
by: Park, Mingue, et al.
Published: (2025)

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction
by: Ji, Longbin, et al.
Published: (2026)

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
by: Ren, Sucheng, et al.
Published: (2025)

Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
by: Jin, Peng, et al.
Published: (2024)

HINT: Hierarchical Interaction Modeling for Autoregressive Multi-Human Motion Generation
by: Liu, Mengge, et al.
Published: (2026)

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
by: Fan, Lijie, et al.
Published: (2024)

MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction
by: He, Zhicheng, et al.
Published: (2026)

Autoregressive Video Generation beyond Next Frames Prediction
by: Ren, Sucheng, et al.
Published: (2025)

Efficient Conditional Generation on Scale-based Visual Autoregressive Models
by: Liu, Jiaqi, et al.
Published: (2025)

DCoAR: Deep Concept Injection into Unified Autoregressive Models for Personalized Text-to-Image Generation
by: Wu, Fangtai, et al.
Published: (2025)

Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing
by: Dao, Quan, et al.
Published: (2025)

ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models
by: Zhao, Qinyu, et al.
Published: (2025)

TriC-Motion: Tri-Domain Causal Modeling Grounded Text-to-Motion Generation
by: Cao, Yiyang, et al.
Published: (2026)

OmniMotion: Multimodal Motion Generation with Continuous Masked Autoregression
by: Li, Zhe, et al.
Published: (2025)

AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving
by: Jia, Xiaosong, et al.
Published: (2024)

Rethinking Diffusion for Text-Driven Human Motion Generation: Redundant Representations, Evaluation, and Masked Autoregression
by: Meng, Zichong, et al.
Published: (2024)

BAMM: Bidirectional Autoregressive Motion Model
by: Pinyoanuntapong, Ekkasit, et al.
Published: (2024)

PointNSP: Autoregressive 3D Point Cloud Generation with Next-Scale Level-of-Detail Prediction
by: Meng, Ziqiao, et al.
Published: (2025)

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
by: Wang, Yufu, et al.
Published: (2024)

Auto DragGAN: Editing the Generative Image Manifold in an Autoregressive Manner
by: Cai, Pengxiang, et al.
Published: (2024)

Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations
by: Li, Jinghan, et al.
Published: (2025)

NEP: Autoregressive Image Editing via Next Editing Token Prediction
by: Wu, Huimin, et al.
Published: (2025)

LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling
by: Zheng, Hong-Kai, et al.
Published: (2025)

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
by: Huang, Yiming, et al.
Published: (2024)