:: Library Catalog

Bibliographic Details
Main Authors:	Feng, Haoran; Niu, Yifan; Huang, Zehuan; Sun, Yang-Tian; Guo, Chunchao; Peng, Yuxin; Sheng, Lu
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2604.16299

Similar Items

SegviGen: Repurposing 3D Generative Model for Part Segmentation
by: Li, Lin, et al.
Published: (2026)

VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
by: Li, Lin, et al.
Published: (2025)

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models
by: Huang, Zehuan, et al.
Published: (2025)

Stereo World Model: Camera-Guided Stereo Video Generation
by: Sun, Yang-Tian, et al.
Published: (2026)

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
by: Wen, Hao, et al.
Published: (2024)

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
by: Tian, Keyu, et al.
Published: (2024)

PhysForge: Generating Physics-Grounded 3D Assets for Interactive Virtual World
by: Yang, Yunhan, et al.
Published: (2026)

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
by: Sun, Peize, et al.
Published: (2024)

MoCA: Mixture-of-Components Attention for Scalable Compositional 3D Generation
by: Li, Zhiqi, et al.
Published: (2025)

Personalize Anything for Free with Diffusion Transformer
by: Feng, Haoran, et al.
Published: (2025)

MV-Adapter: Multi-view Consistent Image Generation Made Easy
by: Huang, Zehuan, et al.
Published: (2024)

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
by: Huang, Zehuan, et al.
Published: (2024)

HoloDrive: Holistic 2D-3D Multi-Modal Street Scene Generation for Autonomous Driving
by: Wu, Zehuan, et al.
Published: (2024)

InterMoE: Individual-Specific 3D Human Interaction Generation via Dynamic Temporal-Selective MoE
by: Wang, Lipeng, et al.
Published: (2025)

Controllable Generation of Large-Scale 3D Urban Layouts with Semantic and Structural Guidance
by: Niu, Mengyuan, et al.
Published: (2025)

QuadGPT: Native Quadrilateral Mesh Generation with Autoregressive Models
by: Liu, Jian, et al.
Published: (2025)

MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
by: Huang, Zehuan, et al.
Published: (2024)

ArtLLM: Generating Articulated Assets via 3D LLM
by: Wang, Penghao, et al.
Published: (2026)

ROAR-3D: Routing Arbitrary Views for High-Fidelity 3D Generation
by: Sun, Hanxiao, et al.
Published: (2026)

Toward Scene Graph and Layout Guided Complex 3D Scene Generation
by: Huang, Yu-Hsiang, et al.
Published: (2024)

Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation
by: Xiang, Tiange, et al.
Published: (2025)

Layout2Scene: 3D Semantic Layout Guided Scene Generation via Geometry and Appearance Diffusion Priors
by: Chen, Minglin, et al.
Published: (2025)

LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model
by: Yang, Yixuan, et al.
Published: (2024)

3D Generation for Embodied AI and Robotic Simulation: A Survey
by: Ye, Tianwei, et al.
Published: (2026)

TELA: Text to Layer-wise 3D Clothed Human Generation
by: Dong, Junting, et al.
Published: (2024)

Pathwise Test-Time Correction for Autoregressive Long Video Generation
by: Xiang, Xunzhi, et al.
Published: (2026)

Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint
by: Zhou, Junwei, et al.
Published: (2024)

Scan-and-Print: Patch-level Data Summarization and Augmentation for Content-aware Layout Generation in Poster Design
by: Hsu, HsiaoYuan, et al.
Published: (2025)

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation
by: Liu, Yi, et al.
Published: (2025)

FlashWorld: High-quality 3D Scene Generation within Seconds
by: Li, Xinyang, et al.
Published: (2025)

SDesc3D: Towards Layout-Aware 3D Indoor Scene Generation from Short Descriptions
by: Feng, Jie, et al.
Published: (2026)

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
by: Ni, Jingcheng, et al.
Published: (2025)

LATTICE: Democratize High-Fidelity 3D Generation at Scale
by: Lai, Zeqiang, et al.
Published: (2025)

Layout-Conditioned Autoregressive Text-to-Image Generation via Structured Masking
by: Zheng, Zirui, et al.
Published: (2025)

PoseMaster: A Unified 3D Native Framework for Stylized Pose Generation
by: Yan, Hongyu, et al.
Published: (2025)

LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer
by: Li, Yu, et al.
Published: (2024)

DiffX: Guide Your Layout to Cross-Modal Generative Modeling
by: Wang, Zeyu, et al.
Published: (2024)

SceneCraft: Layout-Guided 3D Scene Generation
by: Yang, Xiuyu, et al.
Published: (2024)

Parallelized Autoregressive Visual Generation
by: Wang, Yuqing, et al.
Published: (2024)

RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis
by: Feng, Yifei, et al.
Published: (2025)