| Main Authors: | Chen, Jingye, Zhao, Yuzhong, Huang, Yupan, Cui, Lei, Dong, Li, Lv, Tengchao, Chen, Qifeng, Wei, Furu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.21172 |
Similar Items
KOSMOS-2.5: A Multimodal Literate Model
by: Lv, Tengchao, et al.
Published: (2023)
DocReward: A Document Reward Model for Structuring and Stylizing
by: Liu, Junpeng, et al.
Published: (2025)
TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization
by: Pham, Kien T., et al.
Published: (2024)
Rethinking Layered Graphic Design Generation with a Top-Down Approach
by: Chen, Jingye, et al.
Published: (2025)
PEACE: Empowering Geologic Map Holistic Understanding with MLLMs
by: Huang, Yangyu, et al.
Published: (2025)
Does Synthetic Layered Design Data Benefit Layered Design Decomposition?
by: Wu, Kam Man, et al.
Published: (2026)
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
by: Pan, Xichen, et al.
Published: (2023)
Large Motion Video Autoencoding with Cross-modal Video VAE
by: Xing, Yazhou, et al.
Published: (2024)
Towards Generalist Game Players: An Investigation of Foundation Models in the Game Multiverse
by: Zhang, Kuan, et al.
Published: (2026)
Hunyuan-Game: Industrial-grade Intelligent Game Creation Model
by: Li, Ruihuang, et al.
Published: (2025)
BEV-VAE: Multi-view Image Generation with Spatial Consistency for Autonomous Driving
by: Chen, Zeming, et al.
Published: (2025)
From Virtual Games to Real-World Play
by: Sun, Wenqiang, et al.
Published: (2025)
Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation
by: Wu, Xun, et al.
Published: (2024)
Optimizing Prompts for Text-to-Image Generation
by: Hao, Yaru, et al.
Published: (2022)
Play to Generalize: Learning to Reason Through Game Play
by: Xie, Yunfei, et al.
Published: (2025)
Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model
by: Tang, Junshu, et al.
Published: (2025)
GameGen-X: Interactive Open-world Game Video Generation
by: Che, Haoxuan, et al.
Published: (2024)
AvatarArtist: Open-Domain 4D Avatarization
by: Liu, Hongyu, et al.
Published: (2025)
GameFactory: Creating New Games with Generative Interactive Videos
by: Yu, Jiwen, et al.
Published: (2025)
CharaConsist: Fine-Grained Consistent Character Generation
by: Wang, Mengyu, et al.
Published: (2025)
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
by: Liu, Tianqi, et al.
Published: (2025)
TIGaussian: Disentangle Gaussians for Spatial-Awared Text-Image-3D Alignment
by: Liu, Jiarun, et al.
Published: (2026)
Domain Game: Disentangle Anatomical Feature for Single Domain Generalized Segmentation
by: Chen, Hao, et al.
Published: (2024)
Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View
by: Wang, Jin, et al.
Published: (2024)
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation
by: Mu, Xinzhi, et al.
Published: (2024)
LibraGen: Playing a Balance Game in Subject-Driven Video Generation
by: Zhu, Jiahao, et al.
Published: (2026)
UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos
by: Huang, Yuzhong, et al.
Published: (2024)
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition
by: Li, Jiaqi, et al.
Published: (2025)
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency
by: Yin, Yuyang, et al.
Published: (2023)
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
by: Yan, Zexuan, et al.
Published: (2025)
Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking
by: Wu, Shengqiong, et al.
Published: (2026)
OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control
by: Huang, Yuzhong, et al.
Published: (2024)
Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation
by: Chen, Wei, et al.
Published: (2026)
GTAutoAct: An Automatic Datasets Generation Framework Based on Game Engine Redevelopment for Action Recognition
by: Song, Xingyu, et al.
Published: (2024)
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
by: Huang, Yupan, et al.
Published: (2023)
Hierarchical Fine-grained Preference Optimization for Physically Plausible Video Generation
by: Chen, Harold Haodong, et al.
Published: (2025)
GameTileNet: A Semantic Dataset for Low-Resolution Game Art in Procedural Content Generation
by: Chen, Yi-Chun, et al.
Published: (2025)
PrivImage: Differentially Private Synthetic Image Generation using Diffusion Models with Semantic-Aware Pretraining
by: Li, Kecen, et al.
Published: (2023)
GUI Agents for Continual Game Generation
by: Huang, Yixu, et al.
Published: (2026)
GameIR: A Large-Scale Synthesized Ground-Truth Dataset for Image Restoration over Gaming Content
by: Zhou, Lebin, et al.
Published: (2024)