Bibliographic Details
Main Authors: Feng, Ruili; Zhang, Han; Yang, Zhantao; Xiao, Jie; Shu, Zhilei; Liu, Zhiheng; Zheng, Andy; Huang, Yukun; Liu, Yu; Zhang, Hongyang
Format: Preprint
Published: 2024
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2412.03568
_version_ 1866912143942615040
author Feng, Ruili
Zhang, Han
Yang, Zhantao
Xiao, Jie
Shu, Zhilei
Liu, Zhiheng
Zheng, Andy
Huang, Yukun
Liu, Yu
Zhang, Hongyang
contents We present The Matrix, the first foundational realistic world simulator capable of generating continuous 720p high-fidelity real-scene video streams with real-time, responsive control in both first- and third-person perspectives, enabling immersive exploration of richly dynamic environments. Trained on limited supervised data from AAA games like Forza Horizon 5 and Cyberpunk 2077, complemented by large-scale unsupervised footage from real-world settings like Tokyo streets, The Matrix allows users to traverse diverse terrains -- deserts, grasslands, water bodies, and urban landscapes -- in continuous, uncut hour-long sequences. Operating at 16 FPS, the system supports real-time interactivity and demonstrates zero-shot generalization, translating virtual game environments to real-world contexts where collecting continuous movement data is often infeasible. For example, The Matrix can simulate a BMW X3 driving through an office setting -- an environment present in neither gaming data nor real-world sources. This approach showcases the potential of AAA game data to advance robust world models, bridging the gap between simulations and real-world applications in scenarios with limited data.
format Preprint
id arxiv_https___arxiv_org_abs_2412_03568
institution arXiv
publishDate 2024
record_format arxiv
title The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control
topic Artificial Intelligence
url https://arxiv.org/abs/2412.03568