| Main Authors: | Duan, Yinglin, Zou, Zhengxia, Gu, Tongwei, Jia, Wei, Zhao, Zhan, Xu, Luyi, Liu, Xinzhu, Lin, Yenan, Jiang, Hao, Chen, Kang, Qiu, Shuang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.05263 |
Similar Items
WorldGPT: Empowering LLM as Multimodal World Model
by: Ge, Zhiqi, et al.
Published: (2024)
MetaEarth3D: Unlocking World-scale 3D Generation with Spatially Scalable Generative Modeling
by: Cao, Jinqi, et al.
Published: (2026)
TriDF: Triplane-Accelerated Density Fields for Few-Shot Remote Sensing Novel View Synthesis
by: Kang, Jiaming, et al.
Published: (2025)
BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting
by: Wu, Yongchang, et al.
Published: (2025)
Code2Worlds: Empowering Coding LLMs for 4D World Generation
by: Zhang, Yi, et al.
Published: (2026)
The Moderating Effect of Informal Institutions: Clans and Straw Burning in China
by: Liang Tang, et al.
Published: (2025)
MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph
by: Li, Manyu, et al.
Published: (2026)
WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction
by: Liu, Chengzhi, et al.
Published: (2026)
Unbiased Dynamic Multimodal Fusion
by: Wei, Shicai, et al.
Published: (2026)
MotionScape: A Large-Scale Real-World Highly Dynamic UAV Video Dataset for World Models
by: Guo, Zile, et al.
Published: (2026)
Efficient Semantic Splatting for Remote Sensing Multi-view Segmentation
by: Qi, Zipeng, et al.
Published: (2024)
ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection
by: Sun, Zhihao, et al.
Published: (2024)
Synergistic double‐doped elastic composites for durable, ultra‐flexible sign language translation sensors
by: Tongshun Wu, et al.
Published: (2025)
Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis
by: Liu, Chenyang, et al.
Published: (2024)
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models
by: Zhou, Runjie, et al.
Published: (2026)
The Complex and Challenging World of the Host–Pathogen Interaction
by: Marcel I. Ramirez
Published: (2024)
Spatial-Temporal Human-Object Interaction Detection
by: Sun, Xu, et al.
Published: (2025)
SWAN: World-Aware Adaptive Multimodal Networks for Runtime Variations
by: Wu, Jason, et al.
Published: (2026)
BiTAgent: A Task-Aware Modular Framework for Bidirectional Coupling between Multimodal Large Language Models and World Models
by: Zhan, Yu-Wei, et al.
Published: (2025)
OCC-MLLM:Empowering Multimodal Large Language Model For the Understanding of Occluded Objects
by: Qiu, Wenmo, et al.
Published: (2024)
WSSM: Geographic-enhanced hierarchical state-space model for global station weather forecast
by: Yang, Songru, et al.
Published: (2025)
CloudMamba: An Uncertainty-Guided Dual-Scale Mamba Network for Cloud Detection in Remote Sensing Imagery
by: Yang, Jiajun, et al.
Published: (2026)
Empowering Multi-Robot Cooperation via Sequential World Models
by: Zhao, Zijie, et al.
Published: (2025)
RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control
by: Li, Teng, et al.
Published: (2025)
A Performance Investigation of Multimodal Multiobjective Optimization Algorithms in Solving Two Types of Real-World Problems
by: Chen, Zhiqiu, et al.
Published: (2024)
Kolmogorov Arnold Neural Interpolator for Downscaling and Correcting Meteorological Fields from In-Situ Observations
by: Liu, Zili, et al.
Published: (2025)
CDMamba: Incorporating Local Clues into Mamba for Remote Sensing Image Binary Change Detection
by: Zhang, Haotian, et al.
Published: (2024)
FoBa: A Foreground-Background co-Guided Method and New Benchmark for Remote Sensing Semantic Change Detection
by: Zhang, Haotian, et al.
Published: (2025)
Matrix-Game: Interactive World Foundation Model
by: Zhang, Yifan, et al.
Published: (2025)
Phase Repair for Time-Domain Convolutional Neural Networks in Music Super-Resolution
by: Zhang, Yenan, et al.
Published: (2023)
Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
by: Zhu, Muzhi, et al.
Published: (2025)
Olaf-World: Orienting Latent Actions for Video World Modeling
by: Jiang, Yuxin, et al.
Published: (2026)
GigaWorld-0: World Models as Data Engine to Empower Embodied AI
by: GigaWorld Team, et al.
Published: (2025)
Harnessing Multimodal Large Language Models for Personalized Product Search with Query-aware Refinement
by: Zhang, Beibei, et al.
Published: (2025)
Les Dissonances: Cross-Tool Harvesting and Polluting in Pool-of-Tools Empowered LLM Agents
by: Li, Zichuan, et al.
Published: (2025)
Lattice Boltzmann Model for Learning Real-World Pixel Dynamicity
by: Zheng, Guangze, et al.
Published: (2025)
WorldMark: A Unified Benchmark Suite for Interactive Video World Models
by: Xu, Xiaojie, et al.
Published: (2026)
OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model
by: Zhang, Zhenhao, et al.
Published: (2025)
Empowering Teachers to Build a Better World
Published: (2020)
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
by: Guo, Junliang, et al.
Published: (2025)