Saved in:
| Main Authors: | , |
|---|---|
| Format: | Preprint |
| Published: |
2025
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.03173 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866916780279070720 |
|---|---|
| author | Liu, Xiaoyi Tang, Hao |
| author_facet | Liu, Xiaoyi Tang, Hao |
| contents | Physical intelligence -- anticipating and shaping the world from partial, multisensory observations -- is critical for next-generation world models. We propose FOLIAGE, a physics-informed multimodal world model for unbounded accretive surface growth. In its Action-Perception loop, a unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state. A physics-aware predictor, conditioned on physical control actions, advances this latent state in time to align with the target latent of the surface, yielding a Modality-Agnostic Growth Embedding (MAGE) that interfaces with critic heads for downstream objectives. FOLIAGE's Accretive Graph Network (AGN) captures dynamic connectivity through Age Positional Encoding and Energy-Gated Message-Passing. Geometry-Correspondence Fusion and Cross-Patch Masking enhance MAGE's expressiveness, while Hierarchical Pooling balances global context with local dynamics. We create SURF-GARDEN, a world model learning platform comprising a Counterfactual Physics Simulator, a Multimodal Correspondence Extractor, and Evolution Tracing, which generates 7,200 diverse surface-growth sequences. SURF-BENCH, our physical-intelligence evaluation suite, evaluates six core tasks -- topology recognition, inverse material estimation, growth-stage classification, latent roll-out, cross-modal retrieval, and dense correspondence -- and four stress tests -- sensor dropout, zero-shot modality transfer, long-horizon prediction, and physics ablation -- to probe resilience. FOLIAGE outperforms specialized baselines while remaining robust across dynamic environments, establishing a new world-model based, multimodal pathway to physical intelligence. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2506_03173 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| spellingShingle | FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution Liu, Xiaoyi Tang, Hao Computer Vision and Pattern Recognition Artificial Intelligence Physical intelligence -- anticipating and shaping the world from partial, multisensory observations -- is critical for next-generation world models. We propose FOLIAGE, a physics-informed multimodal world model for unbounded accretive surface growth. In its Action-Perception loop, a unified context encoder maps images, mesh connectivity, and point clouds to a shared latent state. A physics-aware predictor, conditioned on physical control actions, advances this latent state in time to align with the target latent of the surface, yielding a Modality-Agnostic Growth Embedding (MAGE) that interfaces with critic heads for downstream objectives. FOLIAGE's Accretive Graph Network (AGN) captures dynamic connectivity through Age Positional Encoding and Energy-Gated Message-Passing. Geometry-Correspondence Fusion and Cross-Patch Masking enhance MAGE's expressiveness, while Hierarchical Pooling balances global context with local dynamics. We create SURF-GARDEN, a world model learning platform comprising a Counterfactual Physics Simulator, a Multimodal Correspondence Extractor, and Evolution Tracing, which generates 7,200 diverse surface-growth sequences. SURF-BENCH, our physical-intelligence evaluation suite, evaluates six core tasks -- topology recognition, inverse material estimation, growth-stage classification, latent roll-out, cross-modal retrieval, and dense correspondence -- and four stress tests -- sensor dropout, zero-shot modality transfer, long-horizon prediction, and physics ablation -- to probe resilience. FOLIAGE outperforms specialized baselines while remaining robust across dynamic environments, establishing a new world-model based, multimodal pathway to physical intelligence. |
| title | FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution |
| topic | Computer Vision and Pattern Recognition Artificial Intelligence |
| url | https://arxiv.org/abs/2506.03173 |