| Main Authors: | Lillemark, Hansen Jin, Huang, Benhao, Zhan, Fangneng, Du, Yilun, Keller, Thomas Anderson |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.01075 |
Similar Items
AAPMT: AGI Assessment Through Prompt and Metric Transformer
by: Huang, Benhao
Published: (2024)
Flow Equivariant Recurrent Neural Networks
by: Keller, T. Anderson
Published: (2025)
MixLight: Borrowing the Best of both Spherical Harmonics and Gaussian Models
by: Ji, Xinlong, et al.
Published: (2024)
Abstract 3D Perception for Spatial Intelligence in Vision-Language Models
by: Liu, Yifan, et al.
Published: (2025)
GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation
by: Zhou, Kaichen, et al.
Published: (2026)
AdaWorld: Learning Adaptable World Models with Latent Actions
by: Gao, Shenyuan, et al.
Published: (2025)
AREA3D: Active Reconstruction Agent with Unified Feed-Forward 3D Perception and Vision-Language Guidance
by: Xu, Tianling, et al.
Published: (2025)
Defining and Extracting generalizable interaction primitives from DNNs
by: Chen, Lu, et al.
Published: (2024)
Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models
by: Wang, Runqian, et al.
Published: (2025)
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments
by: Zhou, Qinhong, et al.
Published: (2024)
Equivariant Reinforcement Learning under Partial Observability
by: Nguyen, Hai, et al.
Published: (2024)
Long-Text-to-Image Generation via Compositional Prompt Decomposition
by: Huang, Jen-Yuan, et al.
Published: (2026)
PPU-Bench: Real World Benchmark for Personalized Partial Unlearning in Vision Language Models
by: Guang, Jiahui, et al.
Published: (2026)
Compositional Generative Modeling: A Single Model is Not All You Need
by: Du, Yilun, et al.
Published: (2024)
Stream3D: Sequential Multi-View 3D Generation via Evidential Memory
by: Zhou, Kaichen, et al.
Published: (2026)
Grounding Video Models to Actions through Goal Conditioned Exploration
by: Luo, Yunhao, et al.
Published: (2024)
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment
by: Lee, Taekbeom, et al.
Published: (2024)
Equivariant Flow Matching for Point Cloud Assembly
by: Wang, Ziming, et al.
Published: (2025)
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
by: Yang, Yuncong, et al.
Published: (2025)
DiffAge3D: Diffusion-based 3D-aware Face Aging
by: Wahid, Junaid, et al.
Published: (2024)
SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting
by: Zhang, Jiahui, et al.
Published: (2025)
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models
by: Chen, Kaijin, et al.
Published: (2026)
Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts
by: Kim, Hyunsu, et al.
Published: (2024)
SPIE: Semantic and Structural Post-Training of Image Editing Diffusion Models with AI feedback
by: Benarous, Elior, et al.
Published: (2025)
Video as the New Language for Real-World Decision Making
by: Yang, Sherry, et al.
Published: (2024)
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments
by: Kwon, Soonwoo, et al.
Published: (2025)
General Neural Gauge Fields
by: Zhan, Fangneng, et al.
Published: (2023)
seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models
by: Ghaemi, Hafez, et al.
Published: (2025)
Large-scale Reinforcement Learning for Diffusion Models
by: Zhang, Yinan, et al.
Published: (2024)
Visual Acoustic Fields
by: Li, Yuelei, et al.
Published: (2025)
Dynamic Orchestration of Multi-Agent System for Real-World Multi-Image Agricultural VQA
by: Ke, Yan, et al.
Published: (2025)
Ctrl-VI: Controllable Video Synthesis via Variational Inference
by: Duan, Haoyi, et al.
Published: (2025)
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
by: Zhang, Hongxin, et al.
Published: (2024)
Observe-R1: Unlocking Reasoning Abilities of MLLMs with Dynamic Progressive Reinforcement Learning
by: Guo, Zirun, et al.
Published: (2025)
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation
by: Gao, Qiyue, et al.
Published: (2025)
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields
by: Chi, Yu, et al.
Published: (2023)
FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization
by: Zhang, Jiahui, et al.
Published: (2024)
MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
by: Xu, Muyu, et al.
Published: (2025)
MIND: Benchmarking Memory Consistency and Action Control in World Models
by: Ye, Yixuan, et al.
Published: (2026)
3D-VLA: A 3D Vision-Language-Action Generative World Model
by: Zhen, Haoyu, et al.
Published: (2024)