| Main Authors: | Li, Longfei, Fan, Zhiwen, Cong, Wenyan, Liu, Xinhang, Yin, Yuyang, Foutter, Matt, Pan, Panwang, You, Chenyu, Wang, Yue, Wang, Zhangyang, Zhao, Yao, Pavone, Marco, Wei, Yunchao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.07978 |
Similar Items
E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models
by: Cong, Wenyan, et al.
Published: (2025)
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency
by: Yin, Yuyang, et al.
Published: (2023)
InstantSplat: Sparse-view Gaussian Splatting in Seconds
by: Fan, Zhiwen, et al.
Published: (2024)
Can Test-Time Scaling Improve World Foundation Model?
by: Cong, Wenyan, et al.
Published: (2025)
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment
by: Cong, Wenyan, et al.
Published: (2025)
Large Spatial Model: End-to-end Unposed Images to Semantic 3D
by: Fan, Zhiwen, et al.
Published: (2024)
RoboMonkey: Scaling Test-Time Sampling and Verification for Vision-Language-Action Models
by: Kwok, Jacky, et al.
Published: (2025)
StereoWorld: Geometry-Aware Monocular-to-Stereo Video Generation
by: Xing, Ke, et al.
Published: (2025)
Real-Time Anomaly Detection and Reactive Planning with Large Language Models
by: Sinha, Rohan, et al.
Published: (2024)
Diffusion4D: Fast Spatial-temporal Consistent 4D Generation via Video Diffusion Models
by: Liang, Hanwen, et al.
Published: (2024)
Learning Traffic Crashes as Language: Datasets, Benchmarks, and What-if Causal Analyses
by: Fan, Zhiwen, et al.
Published: (2024)
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
by: Zhu, Zehao, et al.
Published: (2023)
Realistic Extreme Behavior Generation for Improved AV Testing
by: Dyro, Robert, et al.
Published: (2024)
Vision Foundation Model Embedding-Based Semantic Anomaly Detection
by: Ronecker, Max Peter, et al.
Published: (2025)
PanoWorld-X: Generating Explorable Panoramic Worlds via Sphere-Aware Video Diffusion
by: Yin, Yuyang, et al.
Published: (2025)
4K4DGen: Panoramic 4D Generation at 4K Resolution
by: Li, Renjie, et al.
Published: (2024)
DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling
by: Wen, Kairun, et al.
Published: (2025)
SpatialTree: How Spatial Abilities Branch Out in MLLMs
by: Xiao, Yuxi, et al.
Published: (2025)
Space-LLaVA: a Vision-Language Model Adapted to Extraterrestrial Applications
by: Foutter, Matthew, et al.
Published: (2024)
ReachBot Field Tests in a Mojave Desert Lava Tube as a Martian Analog
by: Chen, Tony G., et al.
Published: (2024)
Egocentric World Model for Photorealistic Hand-Object Interaction Synthesis
by: Li, Dayou, et al.
Published: (2026)
Martian Exploration of Lava Tubes (MELT) with ReachBot: Scientific Investigation and Concept of Operations
by: Di, Julia, et al.
Published: (2024)
INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing
by: Abi-Karam, Stefan, et al.
Published: (2023)
CIPHER: Culvert Inspection through Pairwise Frame Selection and High-Efficiency Reconstruction
by: Lee, Seoyoung, et al.
Published: (2026)
LLM-AutoDiff: Auto-Differentiate Any LLM Workflow
by: Yin, Li, et al.
Published: (2025)
PlainQAFact: Retrieval-augmented Factual Consistency Evaluation Metric for Biomedical Plain Language Summarization
by: You, Zhiwen, et al.
Published: (2025)
GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting
by: Li, Chenxin, et al.
Published: (2024)
CryoFastAR: Fast Cryo-EM Ab Initio Reconstruction Made Easy
by: Zhang, Jiakai, et al.
Published: (2025)
Expressive Gaussian Human Avatars from Monocular RGB Video
by: Hu, Hezhen, et al.
Published: (2024)
Extrapolated Urban View Synthesis Benchmark
by: Han, Xiangyu, et al.
Published: (2024)
InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior
by: Lin, Chenguo, et al.
Published: (2024)
PACE: Pacing Operator Learning to Accurate Optical Field Simulation for Complicated Photonic Devices
by: Zhu, Hanqing, et al.
Published: (2024)
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
by: Zhou, Shijie, et al.
Published: (2025)
Scale Where It Matters: Training-Free Localized Scaling for Diffusion Models
by: Ren, Qin, et al.
Published: (2025)
A Stabilized High‐Order Spectral Model With Adaptive Residual‐Based Artificial Viscosity for Fully‐Nonlinear Free‐Surface Flow
by: Cong, Longfei, et al.
Published: (2025)
InfoAffect: Affective Annotations of Infographics in Information Spread
by: Fu, Zihang, et al.
Published: (2025)
Enhance-A-Video: Better Generated Video for Free
by: Luo, Yang, et al.
Published: (2025)
APOLLO: SGD-like Memory, AdamW-level Performance
by: Zhu, Hanqing, et al.
Published: (2024)
HumanCrafter: Synergizing Generalizable Human Reconstruction and Semantic 3D Segmentation
by: Pan, Panwang, et al.
Published: (2025)
Characterizing the current systems in the Martian ionosphere
by: Gao, Jiawei, et al.
Published: (2024)