| Main Authors: | Tomar, Manan, Hansen-Estruch, Philippe, Bachman, Philip, Lamb, Alex, Langford, John, Taylor, Matthew E., Levine, Sergey |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.09533 |
Similar Items
Unified Auto-Encoding with Masked Diffusion
by: Hansen-Estruch, Philippe, et al.
Published: (2024)
Towards Principled Representation Learning from Videos for Reinforcement Learning
by: Misra, Dipendra, et al.
Published: (2024)
Apollo: An Exploration of Video Understanding in Large Multimodal Models
by: Zohar, Orr, et al.
Published: (2024)
Unified Text-Image Generation with Weakness-Targeted Post-Training
by: Chen, Jiahui, et al.
Published: (2026)
Fast Occupancy Network
by: Lu, Mingjie, et al.
Published: (2024)
Vision-Language Models Provide Promptable Representations for Reinforcement Learning
by: Chen, William, et al.
Published: (2024)
ViTok-v2: Scaling Native Resolution Auto-Encoders to 5 Billion Parameters
by: Hansen-Estruch, Philippe, et al.
Published: (2026)
Phi-4-reasoning-vision-15B Technical Report
by: Aneja, Jyoti, et al.
Published: (2026)
Social LSTM with Dynamic Occupancy Modeling for Realistic Pedestrian Trajectory Prediction
by: Alia, Ahmed, et al.
Published: (2025)
Training Diffusion Models with Reinforcement Learning
by: Black, Kevin, et al.
Published: (2023)
Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots
by: Cui, Wei, et al.
Published: (2025)
CRUNet-MR-Univ: A Foundation Model for Diverse Cardiac MRI Reconstruction
by: Lyu, Donghang, et al.
Published: (2026)
Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification
by: Shah, Manan, et al.
Published: (2024)
FloodVision: Urban Flood Depth Estimation Using Foundation Vision-Language Models and Domain Knowledge Graph
by: Liu, Zhangding, et al.
Published: (2025)
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
by: Wang, Lening, et al.
Published: (2024)
Deep Radar Inverse Sensor Models for Dynamic Occupancy Grid Maps
by: Wei, Zihang, et al.
Published: (2023)
OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving
by: Shen, Yedong, et al.
Published: (2025)
ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)
Learning Additively Compositional Latent Actions for Embodied AI
by: Wei, Hangxing, et al.
Published: (2026)
OFMPNet: Deep End-to-End Model for Occupancy and Flow Prediction in Urban Environment
by: Murhij, Youshaa, et al.
Published: (2024)
OccSim: Multi-kilometer Simulation with Long-horizon Occupancy World Models
by: Liu, Tianran, et al.
Published: (2026)
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
by: Li, Runjia, et al.
Published: (2025)
Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
by: Chen, Dubing, et al.
Published: (2025)
KP-INR: A Dual-Branch Implicit Neural Representation Model for Cardiac Cine MRI Reconstruction
by: Lyu, Donghang, et al.
Published: (2025)
Video Motion Transfer with Diffusion Transformers
by: Pondaven, Alexander, et al.
Published: (2024)
Interpreting Physics in Video World Models
by: Joseph, Sonia, et al.
Published: (2026)
SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World
by: Chen, Chen, et al.
Published: (2025)
CVT-Occ: Cost Volume Temporal Fusion for 3D Occupancy Prediction
by: Ye, Zhangchen, et al.
Published: (2024)
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction
by: Yan, Chi, et al.
Published: (2025)
Revisit Human-Scene Interaction via Space Occupancy
by: Liu, Xinpeng, et al.
Published: (2023)
Multi-Label Classification Framework for Hurricane Damage Assessment
by: Liu, Zhangding, et al.
Published: (2025)
MCANet: A Multi-Scale Class-Specific Attention Network for Multi-Label Post-Hurricane Damage Assessment using UAV Imagery
by: Liu, Zhangding, et al.
Published: (2025)
SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries
by: Dang, Chenxu, et al.
Published: (2025)
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
by: Huang, Yuanhui, et al.
Published: (2024)
GSRender: Deduplicated Occupancy Prediction via Weakly Supervised 3D Gaussian Splatting
by: Sun, Qianpu, et al.
Published: (2024)
Gamified crowd-sourcing of high-quality data for visual fine-tuning
by: Yadav, Shashank, et al.
Published: (2024)
Latent Gaussian Splatting for 4D Panoptic Occupancy Tracking
by: Luz, Maximilian, et al.
Published: (2026)
QYOLO: Lightweight Object Detection via Quantum Inspired Shared Channel Mixing
by: Mittal, Garvit Kumar, et al.
Published: (2026)
SparseOccVLA: Bridging Occupancy and Vision-Language Models via Sparse Queries for Unified 4D Scene Understanding and Planning
by: Dang, Chenxu, et al.
Published: (2026)
3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation
by: Oh, Gyeongrok, et al.
Published: (2025)