| Main Authors: | Etaat, Daniel, Kalaria, Dvij, Rahmanian, Nima, Sastry, Shankar |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.20936 |
Similar Items
TT4D: A Pipeline and Dataset for Table Tennis 4D Reconstruction From Monocular Videos
by: Rahmanian, Nima, et al.
Published: (2026)
Real-time Accident Anticipation for Autonomous Driving Through Monocular Depth-Enhanced 3D Modeling
by: Liao, Haicheng, et al.
Published: (2024)
Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer
by: Kienzle, Daniel, et al.
Published: (2025)
On Moving Object Segmentation from Monocular Video with Transformers
by: Homeyer, Christian, et al.
Published: (2024)
α-RACER: Real-Time Algorithm for Game-Theoretic Motion Planning and Control in Autonomous Racing using Near-Potential Function
by: Kalaria, Dvij, et al.
Published: (2024)
GFlow: Recovering 4D World from Monocular Video
by: Wang, Shizun, et al.
Published: (2024)
Towards Scene Graph Anticipation
by: Peddi, Rohith, et al.
Published: (2024)
VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model
by: Zuo, Qi, et al.
Published: (2024)
MV-S2V: Multi-View Subject-Consistent Video Generation
by: Song, Ziyang, et al.
Published: (2026)
Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation
by: Luo, Yuanhao, et al.
Published: (2026)
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
by: Lv, Zhen, et al.
Published: (2024)
iHuman: Instant Animatable Digital Humans From Monocular Videos
by: Paudel, Pramish, et al.
Published: (2024)
Automated Tennis Player and Ball Tracking with Court Keypoints Detection (Hawk Eye System)
by: Desu, Venkata Manikanta, et al.
Published: (2025)
Semantically Guided Representation Learning For Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)
VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video
by: Liu, Yu, et al.
Published: (2025)
Endo3R: Unified Online Reconstruction from Dynamic Monocular Endoscopic Video
by: Guo, Jiaxin, et al.
Published: (2025)
MV-RAG: Retrieval Augmented Multiview Diffusion
by: Dayani, Yosef, et al.
Published: (2025)
Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation Challenge
by: Cho, Hyunjin, et al.
Published: (2024)
Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos
by: Liang, Hanxue, et al.
Published: (2024)
Semantically Guided Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)
MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer
by: Sarker, Sushmita, et al.
Published: (2024)
4D3R: Motion-Aware Neural Reconstruction and Rendering of Dynamic Scenes from Monocular Videos
by: Guo, Mengqi, et al.
Published: (2025)
Tamaththul3D: High-Fidelity 3D Saudi Sign Language Avatars from Monocular Video
by: Alghamdi, Eyad, et al.
Published: (2026)
Predict and Resist: Long-Term Accident Anticipation under Sensor Noise
by: Liu, Xingcheng, et al.
Published: (2025)
Intention-Guided Cognitive Reasoning for Egocentric Long-Term Action Anticipation
by: Chu, Qiaohui, et al.
Published: (2025)
GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion
by: Tang, Jiapeng, et al.
Published: (2024)
The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise
by: Ban, Yuanhao, et al.
Published: (2024)
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
by: Fang, Jianwu, et al.
Published: (2025)
LATTE: Latent Trajectory Embedding for Diffusion-Generated Image Detection
by: Vasilcoiu, Ana, et al.
Published: (2025)
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
by: YU, Mark, et al.
Published: (2025)
MsFIN: Multi-scale Feature Interaction Network for Traffic Accident Anticipation
by: Wu, Tongshuai, et al.
Published: (2025)
Focusable Monocular Depth Estimation
by: Du, Yuxin, et al.
Published: (2026)
GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video
by: Chen, Jingxuan
Published: (2024)
DiffVAS: Diffusion-Guided Visual Active Search in Partially Observable Environments
by: Sarkar, Anindya, et al.
Published: (2026)
GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video
by: Sharma, Arun
Published: (2026)
LATTE: Learning to Think with Vision Specialists
by: Ma, Zixian, et al.
Published: (2024)
Comparing Learning Paradigms for Egocentric Video Summarization
by: Wen, Daniel
Published: (2025)
Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025
by: Chu, Qiaohui, et al.
Published: (2025)
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds
by: Tang, Zhenggang, et al.
Published: (2024)
MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation
by: Cheng, Jintao, et al.
Published: (2024)