:: Library Catalog

Bibliographic Details
Main authors:	Li, Joshua; Cantu, Fernando Jose Pena; Yu, Emily; Wong, Alexander; Cui, Yuchen; Chen, Yuhao
Type:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online access:	https://arxiv.org/abs/2504.07867

Similar items

Zero-Shot Object Re-Identification in Egocentric Kitchen Videos via Multi-Stage SAM3 Feature Fusion
by: Klepachevskyi, Dmytro, et al.
Published: (2026)

Static Scene Reconstruction from Dynamic Egocentric Videos
by: Cui, Qifei, et al.
Published: (2026)

Zero-Shot Temporal Interaction Localization for Egocentric Videos
by: Zhang, Erhang, et al.
Published: (2025)

FoodTrack: Estimating Handheld Food Portions with Egocentric Video
by: Wang, Ervin, et al.
Published: (2025)

ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation
by: Li, Hongjie, et al.
Published: (2024)

EASG-Bench: Video Q&A Benchmark with Egocentric Action Scene Graphs
by: Rodin, Ivan, et al.
Published: (2025)

KitchenTwin: Semantically and Geometrically Grounded 3D Kitchen Digital Twins
by: Wu, Quanyun, et al.
Published: (2026)

Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
by: Yang, Zaiquan, et al.
Published: (2025)

Instance Tracking in 3D Scenes from Egocentric Videos
by: Zhao, Yunhan, et al.
Published: (2023)

Scaling Zero-Shot Reference-to-Video Generation
by: Zhou, Zijian, et al.
Published: (2025)

Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos
by: Yuan, Chengbo, et al.
Published: (2024)

DissolveStereo: Coarse Depth Injection for Zero-Shot Stereo Video Generation
by: Shi, Jian, et al.
Published: (2024)

Object-Shot Enhanced Grounding Network for Egocentric Video
by: Feng, Yisen, et al.
Published: (2025)

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding
by: Fan, Yue, et al.
Published: (2024)

HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model
by: Vo, Khoa, et al.
Published: (2024)

EgoGraph: Temporal Knowledge Graph for Egocentric Video Understanding
by: Sun, Shitong, et al.
Published: (2026)

Zero-Shot Video Deraining with Video Diffusion Models
by: Varanka, Tuomas, et al.
Published: (2025)

The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning
by: Tian, Mingkai, et al.
Published: (2025)

VideoPoet: A Large Language Model for Zero-Shot Video Generation
by: Kondratyuk, Dan, et al.
Published: (2023)

VIZOR: Viewpoint-Invariant Zero-Shot Scene Graph Generation for 3D Scene Reasoning
by: Madhavaram, Vivek, et al.
Published: (2026)

Zero-Shot Video Translation via Token Warping
by: Zhu, Haiming, et al.
Published: (2024)

DIFFVSGG: Diffusion-Driven Online Video Scene Graph Generation
by: Chen, Mu, et al.
Published: (2025)

Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation
by: Chen, Changgu, et al.
Published: (2024)

DriveVA: Video Action Models are Zero-Shot Drivers
by: Liu, Mengmeng, et al.
Published: (2026)

LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer
by: Chen, Changgu, et al.
Published: (2025)

Retrieval-Augmented Egocentric Video Captioning
by: Xu, Jilan, et al.
Published: (2024)

Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
by: Chen, Qirui, et al.
Published: (2024)

EgoX: Egocentric Video Generation from a Single Exocentric Video
by: Kang, Taewoong, et al.
Published: (2025)

FunRec: Reconstructing Functional 3D Scenes from Egocentric Interaction Videos
by: Delitzas, Alexandros, et al.
Published: (2026)

Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators
by: Zhang, Wentao, et al.
Published: (2024)

Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models
by: Cao, Cong, et al.
Published: (2026)

EgoLoc: A Generalizable Solution for Temporal Interaction Localization in Egocentric Videos
by: Ma, Junyi, et al.
Published: (2025)

ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
by: He, Xuanhua, et al.
Published: (2024)

LiveSVG: Zero-Shot SVG Animation via Video Generation
by: Levy, Matan, et al.
Published: (2026)

Are Image-to-Video Models Good Zero-Shot Image Editors?
by: Zhang, Zechuan, et al.
Published: (2025)

HyperGLM: HyperGraph for Video Scene Graph Generation and Anticipation
by: Nguyen, Trong-Thuan, et al.
Published: (2024)

SceneGraphVLM: Dynamic Scene Graph Generation from Video with Vision-Language Models
by: Makarov, Vladislav, et al.
Published: (2026)

FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
by: Yang, Shuai, et al.
Published: (2024)

ShotDirector: Directorially Controllable Multi-Shot Video Generation with Cinematographic Transitions
by: Wu, Xiaoxue, et al.
Published: (2025)

MultiEgo: A Multi-View Egocentric Video Dataset for 4D Scene Reconstruction
by: Li, Bate, et al.
Published: (2025)