| Main Author: | Huang, Jiantang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.01512 |
Similar Items
Zero-Shot Action Recognition in Surveillance Videos
by: Pereira, Joao, et al.
Published: (2024)
Slow-Motion Video Synthesis for Basketball Using Frame Interpolation
by: Huang, Jiantang
Published: (2025)
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
by: Yang, Shuai, et al.
Published: (2024)
A Modular Zero-Shot Pipeline for Accident Detection, Localization, and Classification in Traffic Surveillance Video
by: Thakur, Amey, et al.
Published: (2026)
Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence
by: Yang, Shuai, et al.
Published: (2025)
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
by: Yang, Zaiquan, et al.
Published: (2025)
TAG: A Simple Yet Effective Temporal-Aware Approach for Zero-Shot Video Temporal Grounding
by: Lee, Jin-Seop, et al.
Published: (2025)
VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT
by: Xu, Yifang, et al.
Published: (2024)
A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing
by: Li, Maomao, et al.
Published: (2023)
T2SGrid: Temporal-to-Spatial Gridification for Video Temporal Grounding
by: Guo, Chaohong, et al.
Published: (2026)
Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM
by: Kamboj, Payal, et al.
Published: (2025)
GRAZE: Grounded Refinement and Motion-Aware Zero-Shot Event Localization
by: Zaidi, Syed Ahsan Masud, et al.
Published: (2026)
Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding
by: Xiong, Yuanhao, et al.
Published: (2023)
TRACE: Temporal Grounding Video LLM via Causal Event Modeling
by: Guo, Yongxin, et al.
Published: (2024)
VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing
by: Couairon, Paul, et al.
Published: (2023)
GraphThinker: Reinforcing Temporally Grounded Video Reasoning with Event Graph Thinking
by: Cheng, Zixu, et al.
Published: (2026)
Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding
by: Zheng, Minghang, et al.
Published: (2025)
Zero-Shot Temporal Interaction Localization for Egocentric Videos
by: Zhang, Erhang, et al.
Published: (2025)
EZSR: Event-based Zero-Shot Recognition
by: Yang, Yan, et al.
Published: (2024)
Context-Guided Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2024)
WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding
by: Kong, Quan, et al.
Published: (2024)
Scaling Zero-Shot Reference-to-Video Generation
by: Zhou, Zijian, et al.
Published: (2025)
Foresee-to-Ground: From Predictive Temporal Perception to Evidence-Driven Reasoning for Video Temporal Grounding
by: Zheng, Zelin, et al.
Published: (2026)
GroundingAnomaly: Spatially-Grounded Diffusion for Few-Shot Anomaly Synthesis
by: Liu, Yishen, et al.
Published: (2026)
Multi-Stage VLM Pipeline for Zero-Shot Traffic Accident Understanding
by: Tatematsu, Fumiya, et al.
Published: (2026)
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
by: Liao, Ruotong, et al.
Published: (2024)
iFinder: Structured Zero-Shot Vision-Based LLM Grounding for Dash-Cam Video Reasoning
by: Yao, Manyi, et al.
Published: (2025)
Zero-Shot Video Deraining with Video Diffusion Models
by: Varanka, Tuomas, et al.
Published: (2025)
TrafficLoc: Localizing Traffic Surveillance Cameras in 3D Scenes
by: Xia, Yan, et al.
Published: (2024)
EvoGround: Self-Evolving Video Agents for Video Temporal Grounding
by: Jung, Minjoon, et al.
Published: (2026)
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
by: Dipta, Shubhashis Roy, et al.
Published: (2025)
Temporal-Visual Semantic Alignment: A Unified Architecture for Transferring Spatial Priors from Vision Models to Zero-Shot Temporal Tasks
by: Ma, Xiangkai, et al.
Published: (2025)
VISTA: Validation-Guided Integration of Spatial and Temporal Foundation Models with Anatomical Decoding for Rare-Pathology VCE Event Detection
by: Qiu, Bo-Cheng, et al.
Published: (2026)
Surveillance Video-Based Traffic Accident Detection Using Transformer Architecture
by: Singh, Tanu, et al.
Published: (2025)
Test-Time Zero-Shot Temporal Action Localization
by: Liberatori, Benedetta, et al.
Published: (2024)
MVP: Motion Vector Propagation for Zero-Shot Video Object Detection
by: Huang, Binhua, et al.
Published: (2025)
Clustering Aided Weakly Supervised Training to Detect Anomalous Events in Surveillance Videos
by: Zaheer, Muhammad Zaigham, et al.
Published: (2022)
Towards Long-Form Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2026)
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
by: Cohen, Nathaniel, et al.
Published: (2024)
Zero-TIG: Temporal Consistency-Aware Zero-Shot Illumination-Guided Low-light Video Enhancement
by: Li, Yini, et al.
Published: (2025)