| Main Author: | Kianpisheh, Mohammad |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.05457 |
Similar Items
PREGEN: Uncovering Latent Thoughts in Composed Video Retrieval
by: Serussi, Gabriele, et al.
Published: (2026)
Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval
by: Liu, Haowei, et al.
Published: (2024)
Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
by: Yu, Sihyun, et al.
Published: (2024)
LTX-Video: Realtime Video Latent Diffusion
by: HaCohen, Yoav, et al.
Published: (2024)
Long Video Understanding with Learnable Retrieval in Video-Language Models
by: Xu, Jiaqi, et al.
Published: (2023)
Simplifying Traffic Anomaly Detection with Video Foundation Models
by: Orlova, Svetlana, et al.
Published: (2025)
CoVA: Text-Guided Composed Video Retrieval for Audio-Visual Content
by: Han, Gyuwon, et al.
Published: (2026)
Video Generation Models Are Good Latent Reward Models
by: Mi, Xiaoyue, et al.
Published: (2025)
InterAct-Video: Reasoning-Rich Video QA for Urban Traffic
by: Vishal, Joseph Raj, et al.
Published: (2025)
Nip Rumors in the Bud: Retrieval-Guided Topic-Level Adaptation for Test-Time Fake News Video Detection
by: Lang, Jian, et al.
Published: (2026)
Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval
by: Liu, Weijia, et al.
Published: (2025)
Latent Video Dataset Distillation
by: Li, Ning, et al.
Published: (2025)
Video Generation with Predictive Latents
by: Zhao, Yian, et al.
Published: (2026)
LVMark: Robust Watermark for Latent Video Diffusion Models
by: Jang, MinHyuk, et al.
Published: (2024)
Multimodal Lengthy Videos Retrieval Framework and Evaluation Metric
by: Eltahir, Mohamed, et al.
Published: (2025)
ReVideo: Remake a Video with Motion and Content Control
by: Mou, Chong, et al.
Published: (2024)
Detection of Micromobility Vehicles in Urban Traffic Videos
by: Sabri, Khalil, et al.
Published: (2024)
Improved Video VAE for Latent Video Diffusion Model
by: Wu, Pingyu, et al.
Published: (2024)
Latent Space Probing for Adult Content Detection in Video Generative Models
by: Khatri, Alizishaan, et al.
Published: (2026)
Adversarial Video Promotion Against Text-to-Video Retrieval
by: Tian, Qiwei, et al.
Published: (2025)
Video Editing for Video Retrieval
by: Zhu, Bin, et al.
Published: (2024)
Video-based Pedestrian and Vehicle Traffic Analysis During Football Games
by: Fleischer, Jacques P., et al.
Published: (2024)
ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction
by: Islam, Md Zabirul, et al.
Published: (2025)
TrafficLens: Multi-Camera Traffic Video Analysis Using LLMs
by: Arefeen, Md Adnan, et al.
Published: (2025)
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
by: Liu, Haitong, et al.
Published: (2025)
VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
by: Yuan, Zhihang, et al.
Published: (2025)
LatentColorization: Latent Diffusion-Based Speaker Video Colorization
by: Ward, Rory, et al.
Published: (2024)
Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion
by: Ahmed, Syed Hammad, et al.
Published: (2024)
Motion-aware Latent Diffusion Models for Video Frame Interpolation
by: Huang, Zhilin, et al.
Published: (2024)
Seer: Language Instructed Video Prediction with Latent Diffusion Models
by: Gu, Xianfan, et al.
Published: (2023)
DrVideo: Document Retrieval Based Long Video Understanding
by: Ma, Ziyu, et al.
Published: (2024)
VideoStudio: Generating Consistent-Content and Multi-Scene Videos
by: Long, Fuchen, et al.
Published: (2024)
TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes
by: Zhou, Xingcheng, et al.
Published: (2025)
Reasoning Text-to-Video Retrieval via Digital Twin Video Representations and Large Language Models
by: Shen, Yiqing, et al.
Published: (2025)
VisTopics: A Visual Semantic Unsupervised Approach to Topic Modeling of Video and Image Data
by: Lokmanoglu, Ayse D, et al.
Published: (2025)
Mining Platoon Patterns from Traffic Videos
by: Bei, Yijun, et al.
Published: (2024)
Latte: Latent Diffusion Transformer for Video Generation
by: Ma, Xin, et al.
Published: (2024)
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
by: Di, Shangzhe, et al.
Published: (2025)
Video-based Traffic Light Recognition by Rockchip RV1126 for Autonomous Driving
by: Fan, Miao, et al.
Published: (2025)
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
by: Jin, Xiaojie, et al.
Published: (2023)