Library Catalog

Bibliographic Details
Main Author:	Kianpisheh, Mohammad
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2502.05457

Similar Items

PREGEN: Uncovering Latent Thoughts in Composed Video Retrieval
by: Serussi, Gabriele, et al.
Published: (2026)

Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval
by: Liu, Haowei, et al.
Published: (2024)

Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition
by: Yu, Sihyun, et al.
Published: (2024)

LTX-Video: Realtime Video Latent Diffusion
by: HaCohen, Yoav, et al.
Published: (2024)

Long Video Understanding with Learnable Retrieval in Video-Language Models
by: Xu, Jiaqi, et al.
Published: (2023)

Simplifying Traffic Anomaly Detection with Video Foundation Models
by: Orlova, Svetlana, et al.
Published: (2025)

CoVA: Text-Guided Composed Video Retrieval for Audio-Visual Content
by: Han, Gyuwon, et al.
Published: (2026)

Video Generation Models Are Good Latent Reward Models
by: Mi, Xiaoyue, et al.
Published: (2025)

InterAct-Video: Reasoning-Rich Video QA for Urban Traffic
by: Vishal, Joseph Raj, et al.
Published: (2025)

Nip Rumors in the Bud: Retrieval-Guided Topic-Level Adaptation for Test-Time Fake News Video Detection
by: Lang, Jian, et al.
Published: (2026)

Denoise-then-Retrieve: Text-Conditioned Video Denoising for Video Moment Retrieval
by: Liu, Weijia, et al.
Published: (2025)

Latent Video Dataset Distillation
by: Li, Ning, et al.
Published: (2025)

Video Generation with Predictive Latents
by: Zhao, Yian, et al.
Published: (2026)

LVMark: Robust Watermark for Latent Video Diffusion Models
by: Jang, MinHyuk, et al.
Published: (2024)

Multimodal Lengthy Videos Retrieval Framework and Evaluation Metric
by: Eltahir, Mohamed, et al.
Published: (2025)

ReVideo: Remake a Video with Motion and Content Control
by: Mou, Chong, et al.
Published: (2024)

Detection of Micromobility Vehicles in Urban Traffic Videos
by: Sabri, Khalil, et al.
Published: (2024)

Improved Video VAE for Latent Video Diffusion Model
by: Wu, Pingyu, et al.
Published: (2024)

Latent Space Probing for Adult Content Detection in Video Generative Models
by: Khatri, Alizishaan, et al.
Published: (2026)

Adversarial Video Promotion Against Text-to-Video Retrieval
by: Tian, Qiwei, et al.
Published: (2025)

Video Editing for Video Retrieval
by: Zhu, Bin, et al.
Published: (2024)

Video-based Pedestrian and Vehicle Traffic Analysis During Football Games
by: Fleischer, Jacques P., et al.
Published: (2024)

ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction
by: Islam, Md Zabirul, et al.
Published: (2025)

TrafficLens: Multi-Camera Traffic Video Analysis Using LLMs
by: Arefeen, Md Adnan, et al.
Published: (2025)

Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
by: Liu, Haitong, et al.
Published: (2025)

VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
by: Yuan, Zhihang, et al.
Published: (2025)

LatentColorization: Latent Diffusion-Based Speaker Video Colorization
by: Ward, Rory, et al.
Published: (2024)

Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion
by: Ahmed, Syed Hammad, et al.
Published: (2024)

Motion-aware Latent Diffusion Models for Video Frame Interpolation
by: Huang, Zhilin, et al.
Published: (2024)

Seer: Language Instructed Video Prediction with Latent Diffusion Models
by: Gu, Xianfan, et al.
Published: (2023)

DrVideo: Document Retrieval Based Long Video Understanding
by: Ma, Ziyu, et al.
Published: (2024)

VideoStudio: Generating Consistent-Content and Multi-Scene Videos
by: Long, Fuchen, et al.
Published: (2024)

TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes
by: Zhou, Xingcheng, et al.
Published: (2025)

Reasoning Text-to-Video Retrieval via Digital Twin Video Representations and Large Language Models
by: Shen, Yiqing, et al.
Published: (2025)

VisTopics: A Visual Semantic Unsupervised Approach to Topic Modeling of Video and Image Data
by: Lokmanoglu, Ayse D, et al.
Published: (2025)

Mining Platoon Patterns from Traffic Videos
by: Bei, Yijun, et al.
Published: (2024)

Latte: Latent Diffusion Transformer for Video Generation
by: Ma, Xin, et al.
Published: (2024)

Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
by: Di, Shangzhe, et al.
Published: (2025)

Video-based Traffic Light Recognition by Rockchip RV1126 for Autonomous Driving
by: Fan, Miao, et al.
Published: (2025)

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
by: Jin, Xiaojie, et al.
Published: (2023)