| Main Authors: | Sun, Guanxiong, Hua, Yang, Hu, Guosheng, Robertson, Neil |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.09257 |
Similar Items
Efficient One-stage Video Object Detection by Exploiting Temporal Consistency
by: Sun, Guanxiong, et al.
Published: (2024)
MAMBA: Multi-level Aggregation via Memory Bank for Video Object Detection
by: Sun, Guanxiong, et al.
Published: (2024)
Spatio-temporal Prompting Network for Robust Video Feature Extraction
by: Sun, Guanxiong, et al.
Published: (2024)
FTDMamba: Frequency-Assisted Temporal Dilation Mamba for Unmanned Aerial Vehicle Video Anomaly Detection
by: Liu, Cheng-Zhuang, et al.
Published: (2026)
Sparse-Dense Side-Tuner for efficient Video Temporal Grounding
by: Pujol-Perich, David, et al.
Published: (2025)
Exploring Temporal Event Cues for Dense Video Captioning in Cyclic Co-learning
by: Xie, Zhuyang, et al.
Published: (2024)
Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks
by: Yang, Min, et al.
Published: (2024)
Moment Quantization for Video Temporal Grounding
by: Sun, Xiaolong, et al.
Published: (2025)
Unified Dense Prediction of Video Diffusion
by: Yang, Lehan, et al.
Published: (2025)
Self-Diffusion Driven Blind Imaging
by: Yang, Yanlong, et al.
Published: (2025)
STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
by: Chen, Junyang, et al.
Published: (2025)
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
by: Wu, Ziyi, et al.
Published: (2025)
Number it: Temporal Grounding Videos like Flipping Manga
by: Wu, Yongliang, et al.
Published: (2024)
Dense Video Object Captioning from Disjoint Supervision
by: Zhou, Xingyi, et al.
Published: (2023)
Streaming Dense Video Captioning
by: Zhou, Xingyi, et al.
Published: (2024)
Task Indicating Transformer for Task-conditional Dense Predictions
by: Lu, Yuxiang, et al.
Published: (2024)
Video-Language Alignment via Spatio-Temporal Graph Transformer
by: Zhang, Shi-Xue, et al.
Published: (2024)
DeCo: Decoupled Human-Centered Diffusion Video Editing with Motion Consistency
by: Zhong, Xiaojing, et al.
Published: (2024)
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos
by: Kong, Fanheng, et al.
Published: (2025)
Emergent Temporal Correspondences from Video Diffusion Transformers
by: Nam, Jisu, et al.
Published: (2025)
VideoCoF: Unified Video Editing with Temporal Reasoner
by: Yang, Xiangpeng, et al.
Published: (2025)
Dynamic View Synthesis from Small Camera Motion Videos
by: Sun, Huiqiang, et al.
Published: (2025)
Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
by: Xi, Haocheng, et al.
Published: (2025)
DeVAn: Dense Video Annotation for Video-Language Models
by: Liu, Tingkai, et al.
Published: (2023)
Technical Report for Soccernet 2023 -- Dense Video Captioning
by: Ruan, Zheng, et al.
Published: (2024)
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
by: Lee, Ji Soo, et al.
Published: (2025)
TemporalVLM: Video LLMs for Temporal Reasoning in Long Videos
by: Fateh, Fawad Javed, et al.
Published: (2024)
TA-Prompting: Enhancing Video Large Language Models for Dense Video Captioning via Temporal Anchors
by: Cheng, Wei-Yuan, et al.
Published: (2026)
Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos
by: Jiang, Songtao, et al.
Published: (2026)
Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos
by: Luo, Xianrui, et al.
Published: (2025)
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
by: Ma, Junxian, et al.
Published: (2025)
SmartSight: Mitigating Hallucination in Video-LLMs Without Compromising Video Understanding via Temporal Attention Collapse
by: Sun, Yiming, et al.
Published: (2025)
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
by: Cheng, Zixu, et al.
Published: (2025)
Splatter a Video: Video Gaussian Representation for Versatile Processing
by: Sun, Yang-Tian, et al.
Published: (2024)
VideoCompressa: Data-Efficient Video Understanding via Joint Temporal Compression and Spatial Reconstruction
by: Wang, Shaobo, et al.
Published: (2025)
Explicit Temporal-Semantic Modeling for Dense Video Captioning via Context-Aware Cross-Modal Interaction
by: Jia, Mingda, et al.
Published: (2025)
TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
by: Yang, Zuhao, et al.
Published: (2025)
Subjective Portrait Region Cropping in Landscape Videos with Temporal Annotation Smoothing
by: Lee, Cheng-Han, et al.
Published: (2026)
Adaptive Dense Evidence Refinement for Video Relational Reasoning for VRR-QA Challenge
by: Sun, Yuyang, et al.
Published: (2026)
Described Spatial-Temporal Video Detection
by: Ji, Wei, et al.
Published: (2024)