:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Xunnong, Cao, Mengying
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2412.09828

Similar Items

Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model
by: Duan, Yifan, et al.
Published: (2024)

Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers
by: Xie, Jinxia, et al.
Published: (2024)

Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention
by: Samuel, Dvir, et al.
Published: (2026)

Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
by: Jindal, Swati, et al.
Published: (2024)

Spatio-Temporal Attention for Consistent Video Semantic Segmentation in Automated Driving
by: Varghese, Serin, et al.
Published: (2026)

Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model
by: Li, Peiyan, et al.
Published: (2026)

Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)

Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation
by: Li, Qirui, et al.
Published: (2025)

VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion
by: Tang, Linfeng, et al.
Published: (2025)

Resolving Spatio-Temporal Entanglement in Video Prediction via Multi-Modal Attention
by: Gupta, Shreyam, et al.
Published: (2025)

CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
by: Li, Lingen, et al.
Published: (2026)

DVFace: Spatio-Temporal Dual-Prior Diffusion for Video Face Restoration
by: Chen, Zheng, et al.
Published: (2026)

Autoregressive Video Autoencoder with Decoupled Temporal and Spatial Context
by: Shen, Cuifeng, et al.
Published: (2025)

Deepfake Detection with Spatio-Temporal Consistency and Attention
by: Chen, Yunzhuo, et al.
Published: (2025)

Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression
by: Zhen, Dingcheng, et al.
Published: (2025)

MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)

CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
by: Meng, Yihao, et al.
Published: (2026)

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
by: Xu, Dejia, et al.
Published: (2024)

Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection
by: Shen, Hao, et al.
Published: (2024)

CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering
by: Zhang, Mingfang, et al.
Published: (2026)

Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
by: Gao, Kaifeng, et al.
Published: (2024)

STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
by: Chen, Junyang, et al.
Published: (2025)

Context-Guided Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2024)

Causal Motion Diffusion Models for Autoregressive Motion Generation
by: Yu, Qing, et al.
Published: (2026)

Spatio-Temporal Garment Reconstruction Using Diffusion Mapping via Pattern Coordinates
by: You, Yingxuan, et al.
Published: (2026)

Video-Language Alignment via Spatio-Temporal Graph Transformer
by: Zhang, Shi-Xue, et al.
Published: (2024)

SpotFormer: Multi-Scale Spatio-Temporal Transformer for Facial Expression Spotting
by: Deng, Yicheng, et al.
Published: (2024)

AI-Generated Video Detection via Spatio-Temporal Anomaly Learning
by: Bai, Jianfa, et al.
Published: (2024)

DIFFUMA: High-Fidelity Spatio-Temporal Video Prediction via Dual-Path Mamba and Diffusion Enhancement
by: Xie, Xinyu, et al.
Published: (2025)

Mining Multi-Modality Spatio-Temporal Cues for Video Important Person Identification
by: Wang, Xiao, et al.
Published: (2026)

Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)

Progressive Autoregressive Video Diffusion Models
by: Xie, Desai, et al.
Published: (2024)

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)

TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes
by: Zhou, Xingcheng, et al.
Published: (2025)

STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits
by: Papantoniou, Foivos Paraperas, et al.
Published: (2025)

A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI
by: Pérez-Toro, Paula Andrea, et al.
Published: (2025)

SpatioTemporal Learning for Human Pose Estimation in Sparsely-Labeled Videos
by: Jiao, Yingying, et al.
Published: (2025)

SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention
by: Meeran, Muhammad Nawfal, et al.
Published: (2024)

Towards Long-Form Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2026)

VISTA: Video Interaction Spatio-Temporal Analysis Benchmark
by: Aparcedo, Alejandro, et al.
Published: (2026)