| Main Authors: | Xu, Xunnong, Cao, Mengying |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.09828 |
Similar Items
Causal Deciphering and Inpainting in Spatio-Temporal Dynamics via Diffusion Model
by: Duan, Yifan, et al.
Published: (2024)
Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers
by: Xie, Jinxia, et al.
Published: (2024)
Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention
by: Samuel, Dvir, et al.
Published: (2026)
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
by: Jindal, Swati, et al.
Published: (2024)
Spatio-Temporal Attention for Consistent Video Semantic Segmentation in Automated Driving
by: Varghese, Serin, et al.
Published: (2026)
Multi-View Video Diffusion Policy: A 3D Spatio-Temporal-Aware Video Action Model
by: Li, Peiyan, et al.
Published: (2026)
Light Forcing: Accelerating Autoregressive Video Diffusion via Sparse Attention
by: Lv, Chengtao, et al.
Published: (2026)
Compact Attention: Exploiting Structured Spatio-Temporal Sparsity for Fast Video Generation
by: Li, Qirui, et al.
Published: (2025)
VideoFusion: A Spatio-Temporal Collaborative Network for Multi-modal Video Fusion
by: Tang, Linfeng, et al.
Published: (2025)
Resolving Spatio-Temporal Entanglement in Video Prediction via Multi-Modal Attention
by: Gupta, Shreyam, et al.
Published: (2025)
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video
by: Li, Lingen, et al.
Published: (2026)
DVFace: Spatio-Temporal Dual-Prior Diffusion for Video Face Restoration
by: Chen, Zheng, et al.
Published: (2026)
Autoregressive Video Autoencoder with Decoupled Temporal and Spatial Context
by: Shen, Cuifeng, et al.
Published: (2025)
Deepfake Detection with Spatio-Temporal Consistency and Attention
by: Chen, Yunzhuo, et al.
Published: (2025)
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression
by: Zhen, Dingcheng, et al.
Published: (2025)
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale
by: Liu, Haozhe, et al.
Published: (2024)
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
by: Meng, Yihao, et al.
Published: (2026)
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
by: Xu, Dejia, et al.
Published: (2024)
Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection
by: Shen, Hao, et al.
Published: (2024)
CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering
by: Zhang, Mingfang, et al.
Published: (2026)
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
by: Gao, Kaifeng, et al.
Published: (2024)
STCDiT: Spatio-Temporally Consistent Diffusion Transformer for High-Quality Video Super-Resolution
by: Chen, Junyang, et al.
Published: (2025)
Context-Guided Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2024)
Causal Motion Diffusion Models for Autoregressive Motion Generation
by: Yu, Qing, et al.
Published: (2026)
Spatio-Temporal Garment Reconstruction Using Diffusion Mapping via Pattern Coordinates
by: You, Yingxuan, et al.
Published: (2026)
Video-Language Alignment via Spatio-Temporal Graph Transformer
by: Zhang, Shi-Xue, et al.
Published: (2024)
SpotFormer: Multi-Scale Spatio-Temporal Transformer for Facial Expression Spotting
by: Deng, Yicheng, et al.
Published: (2024)
AI-Generated Video Detection via Spatio-Temporal Anomaly Learning
by: Bai, Jianfa, et al.
Published: (2024)
DIFFUMA: High-Fidelity Spatio-Temporal Video Prediction via Dual-Path Mamba and Diffusion Enhancement
by: Xie, Xinyu, et al.
Published: (2025)
Mining Multi-Modality Spatio-Temporal Cues for Video Important Person Identification
by: Wang, Xiao, et al.
Published: (2026)
Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation
by: Zhao, Min, et al.
Published: (2026)
Progressive Autoregressive Video Diffusion Models
by: Xie, Desai, et al.
Published: (2024)
Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)
TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes
by: Zhou, Xingcheng, et al.
Published: (2025)
STARCaster: Spatio-Temporal AutoRegressive Video Diffusion for Identity- and View-Aware Talking Portraits
by: Papantoniou, Foivos Paraperas, et al.
Published: (2025)
A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI
by: Pérez-Toro, Paula Andrea, et al.
Published: (2025)
SpatioTemporal Learning for Human Pose Estimation in Sparsely-Labeled Videos
by: Jiao, Yingying, et al.
Published: (2025)
SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention
by: Meeran, Muhammad Nawfal, et al.
Published: (2024)
Towards Long-Form Spatio-Temporal Video Grounding
by: Gu, Xin, et al.
Published: (2026)
VISTA: Video Interaction Spatio-Temporal Analysis Benchmark
by: Aparcedo, Alejandro, et al.
Published: (2026)