| Main Authors: | Li, Chengxuan, Huang, Di, Lu, Zeyu, Xiao, Yang, Pei, Qingqi, Bai, Lei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.16407 |
Similar Items
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights
by: Lei, Wentao, et al.
Published: (2024)
A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects
by: Xie, Guohuan, et al.
Published: (2025)
Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation
by: Yang, Huan, et al.
Published: (2024)
LongLive: Real-time Interactive Long Video Generation
by: Yang, Shuai, et al.
Published: (2025)
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
by: Lu, Zhihe, et al.
Published: (2023)
Helios: Real Real-Time Long Video Generation Model
by: Yuan, Shenghai, et al.
Published: (2026)
Pura: An Efficient Privacy-Preserving Solution for Face Recognition
by: Xu, Guotao, et al.
Published: (2025)
VideoAuteur: Towards Long Narrative Video Generation
by: Xiao, Junfei, et al.
Published: (2025)
TokensGen: Harnessing Condensed Tokens for Long Video Generation
by: Ouyang, Wenqi, et al.
Published: (2025)
IPMix: Label-Preserving Data Augmentation Method for Training Robust Classifiers
by: Huang, Zhenglin, et al.
Published: (2023)
FiT: Flexible Vision Transformer for Diffusion Model
by: Lu, Zeyu, et al.
Published: (2024)
Long Context Tuning for Video Generation
by: Guo, Yuwei, et al.
Published: (2025)
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
by: Xue, Xiangyuan, et al.
Published: (2024)
A Survey of AI-Generated Video Evaluation
by: Liu, Xiao, et al.
Published: (2024)
A Survey on Visual Anomaly Detection: Challenge, Approach, and Prospect
by: Cao, Yunkang, et al.
Published: (2024)
FreeLong++: Training-Free Long Video Generation via Multi-band SpectralFusion
by: Lu, Yu, et al.
Published: (2025)
A Survey: Spatiotemporal Consistency in Video Generation
by: Yin, Zhiyu, et al.
Published: (2025)
E-VRAG: Enhancing Long Video Understanding with Resource-Efficient Retrieval Augmented Generation
by: Xu, Zeyu, et al.
Published: (2025)
A Survey of Interactive Generative Video
by: Yu, Jiwen, et al.
Published: (2025)
Controllable Video Generation: A Survey
by: Ma, Yue, et al.
Published: (2025)
Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion
by: Ma, Yongjia, et al.
Published: (2025)
Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval
by: Yu, Jiwen, et al.
Published: (2025)
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality
by: Elmoghany, Mohamed, et al.
Published: (2025)
GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?
by: Zou, Yueying, et al.
Published: (2026)
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results
by: Ding, Henghui, et al.
Published: (2024)
FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention
by: Lu, Yu, et al.
Published: (2024)
Addressing the ID-Matching Challenge in Long Video Captioning
by: Yang, Zhantao, et al.
Published: (2025)
Baseline Method of the Foundation Model Challenge for Ultrasound Image Analysis
by: Deng, Bo, et al.
Published: (2026)
EgoLCD: Egocentric Video Generation with Long Context Diffusion
by: Zhang, Liuzhou, et al.
Published: (2025)
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
by: Lu, Jiachen, et al.
Published: (2023)
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
by: Xia, Yifei, et al.
Published: (2025)
Versatile Transition Generation with Image-to-Video Diffusion
by: Yang, Zuhao, et al.
Published: (2025)
SurgLQA: Scalable Long-Horizon Surgical Video Question Answering
by: Guo, Diandian, et al.
Published: (2026)
NTIRE 2026 Challenge on Short-form UGC Video Restoration in the Wild with Generative Models: Datasets, Methods and Results
by: Li, Xin, et al.
Published: (2026)
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
by: Wu, Weijia, et al.
Published: (2024)
Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation
by: Li, Ruibin, et al.
Published: (2026)
Remote Sensing Image Dehazing: A Systematic Review of Progress, Challenges, and Prospects
by: Zhou, Heng, et al.
Published: (2026)
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
by: Wang, ZiDong, et al.
Published: (2024)
Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video
by: Pan, Zijie, et al.
Published: (2024)
On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection
by: Song, Xiufeng, et al.
Published: (2024)