:: Library Catalog

Bibliographic Details
Main Authors:	He, Haoran; Zhang, Yang; Lin, Liang; Xu, Zhongwen; Pan, Ling
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.07825

Similar Items

Scaling Image and Video Generation via Test-Time Evolutionary Search
by: He, Haoran, et al.
Published: (2025)

Rethinking Video Generation Model for the Embodied World
by: Deng, Yufan, et al.
Published: (2026)

Luminark: Training-free, Probabilistically-Certified Watermarking for General Vision Generative Models
by: Xu, Jiayi, et al.
Published: (2026)

WorldModelBench: Judging Video Generation Models As World Models
by: Li, Dacheng, et al.
Published: (2025)

Physical Simulator In-the-Loop Video Generation
by: Foo, Lin Geng, et al.
Published: (2026)

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
by: Seawead, Team, et al.
Published: (2025)

How Far is Video Generation from World Model: A Physical Law Perspective
by: Kang, Bingyi, et al.
Published: (2024)

S2DM: Sector-Shaped Diffusion Models for Video Generation
by: Lang, Haoran, et al.
Published: (2024)

Medical Video Generation for Disease Progression Simulation
by: Cao, Xu, et al.
Published: (2024)

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
by: Shi, Fengyuan, et al.
Published: (2023)

ShareVerse: Multi-Agent Consistent Video Generation for Shared World Modeling
by: Zhu, Jiayi, et al.
Published: (2026)

EditWorld: Simulating World Dynamics for Instruction-Following Image Editing
by: Yang, Ling, et al.
Published: (2024)

Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
by: Inferix Team, et al.
Published: (2025)

Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey
by: Fu, Ao, et al.
Published: (2024)

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
by: Wang, Ye, et al.
Published: (2025)

RISE-Video: Can Video Generators Decode Implicit World Rules?
by: Liu, Mingxin, et al.
Published: (2026)

A Mechanistic View on Video Generation as World Models: State and Dynamics
by: Wang, Luozhou, et al.
Published: (2026)

World Simulation with Video Foundation Models for Physical AI
by: NVIDIA, et al.
Published: (2025)

HARIVO: Harnessing Text-to-Image Models for Video Generation
by: Kwon, Mingi, et al.
Published: (2024)

Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild
by: Wen, Zhuofan, et al.
Published: (2024)

VRAG: Learning World Models for Interactive Video Generation
by: Chen, Taiye, et al.
Published: (2025)

Refining Pre-Trained Motion Models
by: Sun, Xinglong, et al.
Published: (2024)

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories
by: Wang, Zun, et al.
Published: (2026)

Pre-Trained Vision-Language Models as Partial Annotators
by: Wang, Qian-Wei, et al.
Published: (2024)

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
by: He, Xuan, et al.
Published: (2024)

HQA-VLAttack: Towards High Quality Adversarial Attack on Vision-Language Pre-Trained Models
by: Liu, Han, et al.
Published: (2026)

Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
by: Li, Changlin, et al.
Published: (2024)

Pre-Trained CNN Architecture for Transformer-Based Image Caption Generation Model
by: Dufera, Amanuel Tafese
Published: (2025)

GPT-NAS: Evolutionary Neural Architecture Search with the Generative Pre-Trained Model
by: Yu, Caiyang, et al.
Published: (2023)

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
by: Lin, Shanchuan, et al.
Published: (2025)

Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)

Open-Sora Plan: Open-Source Large Video Generation Model
by: Lin, Bin, et al.
Published: (2024)

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
by: Li, Yifei, et al.
Published: (2025)

Efficient Training for Human Video Generation with Entropy-Guided Prioritized Progressive Learning
by: Li, Changlin, et al.
Published: (2025)

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
by: Fan, Jiawei, et al.
Published: (2026)

EgoSim: Egocentric World Simulator for Embodied Interaction Generation
by: Hao, Jinkun, et al.
Published: (2026)

D3: Training-Free AI-Generated Video Detection Using Second-Order Features
by: Zheng, Chende, et al.
Published: (2025)

Simulating the Real World: A Unified Survey of Multimodal Generative Models
by: Hu, Yuqi, et al.
Published: (2025)

Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
by: Wu, Haoyu, et al.
Published: (2025)

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
by: Fang, Ye, et al.
Published: (2025)