| Main Authors: | Chen, Shengfu, Liu, Hailong, Wei, Wenzhao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.07194 |
Similar Items
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
by: Wang, Lening, et al.
Published: (2024)
FFA Sora, video generation as fundus fluorescein angiography simulator
by: Wu, Xinyuan, et al.
Published: (2024)
RobustSora: De-Watermarked Benchmark for Robust AI-Generated Video Detection
by: Wang, Zhuo, et al.
Published: (2025)
Open-Sora Plan: Open-Source Large Video Generation Model
by: Lin, Bin, et al.
Published: (2024)
From Sora What We Can See: A Survey of Text-to-Video Generation
by: Sun, Rui, et al.
Published: (2024)
Ovis-Image Technical Report
by: Wang, Guo-Hua, et al.
Published: (2025)
Qwen3-VL Technical Report
by: Bai, Shuai, et al.
Published: (2025)
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
by: Liu, Yixin, et al.
Published: (2024)
Ovis-U1 Technical Report
by: Wang, Guo-Hua, et al.
Published: (2025)
Baichuan-Omni Technical Report
by: Li, Yadong, et al.
Published: (2024)
Seed1.5-VL Technical Report
by: Guo, Dong, et al.
Published: (2025)
SurgSora: Object-Aware Diffusion Model for Controllable Surgical Video Generation
by: Chen, Tong, et al.
Published: (2024)
HunyuanOCR Technical Report
by: Hunyuan Vision Team, et al.
Published: (2025)
Dolphin v1.0 Technical Report
by: Weng, Taohan, et al.
Published: (2025)
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset
by: Dai, Josef, et al.
Published: (2024)
GR-3 Technical Report
by: Cheang, Chilam, et al.
Published: (2025)
Technical Approach for the EMI Challenge in the 8th Affective Behavior Analysis in-the-Wild Competition
by: Yu, Jun, et al.
Published: (2025)
MASSeg : 2nd Technical Report for 4th PVUW MOSE Track
by: Cao, Xuqiang, et al.
Published: (2025)
AEMIM: Adversarial Examples Meet Masked Image Modeling
by: Xiang, Wenzhao, et al.
Published: (2024)
MedGemma Technical Report
by: Sellergren, Andrew, et al.
Published: (2025)
WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs
by: Yang, Deshun, et al.
Published: (2024)
Ovis2.5 Technical Report
by: Lu, Shiyin, et al.
Published: (2025)
MARS: Technical Report for the CASTLE Challenge at EgoVis 2026
by: Zhang, Haoyu, et al.
Published: (2026)
Motif-Video 2B: Technical Report
by: Lim, Junghwan, et al.
Published: (2026)
MonkeyOCR v1.5 Technical Report: Unlocking Robust Document Parsing for Complex Patterns
by: Zhang, Jiarui, et al.
Published: (2025)
ZAYA1-VL-8B Technical Report
by: Shapourian, Hassan, et al.
Published: (2026)
PLaMo 2.1-VL Technical Report
by: Kerola, Tommi, et al.
Published: (2026)
AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model
by: Jin, Zhiwei, et al.
Published: (2025)
Gender Bias in Text-to-Video Generation Models: A case study of Sora
by: Nadeem, Mohammad, et al.
Published: (2024)
Sora as a World Model? A Complete Survey on Text-to-Video Generation
by: Puspitasari, Fachrina Dewi, et al.
Published: (2024)
PhysBrain 1.0 Technical Report
by: Lian, Shijie, et al.
Published: (2026)
Phi-4-reasoning-vision-15B Technical Report
by: Aneja, Jyoti, et al.
Published: (2026)
Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025
by: Chu, Qiaohui, et al.
Published: (2025)
iFlyBot-VLA Technical Report
by: Zhang, Yuan, et al.
Published: (2025)
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting
by: Dong, Jiajun, et al.
Published: (2025)
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation
by: Chu, Zedong, et al.
Published: (2026)
Phoenix-VL 1.5 Medium Technical Report
by: Phoenix, Team, et al.
Published: (2026)
Pegasus-v1 Technical Report
by: Jung, Raehyuk, et al.
Published: (2024)
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
by: Huang, Yuanhui, et al.
Published: (2024)
HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning
by: Qiu, Wenzhao, et al.
Published: (2024)