:: Library Catalog

Bibliographic Details
Main Authors:	Dou, Weijia; Zheng, Wenzhao; Chen, Weiliang; Zheng, Yu; Zhou, Jie; Lu, Jiwen
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.19048

Similar Items

GenWorld: Towards Detecting AI-generated Real-world Simulation Videos
by: Chen, Weiliang, et al.
Published: (2025)

Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
by: Wu, Yuqi, et al.
Published: (2025)

SpectralAR: Spectral Autoregressive Visual Generation
by: Huang, Yuanhui, et al.
Published: (2025)

Moaw: Unleashing Motion Awareness for Video Diffusion Models
by: Zhang, Tianqi, et al.
Published: (2026)

Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
by: Zhang, Yanran, et al.
Published: (2025)

Terra: Explorable Native 3D World Model with Point Latents
by: Huang, Yuanhui, et al.
Published: (2025)

Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning
by: Li, Yifei, et al.
Published: (2025)

GlobalMamba: Global Image Serialization for Vision Mamba
by: Wang, Chengkun, et al.
Published: (2024)

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
by: Huang, Yuanhui, et al.
Published: (2024)

D³QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection
by: Zhang, Yanran, et al.
Published: (2025)

Owl-1: Omni World Model for Consistent Long Video Generation
by: Huang, Yuanhui, et al.
Published: (2024)

UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection
by: Zhang, Yanran, et al.
Published: (2026)

Path Choice Matters for Clear Attribution in Path Methods
by: Zhang, Borui, et al.
Published: (2024)

Preventing Local Pitfalls in Vector Quantization via Optimal Transport
by: Zhang, Borui, et al.
Published: (2024)

GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction
by: Zuo, Sicheng, et al.
Published: (2024)

NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization
by: Li, Yifei, et al.
Published: (2025)

SceneCompleter: Dense 3D Scene Completion for Generative Novel View Synthesis
by: Chen, Weiliang, et al.
Published: (2025)

OGGSplat: Open Gaussian Growing for Generalizable Reconstruction with Expanded Field-of-View
by: Wang, Yanbo, et al.
Published: (2025)

V2M: Visual 2-Dimensional Mamba for Image Representation Learning
by: Wang, Chengkun, et al.
Published: (2024)

R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation
by: Xu, Xiuwei, et al.
Published: (2025)

DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding
by: Zhuo, Dong, et al.
Published: (2026)

SFTok: Bridging the Performance Gap in Discrete Tokenizers
by: Rao, Qihang, et al.
Published: (2025)

Quantize-then-Rectify: Efficient VQ-VAE Training
by: Zhang, Borui, et al.
Published: (2025)

Fast Shapley Value Estimation: A Unified Approach
by: Zhang, Borui, et al.
Published: (2023)

Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection
by: Zeng, Shuai, et al.
Published: (2024)

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
by: Wu, Yuqi, et al.
Published: (2024)

QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction
by: Zuo, Sicheng, et al.
Published: (2025)

Streaming 4D Visual Geometry Transformer
by: Zhuo, Dong, et al.
Published: (2025)

Vega: Learning to Drive with Natural Language Instructions
by: Zuo, Sicheng, et al.
Published: (2026)

3D Small Object Detection with Dynamic Spatial Pruning
by: Xu, Xiuwei, et al.
Published: (2023)

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
by: Wang, Lening, et al.
Published: (2024)

Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model
by: Wang, Lening, et al.
Published: (2024)

GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting
by: Dong, Jiajun, et al.
Published: (2025)

LiDAR-HMR: 3D Human Mesh Recovery from LiDAR
by: Fan, Bohao, et al.
Published: (2023)

Doe-1: Closed-Loop Autonomous Driving with Large World Model
by: Zheng, Wenzhao, et al.
Published: (2024)

GaussianFormer-2: Probabilistic Gaussian Superposition for Efficient 3D Occupancy Prediction
by: Huang, Yuanhui, et al.
Published: (2024)

Learning Counterfactually Decoupled Attention for Open-World Model Attribution
by: Zheng, Yu, et al.
Published: (2025)

Astra: General Interactive World Model with Autoregressive Denoising
by: Zhu, Yixuan, et al.
Published: (2025)

OPONeRF: One-Point-One NeRF for Robust Neural Rendering
by: Zheng, Yu, et al.
Published: (2024)

DreamCinema: Cinematic Transfer with Free Camera and 3D Character
by: Chen, Weiliang, et al.
Published: (2024)