:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Kunhao; Shao, Ling; Lu, Shijian
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.14208

Similar Items

Rolling Forcing: Autoregressive Long Video Diffusion in Real Time
by: Liu, Kunhao, et al.
Published: (2025)

StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting
by: Liu, Kunhao, et al.
Published: (2024)

L3DR: 3D-aware LiDAR Diffusion and Rectification
by: Liu, Quan, et al.
Published: (2026)

MuSASplat: Efficient Sparse-View 3D Gaussian Splats via Lightweight Multi-Scale Adaptation
by: Xu, Muyu, et al.
Published: (2025)

OrbitNVS: Harnessing Video Diffusion Priors for Novel View Synthesis
by: Liang, Jinglin, et al.
Published: (2026)

DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception
by: Jiang, Kai, et al.
Published: (2024)

UrbanCraft: Urban View Extrapolation via Hierarchical Sem-Geometric Priors
by: Wang, Tianhang, et al.
Published: (2025)

A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models
by: Li, Duo, et al.
Published: (2025)

Multimodal 3D Reasoning Segmentation with Complex Scenes
by: Jiang, Xueying, et al.
Published: (2024)

DivAvatar: Diverse 3D Avatar Generation with a Single Prompt
by: Tao, Weijing, et al.
Published: (2024)

A Survey of Label-Efficient Deep Learning for 3D Point Clouds
by: Xiao, Aoran, et al.
Published: (2023)

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting
by: Zhang, Jiahui, et al.
Published: (2025)

STS-Mixer: Spatio-Temporal-Spectral Mixer for 4D Point Cloud Video Understanding
by: Li, Wenhao, et al.
Published: (2026)

UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers
by: Zhao, Min, et al.
Published: (2025)

Versatile Transition Generation with Image-to-Video Diffusion
by: Yang, Zuhao, et al.
Published: (2025)

Diffusion Priors for Dynamic View Synthesis from Monocular Videos
by: Wang, Chaoyang, et al.
Published: (2024)

MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders
by: Jiang, Xueying, et al.
Published: (2024)

PCR-GS: COLMAP-Free 3D Gaussian Splatting via Pose Co-Regularizations
by: Wei, Yu, et al.
Published: (2025)

ToDRE: Effective Visual Token Pruning via Token Diversity and Task Relevance
by: Li, Duo, et al.
Published: (2025)

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors
by: Hwang, Sungwon, et al.
Published: (2024)

Direction-aware 3D Large Multimodal Models
by: Liu, Quan, et al.
Published: (2026)

Spatial Preference Rewarding for MLLMs Spatial Understanding
by: Qiu, Han, et al.
Published: (2025)

Exploring 3D Reasoning-Driven Planning: From Implicit Human Intentions to Route-Aware Activity Planning
by: Jiang, Xueying, et al.
Published: (2025)

Drive-1-to-3: Enriching Diffusion Priors for Novel View Synthesis of Real Vehicles
by: Lin, Chuang, et al.
Published: (2024)

NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer
by: You, Meng, et al.
Published: (2024)

ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
by: Yu, Wangbo, et al.
Published: (2024)

Learning Temporally Consistent Video Depth from Video Diffusion Priors
by: Shao, Jiahao, et al.
Published: (2024)

Weakly Supervised 3D Open-vocabulary Segmentation
by: Liu, Kunhao, et al.
Published: (2023)

Weakly Supervised Monocular 3D Detection with a Single-View Image
by: Jiang, Xueying, et al.
Published: (2024)

Next-Frame Decoding for Ultra-Low-Bitrate Image Compression with Video Diffusion Priors
by: Chen, Yunuo, et al.
Published: (2026)

Rewrite Caption Semantics: Bridging Semantic Gaps for Language-Supervised Semantic Segmentation
by: Xing, Yun, et al.
Published: (2023)

Uni-Classifier: Leveraging Video Diffusion Priors for Universal Guidance Classifier
by: Zhou, Yujie, et al.
Published: (2026)

FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation
by: Teng, Wenbin, et al.
Published: (2025)

GeoNVS: Geometry Grounded Video Diffusion for Novel View Synthesis
by: Kang, Minjun, et al.
Published: (2026)

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers
by: Zhao, Min, et al.
Published: (2025)

Historical Test-time Prompt Tuning for Vision Foundation Models
by: Zhang, Jingyi, et al.
Published: (2024)

ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models
by: Shih, Meng-Li, et al.
Published: (2024)

Coding-Prior Guided Diffusion Network for Video Deblurring
by: Liu, Yike, et al.
Published: (2025)

How to Use Diffusion Priors under Sparse Views?
by: Wang, Qisen, et al.
Published: (2024)

TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding
by: Yang, Zuhao, et al.
Published: (2025)