| Main Authors: | Sun, Huiqiang, Shen, Liao, Peng, Zhan, Wang, Kun, Wu, Size, Zang, Yuhang, Liu, Tianqi, Huang, Zihao, Zeng, Xingyu, Cao, Zhiguo, Li, Wei, Loy, Chen Change |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.12921 |
Similar Items
DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting
by: Shen, Liao, et al.
Published: (2025)
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
by: Liu, Tianqi, et al.
Published: (2025)
MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction
by: Lou, Yaopeng, et al.
Published: (2025)
3D Multi-frame Fusion for Video Stabilization
by: Peng, Zhan, et al.
Published: (2024)
Controllable Human-centric Keyframe Interpolation with Generative Prior
by: Guo, Zujin, et al.
Published: (2025)
Dynamic Neural Radiance Field From Defocused Monocular Video
by: Luo, Xianrui, et al.
Published: (2024)
Dynamic View Synthesis from Small Camera Motion Videos
by: Sun, Huiqiang, et al.
Published: (2025)
DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion
by: Shen, Liao, et al.
Published: (2024)
Contextual Object Detection with Multimodal Large Language Models
by: Zang, Yuhang, et al.
Published: (2023)
DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video
by: Sun, Huiqiang, et al.
Published: (2024)
Zero-Shot Video Translation and Editing with Frame Spatial-Temporal Correspondence
by: Yang, Shuai, et al.
Published: (2025)
MatAnyone: Stable Video Matting with Consistent Memory Propagation
by: Yang, Peiqing, et al.
Published: (2025)
BokehFlow: Depth-Free Controllable Bokeh Rendering via Flow Matching
by: Huang, Yachuan, et al.
Published: (2025)
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection
by: Xu, Shilin, et al.
Published: (2023)
MVIP-NeRF: Multi-view 3D Inpainting on NeRF Scenes via Diffusion Prior
by: Chen, Honghua, et al.
Published: (2024)
Arbitrary-steps Image Super-resolution via Diffusion Inversion
by: Yue, Zongsheng, et al.
Published: (2024)
Generalizable Implicit Motion Modeling for Video Frame Interpolation
by: Guo, Zujin, et al.
Published: (2024)
Kalman-Inspired Feature Propagation for Video Face Super-Resolution
by: Feng, Ruicheng, et al.
Published: (2024)
WildAvatar: Learning In-the-wild 3D Avatars from the Web
by: Huang, Zihao, et al.
Published: (2024)
SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control
by: Zhang, Zhida, et al.
Published: (2026)
DifFace: Blind Face Restoration with Diffused Error Contraction
by: Yue, Zongsheng, et al.
Published: (2022)
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
by: Wu, Size, et al.
Published: (2023)
F-LMM: Grounding Frozen Large Multimodal Models
by: Wu, Size, et al.
Published: (2024)
MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
by: Liu, Tianqi, et al.
Published: (2024)
Exposure Completing for Temporally Consistent Neural High Dynamic Range Video Rendering
by: Cui, Jiahao, et al.
Published: (2024)
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation
by: Liao, Kang, et al.
Published: (2025)
Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
by: Liao, Kang, et al.
Published: (2024)
MOWA: Multiple-in-One Image Warping Model
by: Liao, Kang, et al.
Published: (2024)
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation
by: Yang, Shuai, et al.
Published: (2024)
3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement
by: Luo, Yihang, et al.
Published: (2024)
Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields
by: Liu, Tianqi, et al.
Published: (2024)
CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition
by: Phung, Quynh, et al.
Published: (2025)
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation
by: Wu, Size, et al.
Published: (2025)
ObjCtrl-2.5D: Training-free Object Control with Camera Poses
by: Wang, Zhouxia, et al.
Published: (2024)
Enhanced Generative Structure Prior for Chinese Text Image Super-resolution
by: Li, Xiaoming, et al.
Published: (2025)
Learning 3D Garment Animation from Trajectories of A Piece of Cloth
by: Shao, Yidi, et al.
Published: (2025)
Efficient Diffusion Model for Image Restoration by Residual Shifting
by: Yue, Zongsheng, et al.
Published: (2024)
AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation
by: Hou, Xinyu, et al.
Published: (2024)
Dual-Camera All-in-Focus Neural Radiance Fields
by: Luo, Xianrui, et al.
Published: (2025)
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
by: Wu, Size, et al.
Published: (2025)