| Main Authors: | Wang, Qisen, Zhao, Yifan, Shen, Peisen, Li, Jialu, Li, Jia |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.01481 |
Similar Items
WorldTree: Towards 4D Dynamic Worlds from Monocular Video using Tree-Chains
by: Wang, Qisen, et al.
Published: (2026)
How to Use Diffusion Priors under Sparse Views?
by: Wang, Qisen, et al.
Published: (2024)
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
by: Wu, Jay Zhangjie, et al.
Published: (2025)
DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution
by: Duan, Zheng-Peng, et al.
Published: (2025)
Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models
by: Li, Qifan, et al.
Published: (2026)
DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model
by: Gu, Songen, et al.
Published: (2024)
Taming Real-World Space-Time Video Super-Resolution with One-Step Diffusion
by: Wei, Shuoyan, et al.
Published: (2026)
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation
by: Zhou, Haiyang, et al.
Published: (2025)
ControlSR: Taming Diffusion Models for Consistent Real-World Image Super Resolution
by: Wan, Yuhao, et al.
Published: (2024)
Gradient-Free Classifier Guidance for Diffusion Model Sampling
by: Shenoy, Rahul, et al.
Published: (2024)
DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
by: Song, Ziyang, et al.
Published: (2025)
OneWorld: Taming Scene Generation with 3D Unified Representation Autoencoder
by: Gao, Sensen, et al.
Published: (2026)
Diffusion-Classifier Synergy: Reward-Aligned Learning via Mutual Boosting Loop for FSCIL
by: Wu, Ruitao, et al.
Published: (2025)
DiP: Taming Diffusion Models in Pixel Space
by: Chen, Zhennan, et al.
Published: (2025)
SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion
by: Li, Jialu, et al.
Published: (2026)
UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding
by: Shi, Bowen, et al.
Published: (2024)
ChronoTailor: Harnessing Attention Guidance for Fine-Grained Video Virtual Try-On
by: Wang, Jinjuan, et al.
Published: (2025)
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
by: Feng, Yutong, et al.
Published: (2023)
ArtHOI: Taming Foundation Models for Monocular 4D Reconstruction of Hand-Articulated-Object Interactions
by: Wang, Zikai, et al.
Published: (2026)
Taming Diffusion Models for Image Restoration: A Review
by: Luo, Ziwei, et al.
Published: (2024)
OCCDiff: Occupancy Diffusion Model for High-Fidelity 3D Building Reconstruction from Noisy Point Clouds
by: Sui, Jialu, et al.
Published: (2025)
Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs
by: Cui, Qinpeng, et al.
Published: (2024)
VSDiffusion: Taming Ill-Posed Shadow Generation via Visibility-Constrained Diffusion
by: Li, Jing, et al.
Published: (2026)
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
by: Zhou, Yang, et al.
Published: (2025)
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
by: Bahmani, Sherwin, et al.
Published: (2024)
Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation
by: Meng, Xuyi, et al.
Published: (2025)
Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs
by: Zhong, Yingji, et al.
Published: (2025)
Tango: Taming Visual Signals for Efficient Video Large Language Models
by: Yin, Shukang, et al.
Published: (2026)
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
by: Wang, Qinghe, et al.
Published: (2024)
Taming Video Models for 3D and 4D Generation via Zero-Shot Camera Control
by: Song, Chenxi, et al.
Published: (2025)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
by: Jin, Yudong, et al.
Published: (2025)
MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction
by: Zou, Yingshuang, et al.
Published: (2025)
Taming Diffusion for Dataset Distillation with High Representativeness
by: Zhao, Lin, et al.
Published: (2025)
HABIT: Chrono-Synergia Robust Progressive Learning Framework for Composed Image Retrieval
by: Li, Zixu, et al.
Published: (2026)
Learning Yourself: Class-Incremental Semantic Segmentation with Language-Inspired Bootstrapped Disentanglement
by: Wu, Ruitao, et al.
Published: (2025)
Parsing Objects at a Finer Granularity: A Survey
by: Zhao, Yifan, et al.
Published: (2022)
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
by: Yu, Wangbo, et al.
Published: (2024)
Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning
by: Chen, Chubin, et al.
Published: (2025)
You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs
by: Luo, Yihong, et al.
Published: (2024)
Free Lunch to Meet the Gap: Intermediate Domain Reconstruction for Cross-Domain Few-Shot Learning
by: Zhang, Tong, et al.
Published: (2025)