| Main Authors: | Song, Selena, Xu, Ziming, Zhang, Zijun, Zhou, Kun, Guo, Jiaxian, Qin, Lianhui, Huang, Biwei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.19229 |
Similar Items
LPNSR: Optimal Noise-Guided Diffusion Image Super-Resolution Via Learnable Noise Prediction
by: Huang, Shuwei, et al.
Published: (2026)
Small Drafts, Big Verdict: Information-Intensive Visual Reasoning via Speculation
by: Liu, Yuhan, et al.
Published: (2025)
Plug-and-play Diffusion Models for Image Compressive Sensing with Data Consistency Projection
by: Wang, Xiaodong, et al.
Published: (2025)
HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models
by: Xie, Xin, et al.
Published: (2026)
Enabling Versatile Controls for Video Diffusion Models
by: Zhang, Xu, et al.
Published: (2025)
HiGS: History-Guided Sampling for Plug-and-Play Enhancement of Diffusion Models
by: Sadat, Seyedmorteza, et al.
Published: (2025)
Auto-scaling Continuous Memory for GUI Agent
by: Wu, Wenyi, et al.
Published: (2025)
LLMC+: Benchmarking Vision-Language Model Compression with a Plug-and-play Toolkit
by: Lv, Chengtao, et al.
Published: (2025)
Do Diffusion Models Learn Semantically Meaningful and Efficient Representations?
by: Liang, Qiyao, et al.
Published: (2024)
GVD: Guiding Video Diffusion Model for Scalable Video Distillation
by: Li, Kunyang, et al.
Published: (2025)
How Diffusion Models Learn to Factorize and Compose
by: Liang, Qiyao, et al.
Published: (2024)
TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models
by: Li, Pengxiang, et al.
Published: (2023)
A Dataset Generation Scheme Based on Video2EEG-SPGN-Diffusion for SEED-VD
by: Guo, Yunfei, et al.
Published: (2025)
Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models
by: Yang, Qinyu, et al.
Published: (2024)
HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization
by: Zhou, Guanglin, et al.
Published: (2024)
Scaling the Long Video Understanding of Multimodal Large Language Models via Visual Memory Mechanism
by: Chen, Tao, et al.
Published: (2026)
Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model
by: Yin, Junhui, et al.
Published: (2026)
SVGDreamer: Text Guided SVG Generation with Diffusion Model
by: Xing, Ximing, et al.
Published: (2023)
VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control
by: Bian, Yuxuan, et al.
Published: (2025)
Timeline and Boundary Guided Diffusion Network for Video Shadow Detection
by: Zhou, Haipeng, et al.
Published: (2024)
TS-P$^2$CL: Plug-and-Play Dual Contrastive Learning for Vision-Guided Medical Time Series Classification
by: Xu, Qi'ao, et al.
Published: (2025)
Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)
Memory-V2V: Memory-Augmented Video-to-Video Diffusion for Consistent Multi-Turn Editing
by: Lee, Dohun, et al.
Published: (2026)
Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction
by: Zhou, Zihan, et al.
Published: (2025)
UnGuide: Learning to Forget with LoRA-Guided Diffusion Models
by: Polowczyk, Agnieszka, et al.
Published: (2025)
Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning
by: Zhou, Guanglin, et al.
Published: (2024)
DreamSAC: Learning Hamiltonian World Models via Symmetry Exploration
by: Tang, Jinzhou, et al.
Published: (2026)
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide
by: Lee, Dohun, et al.
Published: (2024)
PADS: Plug-and-Play 3D Human Pose Analysis via Diffusion Generative Modeling
by: Ji, Haorui, et al.
Published: (2024)
Fast and Memory-Efficient Video Diffusion Using Streamlined Inference
by: Zhan, Zheng, et al.
Published: (2024)
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
by: Zheng, Longtao, et al.
Published: (2024)
Learning Spatiotemporal Sensitivity in Video LLMs via Counterfactual Reinforcement Learning
by: Du, Dazhao, et al.
Published: (2026)
Coherent Video Inpainting Using Optical Flow-Guided Efficient Diffusion
by: Gu, Bohai, et al.
Published: (2024)
DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
by: Xing, Ximing, et al.
Published: (2023)
FAIRT2V: Training-Free Debiasing for Text-to-Video Diffusion Models
by: Zhong, Haonan, et al.
Published: (2026)
VideoEraser: Concept Erasure in Text-to-Video Diffusion Models
by: Xu, Naen, et al.
Published: (2025)
Enhancing targeted transferability via feature space fine-tuning
by: Zeng, Hui, et al.
Published: (2024)
Pack and Force Your Memory: Long-form and Consistent Video Generation
by: Wu, Xiaofei, et al.
Published: (2025)
Real-World Robot Applications of Foundation Models: A Review
by: Kawaharazuka, Kento, et al.
Published: (2024)
MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling
by: Zhang, Bowen, et al.
Published: (2024)