| Main Authors: | Ling, XuDong, Li, ChaoRong, Qin, FengQing, Zhu, LiHong, Huang, Yuanyuan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.12779 |
Similar Items
Precipitation Nowcasting Using Diffusion Transformer with Causal Attention
by: Li, ChaoRong, et al.
Published: (2024)
RNDiff: Rainfall nowcasting with Condition Diffusion Model
by: Ling, Xudong, et al.
Published: (2024)
Segment Anything without Supervision
by: Wang, XuDong, et al.
Published: (2024)
UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity
by: Yu, Junwei, et al.
Published: (2025)
Reconstruction Alignment Improves Unified Multimodal Models
by: Xie, Ji, et al.
Published: (2025)
Consistency Model is an Effective Posterior Sample Approximation for Diffusion Inverse Solvers
by: Xu, Tongda, et al.
Published: (2024)
Extreme Precipitation Nowcasting using Multi-Task Latent Diffusion Models
by: Chaorong, Li, et al.
Published: (2024)
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
by: Qin, Yiming, et al.
Published: (2025)
Visual Lexicon: Rich Image Features in Language Space
by: Wang, XuDong, et al.
Published: (2024)
SegLLM: Multi-round Reasoning Segmentation
by: Wang, XuDong, et al.
Published: (2024)
Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval
by: Li, Shiqi, et al.
Published: (2024)
Human detectors are surprisingly powerful reward models
by: Ashutosh, Kumar, et al.
Published: (2026)
Two Causally Related Needles in a Video Haystack
by: Li, Miaoyu, et al.
Published: (2025)
Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models
by: Feng, Tongtong, et al.
Published: (2024)
Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
by: Tan, Zhiyu, et al.
Published: (2025)
Visually Prompted Benchmarks Are Surprisingly Fragile
by: Feng, Haiwen, et al.
Published: (2025)
Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
by: Li, Xinpeng, et al.
Published: (2024)
Reversible Efficient Diffusion for Image Fusion
by: Xu, Xingxin, et al.
Published: (2026)
M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention
by: Panta, Sanjeev, et al.
Published: (2026)
Constantly Improving Image Models Need Constantly Improving Benchmarks
by: Ge, Jiaxin, et al.
Published: (2025)
Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter
by: Wang, Jinglong, et al.
Published: (2023)
QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification
by: Feng, Weilun, et al.
Published: (2025)
WarmFed: Federated Learning with Warm-Start for Globalization and Personalization Via Personalized Diffusion Models
by: Feng, Tao, et al.
Published: (2025)
Seeing It Before It Happens: In-Generation NSFW Detection for Diffusion-Based Text-to-Image Models
by: Yang, Fan, et al.
Published: (2025)
Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model
by: Meng, Jiayang, et al.
Published: (2024)
FreSca: Scaling in Frequency Space Enhances Diffusion Models
by: Huang, Chao, et al.
Published: (2025)
Learning to Score Sign Language with Two-stage Method
by: Wen, Hongli, et al.
Published: (2024)
Improving Long-Text Alignment for Text-to-Image Diffusion Models
by: Liu, Luping, et al.
Published: (2024)
Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization
by: Chang, Yuanyuan, et al.
Published: (2025)
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
by: Yu, Fanghua, et al.
Published: (2025)
Seeing the Unseen: Mask-Driven Positional Encoding and Strip-Convolution Context Modeling for Cross-View Object Geo-Localization
by: Hu, Shuhan, et al.
Published: (2025)
Simplifying DINO via Coding Rate Regularization
by: Wu, Ziyang, et al.
Published: (2025)
FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment
by: Xu, Hang, et al.
Published: (2025)
MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation
by: Feng, Weilun, et al.
Published: (2025)
Unmasking Bias in Diffusion Model Training
by: Yu, Hu, et al.
Published: (2023)
HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
by: Xie, Zhifeng, et al.
Published: (2024)
Locate n' Rotate: Two-stage Openable Part Detection with Foundation Model Priors
by: Li, Siqi, et al.
Published: (2024)
Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation
by: Huang, Yuhang, et al.
Published: (2023)
Chest-Diffusion: A Light-Weight Text-to-Image Model for Report-to-CXR Generation
by: Huang, Peng, et al.
Published: (2024)
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
by: Hong, Fa-Ting, et al.
Published: (2025)