:: Library Catalog

Bibliographic Details
Main Authors:	Ling, XuDong; Li, ChaoRong; Qin, FengQing; Zhu, LiHong; Huang, Yuanyuan
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2402.12779

Similar Items

Precipitation Nowcasting Using Diffusion Transformer with Causal Attention
by: Li, ChaoRong, et al.
Published: (2024)

RNDiff: Rainfall nowcasting with Condition Diffusion Model
by: Ling, Xudong, et al.
Published: (2024)

Segment Anything without Supervision
by: Wang, XuDong, et al.
Published: (2024)

UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity
by: Yu, Junwei, et al.
Published: (2025)

Reconstruction Alignment Improves Unified Multimodal Models
by: Xie, Ji, et al.
Published: (2025)

Consistency Model is an Effective Posterior Sample Approximation for Diffusion Inverse Solvers
by: Xu, Tongda, et al.
Published: (2024)

Extreme Precipitation Nowcasting using Multi-Task Latent Diffusion Models
by: Chaorong, Li, et al.
Published: (2024)

Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
by: Qin, Yiming, et al.
Published: (2025)

Visual Lexicon: Rich Image Features in Language Space
by: Wang, XuDong, et al.
Published: (2024)

SegLLM: Multi-round Reasoning Segmentation
by: Wang, XuDong, et al.
Published: (2024)

Incremental Multiview Point Cloud Registration with Two-stage Candidate Retrieval
by: Li, Shiqi, et al.
Published: (2024)

Human detectors are surprisingly powerful reward models
by: Ashutosh, Kumar, et al.
Published: (2026)

Two Causally Related Needles in a Video Haystack
by: Li, Miaoyu, et al.
Published: (2025)

Multi-weather Cross-view Geo-localization Using Denoising Diffusion Models
by: Feng, Tongtong, et al.
Published: (2024)

Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos
by: Tan, Zhiyu, et al.
Published: (2025)

Visually Prompted Benchmarks Are Surprisingly Fragile
by: Feng, Haiwen, et al.
Published: (2025)

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
by: Li, Xinpeng, et al.
Published: (2024)

Reversible Efficient Diffusion for Image Fusion
by: Xu, Xingxin, et al.
Published: (2026)

M3R: Localized Rainfall Nowcasting with Meteorology-Informed MultiModal Attention
by: Panta, Sanjeev, et al.
Published: (2026)

Constantly Improving Image Models Need Constantly Improving Benchmarks
by: Ge, Jiaxin, et al.
Published: (2025)

Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter
by: Wang, Jinglong, et al.
Published: (2023)

QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification
by: Feng, Weilun, et al.
Published: (2025)

WarmFed: Federated Learning with Warm-Start for Globalization and Personalization Via Personalized Diffusion Models
by: Feng, Tao, et al.
Published: (2025)

Seeing It Before It Happens: In-Generation NSFW Detection for Diffusion-Based Text-to-Image Models
by: Yang, Fan, et al.
Published: (2025)

Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model
by: Meng, Jiayang, et al.
Published: (2024)

FreSca: Scaling in Frequency Space Enhances Diffusion Models
by: Huang, Chao, et al.
Published: (2025)

Learning to Score Sign Language with Two-stage Method
by: Wen, Hongli, et al.
Published: (2024)

Improving Long-Text Alignment for Text-to-Image Diffusion Models
by: Liu, Luping, et al.
Published: (2024)

Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization
by: Chang, Yuanyuan, et al.
Published: (2025)

UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models
by: Yu, Fanghua, et al.
Published: (2025)

Seeing the Unseen: Mask-Driven Positional Encoding and Strip-Convolution Context Modeling for Cross-View Object Geo-Localization
by: Hu, Shuhan, et al.
Published: (2025)

Simplifying DINO via Coding Rate Regularization
by: Wu, Ziyang, et al.
Published: (2025)

FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment
by: Xu, Hang, et al.
Published: (2025)

MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation
by: Feng, Weilun, et al.
Published: (2025)

Unmasking Bias in Diffusion Model Training
by: Yu, Hu, et al.
Published: (2023)

HieraFashDiff: Hierarchical Fashion Design with Multi-stage Diffusion Models
by: Xie, Zhifeng, et al.
Published: (2024)

Locate n' Rotate: Two-stage Openable Part Detection with Foundation Model Priors
by: Li, Siqi, et al.
Published: (2024)

Simultaneous Image-to-Zero and Zero-to-Noise: Diffusion Models with Analytical Image Attenuation
by: Huang, Yuhang, et al.
Published: (2023)

Chest-Diffusion: A Light-Weight Text-to-Image Model for Report-to-CXR Generation
by: Huang, Peng, et al.
Published: (2024)

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
by: Hong, Fa-Ting, et al.
Published: (2025)