:: Library Catalog

Bibliographic Details
Main Authors:	Xie, Kangyang; Yang, Binbin; Chen, Hao; Wang, Meng; Zou, Cheng; Xue, Hui; Yang, Ming; Shen, Chunhua
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.11077

Similar Items

Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models
by: Wang, Wen, et al.
Published: (2023)

What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
by: Xu, Guangkai, et al.
Published: (2024)

Generative Video Matting
by: Ge, Yongtao, et al.
Published: (2025)

ZipGait: Bridging Skeleton and Silhouette with Diffusion Model for Advancing Gait Recognition
by: Min, Fanxu, et al.
Published: (2024)

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
by: Li, Wen, et al.
Published: (2024)

Diffusion Models are Efficient Data Generators for Human Mesh Recovery
by: Ge, Yongtao, et al.
Published: (2024)

FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior
by: Chen, Zhekai, et al.
Published: (2024)

Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation
by: Zhu, Muzhi, et al.
Published: (2024)

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
by: Zhu, Muzhi, et al.
Published: (2025)

Traffic Scene Parsing through the TSP6K Dataset
by: Jiang, Peng-Tao, et al.
Published: (2023)

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data
by: Fan, Chengxiang, et al.
Published: (2024)

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
by: Zhu, Muzhi, et al.
Published: (2025)

ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models
by: Lai, Yingxin, et al.
Published: (2026)

Video Virtual Try-on with Conditional Diffusion Transformer Inpainter
by: Zou, Cheng, et al.
Published: (2025)

RGM: A Robust Generalizable Matching Model
by: Zhang, Songyan, et al.
Published: (2023)

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
by: Xu, Shaocong, et al.
Published: (2025)

ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration
by: Yu, Yongsheng, et al.
Published: (2025)

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
by: Zhao, Canyu, et al.
Published: (2025)

Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization
by: Zhao, Canyu, et al.
Published: (2025)

Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
by: Liu, Yang, et al.
Published: (2023)

Object-aware Inversion and Reassembly for Image Editing
by: Yang, Zhen, et al.
Published: (2023)

UnZipLoRA: Separating Content and Style from a Single Image
by: Liu, Chang, et al.
Published: (2024)

HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models
by: Mao, Shizhuo, et al.
Published: (2025)

Unpaired Deblurring via Decoupled Diffusion Model
by: Cheng, Junhao, et al.
Published: (2025)

A Geometric Perspective on Diffusion Models
by: Chen, Defang, et al.
Published: (2023)

MARBLE: Multi-Aspect Reward Balance for Diffusion RL
by: Zhao, Canyu, et al.
Published: (2026)

A Single Neuron Works: Precise Concept Erasure in Text-to-Image Diffusion Models
by: He, Qinqin, et al.
Published: (2025)

VisionZip: Longer is Better but Not Necessary in Vision Language Models
by: Yang, Senqiao, et al.
Published: (2024)

Paragraph-to-Image Generation with Information-Enriched Diffusion Model
by: Wu, Weijia, et al.
Published: (2023)

Towards a Transparent and Interpretable AI Model for Medical Image Classifications
by: Wen, Binbin, et al.
Published: (2025)

StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
by: Liu, Mingyu, et al.
Published: (2025)

DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models
by: Wu, Weijia, et al.
Published: (2023)

Generative Active Learning for Long-tailed Instance Segmentation
by: Zhu, Muzhi, et al.
Published: (2024)

FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition
by: Ding, Ganggui, et al.
Published: (2024)

A Simple Image Segmentation Framework via In-Context Examples
by: Liu, Yang, et al.
Published: (2024)

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal
by: Wang, Tao, et al.
Published: (2024)

Distribution-Aware Data Expansion with Diffusion Models
by: Zhu, Haowei, et al.
Published: (2024)

On the Trajectory Regularity of ODE-based Diffusion Sampling
by: Chen, Defang, et al.
Published: (2024)

FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models
by: Chen, Changgu, et al.
Published: (2024)

HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation
by: Chen, Cong, et al.
Published: (2025)