:: Library Catalog

Bibliographic Details
Main Authors:	Shah, Jaineet; Gromis, Michael; Pinto, Rickston
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2412.14422

Similar Items

JetFormer: An Autoregressive Generative Model of Raw Images and Text
by: Tschannen, Michael, et al.
Published: (2024)

ReGuidance: A Simple Diffusion Wrapper for Boosting Sample Quality on Hard Inverse Problems
by: Karan, Aayush, et al.
Published: (2025)

From Text to Pose to Image: Improving Diffusion Model Control and Quality
by: Bonnet, Clément, et al.
Published: (2024)

License Plate Images Generation with Diffusion Models
by: Shpir, Mariia, et al.
Published: (2025)

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
by: Melas-Kyriazi, Luke, et al.
Published: (2024)

Denoising Score Distillation: From Noisy Diffusion Pretraining to One-Step High-Quality Generation
by: Chen, Tianyu, et al.
Published: (2025)

Cortex-Grounded Diffusion Models for Brain Image Generation
by: Bongratz, Fabian, et al.
Published: (2026)

Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
by: Chadebec, Clément, et al.
Published: (2024)

Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)

Diffusion Models in Vision: A Survey
by: Croitoru, Florinel-Alin, et al.
Published: (2022)

ClipGrader: Leveraging Vision-Language Models for Robust Label Quality Assessment in Object Detection
by: Lu, Hong, et al.
Published: (2025)

Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation
by: Ogezi, Michael, et al.
Published: (2024)

Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
by: Croitoru, Florinel-Alin, et al.
Published: (2026)

DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation
by: Venugopal, Sankarshana, et al.
Published: (2026)

Quantized Machine Learning Models for Medical Imaging in Low-Resource Healthcare Settings
by: Kanneti, Sumanth Meenan, et al.
Published: (2026)

PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation
by: Wang, Ziyan, et al.
Published: (2025)

On the Scalability of Diffusion-based Text-to-Image Generation
by: Li, Hao, et al.
Published: (2024)

Dual Diffusion for Unified Image Generation and Understanding
by: Li, Zijie, et al.
Published: (2024)

DiffiT: Diffusion Vision Transformers for Image Generation
by: Hatamizadeh, Ali, et al.
Published: (2023)

Diffuse and Disperse: Image Generation with Representation Regularization
by: Wang, Runqian, et al.
Published: (2025)

Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models
by: Lavoie, Samuel, et al.
Published: (2025)

Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
by: Luo, Ge Ya, et al.
Published: (2024)

Latent Diffusion Models with Image-Derived Annotations for Enhanced AI-Assisted Cancer Diagnosis in Histopathology
by: Osorio, Pedro, et al.
Published: (2023)

Curriculum Direct Preference Optimization for Diffusion and Consistency Models
by: Croitoru, Florinel-Alin, et al.
Published: (2024)

Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models
by: Lopez, Eleonora, et al.
Published: (2024)

InstanceDiffusion: Instance-level Control for Image Generation
by: Wang, Xudong, et al.
Published: (2024)

Rethinking Diffusion Model in High Dimension
by: Zheng, Zhenxin, et al.
Published: (2025)

Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models
by: Daras, Giannis, et al.
Published: (2024)

VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models
by: Kwon, Taesung, et al.
Published: (2024)

Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
by: Wang, Junyan, et al.
Published: (2024)

Foresight: Adaptive Layer Reuse for Accelerated and High-Quality Text-to-Video Generation
by: Adnan, Muhammad, et al.
Published: (2025)

Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation
by: Yuan, Huizhuo, et al.
Published: (2024)

Editing Massive Concepts in Text-to-Image Diffusion Models
by: Xiong, Tianwei, et al.
Published: (2024)

Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification
by: Shah, Manan, et al.
Published: (2024)

Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
by: Crowson, Katherine, et al.
Published: (2024)

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
by: Yang, Ling, et al.
Published: (2024)

Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation
by: Mo, Shentong, et al.
Published: (2024)

Input-Adaptive Generative Dynamics in Diffusion Models
by: Xing, Yucheng, et al.
Published: (2024)

Generative Dataset Distillation Based on Diffusion Model
by: Su, Duo, et al.
Published: (2024)