Library Catalog

Bibliographic Details
Main Authors:	Shentu, Junjie; Watson, Matthew; Al Moubayed, Noura
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2402.09966

Similar Items

AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
by: Shentu, Junjie, et al.
Published: (2024)

Everything is a Video: Unifying Modalities through Next-Frame Prediction
by: Hudson, G. Thomas, et al.
Published: (2024)

Controllable Image Generation with Composed Parallel Token Prediction
by: Stirling, Jamie, et al.
Published: (2024)

Investigating Permutation-Invariant Discrete Representation Learning for Spatially Aligned Images
by: Stirling, Jamie S. J., et al.
Published: (2026)

Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction
by: Guo, Hanzhong, et al.
Published: (2026)

MIEB: Massive Image Embedding Benchmark
by: Xiao, Chenghao, et al.
Published: (2025)

Disentangling Racial Phenotypes: Fine-Grained Control of Race-related Facial Phenotype Characteristics
by: Yucer, Seyma, et al.
Published: (2024)

Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers
by: Slack, Dean L, et al.
Published: (2025)

The Power of Next-Frame Prediction for Learning Physical Laws
by: Winterbottom, Thomas, et al.
Published: (2024)

OrienText: Surface Oriented Textual Image Generation
by: Paliwal, Shubham Singh, et al.
Published: (2025)

AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation
by: He, Junjie, et al.
Published: (2025)

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
by: Shin, Chaehun, et al.
Published: (2024)

FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
by: Yao, Zebin, et al.
Published: (2025)

Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
by: Zhou, Yufan, et al.
Published: (2024)

DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
by: Chen, Hong, et al.
Published: (2023)

Directional Textual Inversion for Personalized Text-to-Image Generation
by: Kim, Kunhee, et al.
Published: (2025)

Personalized Residuals for Concept-Driven Text-to-Image Generation
by: Ham, Cusuh, et al.
Published: (2024)

Disentangling to Re-couple: Resolving the Similarity-Controllability Paradox in Subject-Driven Text-to-Image Generation
by: Li, Shuang, et al.
Published: (2026)

LatexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
by: Jin, Jian, et al.
Published: (2025)

DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation
by: Tan, Binhong, et al.
Published: (2026)

ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation
by: Jin, Hyun-Jun, et al.
Published: (2025)

DeCoT: Decomposing Complex Instructions for Enhanced Text-to-Image Generation with Large Language Models
by: Lin, Xiaochuan, et al.
Published: (2025)

Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
by: Chan, Kelvin C. K., et al.
Published: (2024)

CustomText: Customized Textual Image Generation using Diffusion Models
by: Paliwal, Shubham, et al.
Published: (2024)

Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
by: Wei, Tianyi, et al.
Published: (2024)

CoDi: Subject-Consistent and Pose-Diverse Text-to-Image Generation
by: Gao, Zhanxin, et al.
Published: (2025)

DSH-Bench: A Difficulty- and Scenario-Aware Benchmark with Hierarchical Subject Taxonomy for Subject-Driven Text-to-Image Generation
by: Hu, Zhenyu, et al.
Published: (2026)

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
by: Gordon, Brian, et al.
Published: (2023)

Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
by: Zou, Zhentao, et al.
Published: (2025)

Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
by: Song, Yeji, et al.
Published: (2024)

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
by: Dahary, Omer, et al.
Published: (2024)

CalibCLIP: Contextual Calibration of Dominant Semantics for Text-Driven Image Retrieval
by: Kang, Bin, et al.
Published: (2025)

PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards
by: Wang, Shulei, et al.
Published: (2025)

Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
by: Jang, Sangwon, et al.
Published: (2024)

Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation
by: Liu, Ziyue, et al.
Published: (2026)

SceneBooth: Diffusion-based Framework for Subject-preserved Text-to-Image Generation
by: Chai, Shang, et al.
Published: (2025)

CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
by: Chen, Nan, et al.
Published: (2024)

Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction
by: Webber, George, et al.
Published: (2024)

An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
by: Tan, Zhiyu, et al.
Published: (2024)

Geometric Disentanglement of Text Embeddings for Subject-Consistent Text-to-Image Generation using A Single Prompt
by: Li, Shangxun, et al.
Published: (2025)