| Main Authors: | Shentu, Junjie, Watson, Matthew, Moubayed, Noura Al |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.09966 |
Similar Items
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
by: Shentu, Junjie, et al.
Published: (2024)
Everything is a Video: Unifying Modalities through Next-Frame Prediction
by: Hudson, G. Thomas, et al.
Published: (2024)
Controllable Image Generation with Composed Parallel Token Prediction
by: Stirling, Jamie, et al.
Published: (2024)
Investigating Permutation-Invariant Discrete Representation Learning for Spatially Aligned Images
by: Stirling, Jamie S. J., et al.
Published: (2026)
Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction
by: Guo, Hanzhong, et al.
Published: (2026)
MIEB: Massive Image Embedding Benchmark
by: Xiao, Chenghao, et al.
Published: (2025)
Disentangling Racial Phenotypes: Fine-Grained Control of Race-related Facial Phenotype Characteristics
by: Yucer, Seyma, et al.
Published: (2024)
Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers
by: Slack, Dean L, et al.
Published: (2025)
The Power of Next-Frame Prediction for Learning Physical Laws
by: Winterbottom, Thomas, et al.
Published: (2024)
OrienText: Surface Oriented Textual Image Generation
by: Paliwal, Shubham Singh, et al.
Published: (2025)
AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation
by: He, Junjie, et al.
Published: (2025)
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
by: Shin, Chaehun, et al.
Published: (2024)
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
by: Yao, Zebin, et al.
Published: (2025)
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation
by: Zhou, Yufan, et al.
Published: (2024)
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
by: Chen, Hong, et al.
Published: (2023)
Directional Textual Inversion for Personalized Text-to-Image Generation
by: Kim, Kunhee, et al.
Published: (2025)
Personalized Residuals for Concept-Driven Text-to-Image Generation
by: Ham, Cusuh, et al.
Published: (2024)
Disentangling to Re-couple: Resolving the Similarity-Controllability Paradox in Subject-Driven Text-to-Image Generation
by: Li, Shuang, et al.
Published: (2026)
LatexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending
by: Jin, Jian, et al.
Published: (2025)
DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation
by: Tan, Binhong, et al.
Published: (2026)
ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation
by: Jin, Hyun-Jun, et al.
Published: (2025)
DeCoT: Decomposing Complex Instructions for Enhanced Text-to-Image Generation with Large Language Models
by: Lin, Xiaochuan, et al.
Published: (2025)
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
by: Chan, Kelvin C. K., et al.
Published: (2024)
CustomText: Customized Textual Image Generation using Diffusion Models
by: Paliwal, Shubham, et al.
Published: (2024)
Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation
by: Wei, Tianyi, et al.
Published: (2024)
CoDi: Subject-Consistent and Pose-Diverse Text-to-Image Generation
by: Gao, Zhanxin, et al.
Published: (2025)
DSH-Bench: A Difficulty- and Scenario-Aware Benchmark with Hierarchical Subject Taxonomy for Subject-Driven Text-to-Image Generation
by: Hu, Zhenyu, et al.
Published: (2026)
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
by: Gordon, Brian, et al.
Published: (2023)
Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
by: Zou, Zhentao, et al.
Published: (2025)
Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
by: Song, Yeji, et al.
Published: (2024)
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
by: Dahary, Omer, et al.
Published: (2024)
CalibCLIP: Contextual Calibration of Dominant Semantics for Text-Driven Image Retrieval
by: Kang, Bin, et al.
Published: (2025)
PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards
by: Wang, Shulei, et al.
Published: (2025)
Identity Decoupling for Multi-Subject Personalization of Text-to-Image Models
by: Jang, Sangwon, et al.
Published: (2024)
Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation
by: Liu, Ziyue, et al.
Published: (2026)
SceneBooth: Diffusion-based Framework for Subject-preserved Text-to-Image Generation
by: Chai, Shang, et al.
Published: (2025)
CustomContrast: A Multilevel Contrastive Perspective For Subject-Driven Text-to-Image Customization
by: Chen, Nan, et al.
Published: (2024)
Multi-Subject Image Synthesis as a Generative Prior for Single-Subject PET Image Reconstruction
by: Webber, George, et al.
Published: (2024)
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
by: Tan, Zhiyu, et al.
Published: (2024)
Geometric Disentanglement of Text Embeddings for Subject-Consistent Text-to-Image Generation using A Single Prompt
by: Li, Shangxun, et al.
Published: (2025)