| Main Authors: | Xia, Tianxiang, Xiao, Lin, Montorfani, Yannick, Pavia, Francesco, Simsar, Enis, Hofmann, Thomas |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.09055 |
Similar Items
FOCUS: Optimal Control for Multi-Entity World Modeling in Text-to-Image Generation
by: Bill, Eric Tillmann, et al.
Published: (2025)
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
by: Simsar, Enis, et al.
Published: (2024)
FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision--Language Generation
by: Bill, Eric Tillmann, et al.
Published: (2026)
MegaPortrait: Revisiting Diffusion Control for High-fidelity Portrait Generation
by: Yang, Han, et al.
Published: (2024)
Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation
by: Meral, Tuna Han Salih, et al.
Published: (2024)
JEDI: The Force of Jensen-Shannon Divergence in Disentangling Diffusion Models
by: Bill, Eric Tillmann, et al.
Published: (2025)
LIME: Localized Image Editing via Attention Regularization in Diffusion Models
by: Simsar, Enis, et al.
Published: (2023)
UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint
by: Simsar, Enis, et al.
Published: (2024)
Stylebreeder: Exploring and Democratizing Artistic Styles through Text-to-Image Models
by: Zheng, Matthew, et al.
Published: (2024)
IC-Portrait: In-Context Matching for View-Consistent Personalized Portrait
by: Yang, Han, et al.
Published: (2025)
PixLens: A Novel Framework for Disentangled Evaluation in Diffusion-Based Image Editing with Object Detection + SAM
by: Stefanache, Stefan, et al.
Published: (2024)
High Fidelity Text to Image Generation with Contrastive Alignment and Structural Guidance
by: Gao, Danyi
Published: (2025)
Shifting the Breaking Point of Flow Matching for Multi-Instance Editing
by: Zaccagnino, Carmine, et al.
Published: (2026)
ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian Splatting
by: Liu, Junbang, et al.
Published: (2025)
RefAM: Attention Magnets for Zero-Shot Referral Segmentation
by: Kukleva, Anna, et al.
Published: (2025)
DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance
by: Wang, Cong, et al.
Published: (2023)
Counting Guidance for High Fidelity Text-to-Image Synthesis
by: Kang, Wonjun, et al.
Published: (2023)
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
by: Hamamci, Ibrahim Ethem, et al.
Published: (2023)
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
by: Huang, Siteng, et al.
Published: (2023)
TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing
by: Zhang, Xinyu, et al.
Published: (2024)
FlowDreamer: Exploring High Fidelity Text-to-3D Generation via Rectified Flow
by: Li, Hangyu, et al.
Published: (2024)
Text2Interact: High-Fidelity and Diverse Text-to-Two-Person Interaction Generation
by: Wu, Qingxuan, et al.
Published: (2025)
Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition
by: Lin, Tiancheng, et al.
Published: (2024)
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
by: Wang, Haoran, et al.
Published: (2022)
OpenAI ChatGPT interprets Radiological Images: GPT-4 as a Medical Doctor for a Fast Check-Up
by: Aydin, Omer, et al.
Published: (2025)
Text-Conditioned Diffusion Model for High-Fidelity Korean Font Generation
by: Sami, Abdul, et al.
Published: (2025)
AtomoVideo: High Fidelity Image-to-Video Generation
by: Gong, Litong, et al.
Published: (2024)
TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models
by: Sampaio, Georgia Gabriela, et al.
Published: (2024)
Structure Observation Driven Image-Text Contrastive Learning for Computed Tomography Report Generation
by: Liu, Hong, et al.
Published: (2026)
VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation
by: Chen, Zixuan, et al.
Published: (2024)
Unifying Contrastive and Generative Objectives for Visual Understanding and Text-to-Image Generation
by: Li, Chao, et al.
Published: (2026)
TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis
by: Rahman, Kazi Mahathir, et al.
Published: (2025)
InsightTok: Improving Text and Face Fidelity in Discrete Tokenization for Autoregressive Image Generation
by: Yue, Yang, et al.
Published: (2026)
Improving Viewpoint-Invariance and Temporal Consistency for Action Detection
by: Porto, Yannick, et al.
Published: (2026)
NURBGen: High-Fidelity Text-to-CAD Generation through LLM-Driven NURBS Modeling
by: Usama, Muhammad, et al.
Published: (2025)
DreamText: High Fidelity Scene Text Synthesis
by: Wang, Yibin, et al.
Published: (2024)
Right Looks, Wrong Reasons: Compositional Fidelity in Text-to-Image Generation
by: Vatsa, Mayank, et al.
Published: (2025)
Bridging Text and Image for Artist Style Transfer via Contrastive Learning
by: Liu, Zhi-Song, et al.
Published: (2024)
Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment
by: Le, Minh-Quan, et al.
Published: (2025)
Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation
by: Li, Weijie, et al.
Published: (2024)