| Main Authors: | Wang, Haofan, Xu, Yujia, Li, Yimeng, Li, Junchen, Zhang, Chaowei, Wang, Jing, Yang, Kejia, Chen, Zhibo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.19724 |
Similar Items
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
by: Lu, Runnan, et al.
Published: (2025)
Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
by: Chen, Zhennan, et al.
Published: (2024)
First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending
by: Li, Zhenhang, et al.
Published: (2024)
TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering
by: Zhu, Hanshen, et al.
Published: (2026)
Visual Text Generation in the Wild
by: Zhu, Yuanzhi, et al.
Published: (2024)
TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering
by: Mao, Dongxing, et al.
Published: (2026)
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
by: Wang, Haofan, et al.
Published: (2024)
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
by: Liu, Zeyu, et al.
Published: (2024)
CSGO: Content-Style Composition in Text-to-Image Generation
by: Xing, Peng, et al.
Published: (2024)
FreeText: Training-Free Text Rendering in Diffusion Transformers via Attention Localization and Spectral Glyph Injection
by: Zhang, Ruiqiang, et al.
Published: (2026)
TextAlign: Preference Alignment for Text Rendering with Hierarchical Rewards
by: Cui, Mingxuan, et al.
Published: (2026)
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
by: Liu, Jiawei, et al.
Published: (2025)
InstantStyle-Plus: Style Transfer with Content-Preserving in Text-to-Image Generation
by: Wang, Haofan, et al.
Published: (2024)
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
by: Peng, Yuyang, et al.
Published: (2025)
Towards High-Fidelity CAD Generation via LLM-Driven Program Generation and Text-Based B-Rep Primitive Grounding
by: Li, Jiahao, et al.
Published: (2026)
TextEditBench: Evaluating Reasoning-aware Text Editing Beyond Rendering
by: Gui, Rui, et al.
Published: (2025)
IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator-Critic Framework
by: Wang, Feiyu, et al.
Published: (2026)
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering
by: Liu, Zeyu, et al.
Published: (2024)
SkyReels-Text: Fine-Grained Font-Controllable Text Editing for Poster Design
by: Yu, Yunjie, et al.
Published: (2025)
TextGuider: Training-Free Guidance for Text Rendering via Attention Alignment
by: Baek, Kanghyun, et al.
Published: (2025)
RealRep: Generalized SDR-to-HDR Conversion via Attribute-Disentangled Representation Learning
by: Xu, Li, et al.
Published: (2025)
Beyond Flat Text: Dual Self-inherited Guidance for Visual Text Generation
by: Luo, Minxing, et al.
Published: (2025)
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering
by: Shuai, Xincheng, et al.
Published: (2026)
Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering
by: Wang, Xu, et al.
Published: (2025)
Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning
by: Wang, Yifan, et al.
Published: (2026)
Calligrapher: Freestyle Text Image Customization
by: Ma, Yue, et al.
Published: (2025)
Render-in-the-Loop: Vector Graphics Generation via Visual Self-Feedback
by: Liang, Guotao, et al.
Published: (2026)
Enhancing Visual Representation for Text-based Person Searching
by: Shen, Wei, et al.
Published: (2024)
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
by: Wang, Wenjing, et al.
Published: (2023)
Investigating Text Insulation and Attention Mechanisms for Complex Visual Text Generation
by: Tai, Ying, et al.
Published: (2025)
VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer
by: Zhong, Humen, et al.
Published: (2024)
Platypus: A Generalized Specialist Model for Reading Text in Various Forms
by: Wang, Peng, et al.
Published: (2024)
Hear the Scene: Audio-Enhanced Text Spotting
by: Li, Jing, et al.
Published: (2024)
Training-Free Occluded Text Rendering via Glyph Priors and Attention-Guided Semantic Blending
by: Hou, Jingqi, et al.
Published: (2026)
TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation
by: Wang, Xingrui, et al.
Published: (2024)
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
by: Zhou, Dewei, et al.
Published: (2025)
CLII: Visual-Text Inpainting via Cross-Modal Predictive Interaction
by: Zhao, Liang, et al.
Published: (2024)
FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
by: Cai, Kaitong, et al.
Published: (2025)
Visual-RAG: Benchmarking Text-to-Image Retrieval Augmented Generation for Visual Knowledge Intensive Queries
by: Wu, Yin, et al.
Published: (2025)
MathGen: Revealing the Illusion of Mathematical Competence through Text-to-Image Generation
by: Liu, Ruiyao, et al.
Published: (2026)