| Main Authors: | Yan, Zexuan, Jin, Jiarui, Ma, Yue, Wang, Shijian, Hu, Jiahui, Jiao, Wenxiang, Lu, Yuan, Zhang, Linfeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.12155 |
Similar Items
AgentDisCo: Towards Disentanglement and Collaboration in Open-ended Deep Research Agents
by: Jin, Jiarui, et al.
Published: (2026)
GlyphPrinter: Region-Grouped Direct Preference Optimization for Glyph-Accurate Visual Text Rendering
by: Shuai, Xincheng, et al.
Published: (2026)
MuSEAgent: A Multimodal Reasoning Agent with Stateful Experiences
by: Wang, Shijian, et al.
Published: (2026)
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
by: Liu, Zeyu, et al.
Published: (2024)
TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control
by: Yan, Zhenyu, et al.
Published: (2024)
UniGlyph: Unified Segmentation-Conditioned Diffusion for Precise Visual Text Synthesis
by: Wang, Yuanrui, et al.
Published: (2025)
FreeText: Training-Free Text Rendering in Diffusion Transformers via Attention Localization and Spectral Glyph Injection
by: Zhang, Ruiqiang, et al.
Published: (2026)
GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
by: Wang, Tong, et al.
Published: (2025)
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering
by: Liu, Zeyu, et al.
Published: (2024)
Training-Free Occluded Text Rendering via Glyph Priors and Attention-Guided Semantic Blending
by: Hou, Jingqi, et al.
Published: (2026)
EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
by: Yan, Zexuan, et al.
Published: (2025)
SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting
by: Zhang, Jiahui, et al.
Published: (2025)
HDGlyph: A Hierarchical Disentangled Glyph-Based Framework for Long-Tail Text Rendering in Diffusion Models
by: Zhuang, Shuhan, et al.
Published: (2025)
AnyArtisticGlyph: Multilingual Controllable Artistic Glyph Generation
by: Lu, Xiongbo, et al.
Published: (2025)
OmniGAIA: Towards Native Omni-Modal AI Agents
by: Li, Xiaoxi, et al.
Published: (2026)
GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts
by: He, Junwen, et al.
Published: (2024)
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
by: Ma, Jian, et al.
Published: (2024)
Synthetic Curriculum Reinforces Compositional Text-to-Image Generation
by: Wang, Shijian, et al.
Published: (2025)
Tool-Genesis: A Task-Driven Tool Creation Benchmark for Self-Evolving Language Agent
by: Xia, Bowei, et al.
Published: (2026)
WaveShot: A Compact Portable Unmanned Surface Vessel for Dynamic Water Surface Videography and Media Production
by: Ma, Shijian, et al.
Published: (2024)
Agent Banana: High-Fidelity Image Editing with Agentic Thinking and Tooling
by: Ye, Ruijie, et al.
Published: (2026)
DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation
by: Pan, Wei, et al.
Published: (2025)
LOGO: Video Text Spotting with Language Collaboration and Glyph Perception Model
by: Liu, Hongen, et al.
Published: (2024)
Glyph: Scaling Context Windows via Visual-Text Compression
by: Cheng, Jiale, et al.
Published: (2025)
LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation
by: Yuan, Linfeng, et al.
Published: (2023)
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation
by: Peng, Yuyang, et al.
Published: (2025)
Decoupling Layout from Glyph in Online Chinese Handwriting Generation
by: Ren, Min-Si, et al.
Published: (2024)
FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization
by: Zhang, Jiahui, et al.
Published: (2024)
Versatile Transition Generation with Image-to-Video Diffusion
by: Yang, Zuhao, et al.
Published: (2025)
PCR-GS: COLMAP-Free 3D Gaussian Splatting via Pose Co-Regularizations
by: Wei, Yu, et al.
Published: (2025)
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning
by: Wang, Shijian, et al.
Published: (2025)
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
by: Qian, Yusu, et al.
Published: (2025)
WeEdit: A Dataset, Benchmark and Glyph-Guided Framework for Text-centric Image Editing
by: Zhang, Hui, et al.
Published: (2026)
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
by: Lu, Runnan, et al.
Published: (2025)
Stroke Modeling Enables Vectorized Character Generation with Large Vectorized Glyph Model
by: Zhang, Xinyue, et al.
Published: (2025)
TextPixs: Glyph-Conditioned Diffusion with Character-Aware Attention and OCR-Guided Supervision
by: Gillani, Syeda Anshrah, et al.
Published: (2025)
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
by: Jiang, Bowen, et al.
Published: (2025)
Empowering Backbone Models for Visual Text Generation with Input Granularity Control and Glyph-Aware Training
by: Li, Wenbo, et al.
Published: (2024)
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
by: Lakhanpal, Sanyam, et al.
Published: (2024)
Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
by: Luo, Minxing, et al.
Published: (2025)