| Main Authors: | Liu, Wenjun, Wu, Qian, Hu, Yifeng, Li, Yuke |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.14050 |
Similar Items
Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text
by: Li, Xinyang, et al.
Published: (2024)
EK-Net:Real-time Scene Text Detection with Expand Kernel Distance
by: Zhu, Boyuan, et al.
Published: (2024)
SELECT: A Large-Scale Benchmark of Data Curation Strategies for Image Classification
by: Feuer, Benjamin, et al.
Published: (2024)
DreamScene: 3D Gaussian-based Text-to-3D Scene Generation via Formation Pattern Sampling
by: Li, Haoran, et al.
Published: (2024)
Beyond Detection: A Structure-Aware Framework for Scene Text Tracking
by: Yu, Chenmin, et al.
Published: (2026)
Text Region Multiple Information Perception Network for Scene Text Detection
by: Zheng, Jinzhi, et al.
Published: (2024)
Real-Time Text Detection with Similar Mask in Traffic, Industrial, and Natural Scenes
by: Han, Xu, et al.
Published: (2024)
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition
by: Wu, Qian, et al.
Published: (2024)
STELLAR: Scene Text Editor for Low-Resource Languages and Real-World Data
by: Seo, Yongdeuk, et al.
Published: (2025)
Aggregated Text Transformer for Scene Text Detection
by: Zhou, Zhao, et al.
Published: (2022)
Explicit Relational Reasoning Network for Scene Text Detection
by: Su, Yuchen, et al.
Published: (2024)
Towards Real-world Lens Active Alignment with Unlabeled Data via Domain Adaptation
by: Li, Wenyong, et al.
Published: (2026)
Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss
by: Ren, Xuhua, et al.
Published: (2024)
Real-world Image Dehazing with Coherence-based Pseudo Labeling and Cooperative Unfolding Network
by: Fang, Chengyu, et al.
Published: (2024)
Toward Real-world Text Image Forgery Localization: Structured and Interpretable Data Synthesis
by: Yu, Zeqin, et al.
Published: (2025)
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
by: Huang, Mengqi, et al.
Published: (2024)
Duplex-GS: Proxy-Guided Weighted Blending for Real-Time Order-Independent Gaussian Splatting
by: Liu, Weihang, et al.
Published: (2025)
GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
by: Wang, Tong, et al.
Published: (2025)
Auditing Data Provenance in Real-world Text-to-Image Diffusion Models for Privacy and Copyright Protection
by: Zhu, Jie, et al.
Published: (2025)
Multi-Label Stereo Matching for Transparent Scene Depth Estimation
by: Liu, Zhidan, et al.
Published: (2025)
The First Swahili Language Scene Text Detection and Recognition Dataset
by: Douamba, Fadila Wendigoundi, et al.
Published: (2024)
You Only Gaussian Once: Controllable 3D Gaussian Splatting for Ultra-Densely Sampled Scenes
by: Jia, Jinrang, et al.
Published: (2026)
Learning to Segment Liquids in Real-world Images
by: Li, Jonas, et al.
Published: (2026)
TeleOR: Real-time Telemedicine System for Full-Scene Operating Room
by: Wu, Yixuan, et al.
Published: (2024)
BPDO:Boundary Points Dynamic Optimization for Arbitrary Shape Scene Text Detection
by: Zheng, Jinzhi, et al.
Published: (2024)
LEMoN: Label Error Detection using Multimodal Neighbors
by: Zhang, Haoran, et al.
Published: (2024)
Vision-based Manipulation from Single Human Video with Open-World Object Graphs
by: Zhu, Yifeng, et al.
Published: (2024)
LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery
by: Wan, Weikang, et al.
Published: (2023)
CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection
by: Zhao, Xi, et al.
Published: (2022)
Text-IRSTD: Leveraging Semantic Text to Promote Infrared Small Target Detection in Complex Scenes
by: Huang, Feng, et al.
Published: (2025)
Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation
by: Zhao, Yitao, et al.
Published: (2025)
TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition
by: Ye, Xingsong, et al.
Published: (2024)
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
by: Guan, Tongkun, et al.
Published: (2023)
AnyText2: Visual Text Generation and Editing With Customizable Attributes
by: Tuo, Yuxiang, et al.
Published: (2024)
Research on Multilingual Natural Scene Text Detection Algorithm
by: Wang, Tao
Published: (2023)
Self-Assessed Generation: Trustworthy Label Generation for Optical Flow and Stereo Matching in Real-world
by: Ling, Han, et al.
Published: (2024)
TextFlux: An OCR-Free DiT Model for High-Fidelity Multilingual Scene Text Synthesis
by: Xie, Yu, et al.
Published: (2025)
Partial Scene Text Retrieval
by: Wang, Hao, et al.
Published: (2024)
MultiEgo: A Multi-View Egocentric Video Dataset for 4D Scene Reconstruction
by: Li, Bate, et al.
Published: (2025)
Self-Prompting Diffusion Transformer for Open-Vocabulary Scene Text Editing via In-Context Learning
by: Li, Hongxi, et al.
Published: (2026)