| Main Authors: | Sun, Lixu, Yolwas, Nurmemet, Silamu, Wushour |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.08133 |
Similar Items
A Lightweight Context-Driven Training-Free Network for Scene Text Segmentation and Recognition
by: Chakraborty, Ritabrata, et al.
Published: (2025)
EventSTR: A Benchmark Dataset and Baselines for Event Stream based Scene Text Recognition
by: Wang, Xiao, et al.
Published: (2025)
JSTR: Judgment Improves Scene Text Recognition
by: Fujitake, Masato
Published: (2024)
Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation
by: Maracani, Andrea, et al.
Published: (2025)
Memory-Inspired Temporal Prompt Interaction for Text-Image Classification
by: Yu, Xinyao, et al.
Published: (2024)
TextMamba: Scene Text Detector with Mamba
by: Zhao, Qiyan, et al.
Published: (2025)
Policy Optimized Text-to-Image Pipeline Design
by: Gadot, Uri, et al.
Published: (2025)
TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes
by: Fu, Yanping, et al.
Published: (2024)
AutoMR: A Universal Time Series Motion Recognition Pipeline
by: Zhang, Likun, et al.
Published: (2025)
DreamText: High Fidelity Scene Text Synthesis
by: Wang, Yibin, et al.
Published: (2024)
Fast Real-Time Pipeline for Robust Arm Gesture Recognition
by: Bagladi, Milán Zsolt, et al.
Published: (2025)
Think Before You Drive: World Model-Inspired Multimodal Grounding for Autonomous Vehicles
by: Liao, Haicheng, et al.
Published: (2025)
Text-Scene: A Scene-to-Language Parsing Framework for 3D Scene Understanding
by: Li, Haoyuan, et al.
Published: (2025)
Handwritten Text Recognition: A Survey
by: Garrido-Munoz, Carlos, et al.
Published: (2025)
TAG: Thinking with Action Unit Grounding for Facial Expression Recognition
by: Lin, Haobo, et al.
Published: (2026)
Bharat Scene Text: A Novel Comprehensive Dataset and Benchmark for Indian Language Scene Text Understanding
by: De, Anik, et al.
Published: (2025)
StyleText: A Large-Scale Dataset and Benchmark for Stylized Scene Text Inpainting
by: Simonyan, Aleksandr, et al.
Published: (2026)
Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models
by: Zhan, Yufei, et al.
Published: (2023)
DualTSR: Unified Dual-Diffusion Transformer for Scene Text Image Super-Resolution
by: Niu, Axi, et al.
Published: (2026)
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation
by: Zhang, Frank, et al.
Published: (2024)
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
by: Tang, Tao, et al.
Published: (2024)
JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding
by: Maeda, Koki, et al.
Published: (2026)
ESTR-CoT: Towards Explainable and Accurate Event Stream based Scene Text Recognition with Chain-of-Thought Reasoning
by: Wang, Xiao, et al.
Published: (2025)
DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training
by: Xie, Yu, et al.
Published: (2024)
PaintScene4D: Consistent 4D Scene Generation from Text Prompts
by: Gupta, Vinayak, et al.
Published: (2024)
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
by: Zhou, Shijie, et al.
Published: (2024)
InstructOCR: Instruction Boosting Scene Text Spotting
by: Duan, Chen, et al.
Published: (2024)
Text-VQA Aug: Pipelined Harnessing of Large Multimodal Models for Automated Synthesis
by: Joshi, Soham, et al.
Published: (2025)
Integrating Prior Observations for Incremental 3D Scene Graph Prediction
by: Renz, Marian, et al.
Published: (2025)
Re-Thinking the Automatic Evaluation of Image-Text Alignment in Text-to-Image Models
by: Zhang, Huixuan, et al.
Published: (2025)
Seeing Beyond the Scene: Analyzing and Mitigating Background Bias in Action Recognition
by: Zhou, Ellie, et al.
Published: (2025)
CURVE: Learning Causality-Inspired Invariant Representations for Robust Scene Understanding via Uncertainty-Guided Regularization
by: Liang, Yue, et al.
Published: (2026)
Jailbreaking on Text-to-Video Models via Scene Splitting Strategy
by: Lee, Wonjun, et al.
Published: (2025)
TSTMotion: Training-free Scene-aware Text-to-motion Generation
by: Guo, Ziyan, et al.
Published: (2025)
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
by: Lin, Xingtao, et al.
Published: (2024)
A Large-scale Dataset for Robust Complex Anime Scene Text Detection
by: Dong, Ziyi, et al.
Published: (2025)
TripleFDS: Triple Feature Disentanglement and Synthesis for Scene Text Editing
by: Bao, Yuchen, et al.
Published: (2025)
Hybrid CNN-ViT Framework for Motion-Blurred Scene Text Restoration
by: Rashid, Umar, et al.
Published: (2025)
LatentEditor: Text Driven Local Editing of 3D Scenes
by: Khalid, Umar, et al.
Published: (2023)
Show Me the World in My Language: Establishing the First Baseline for Scene-Text to Scene-Text Translation
by: Vaidya, Shreyas, et al.
Published: (2023)