| Main Authors: | Sun, Qi, Zhou, Dingju, Zhang, Lina |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.05263 |
Similar Items
CharDiff-LP: A Diffusion Model with Character-Level Guidance for License Plate Image Restoration
by: Na, Kihyun, et al.
Published: (2025)
Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition
by: Kaliosis, Panagiotis, et al.
Published: (2025)
EGGS: Exchangeable 2D/3D Gaussian Splatting for Geometry-Appearance Balanced Novel View Synthesis
by: Zhang, Yancheng, et al.
Published: (2025)
Nepali Sign Language Characters Recognition: Dataset Development and Deep Learning Approaches
by: Poudel, Birat, et al.
Published: (2025)
Evaluating Dataset Watermarking for Fine-tuning Traceability of Customized Diffusion Models: A Comprehensive Benchmark and Removal Approach
by: Wang, Xincheng, et al.
Published: (2025)
PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language
by: Haq, Ijazul, et al.
Published: (2025)
A Computer Vision Pipeline for Individual-Level Behavior Analysis: Benchmarking on the Edinburgh Pig Dataset
by: Yang, Haiyu, et al.
Published: (2025)
Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization
by: Ma, Yuhang, et al.
Published: (2024)
My Body My Choice: Human-Centric Full-Body Anonymization
by: Ciftci, Umur Aybars, et al.
Published: (2024)
Zero-shot High-fidelity and Pose-controllable Character Animation
by: Zhu, Bingwen, et al.
Published: (2024)
January Food Benchmark (JFB): A Public Benchmark Dataset and Evaluation Suite for Multimodal Food Analysis
by: Hosseinian, Amir, et al.
Published: (2025)
ClimateIQA: A New Dataset and Benchmark to Advance Vision-Language Models in Meteorology Anomalies Analysis
by: Chen, Jian, et al.
Published: (2024)
PinpointQA: A Dataset and Benchmark for Small Object-Centric Spatial Understanding in Indoor Videos
by: Zhou, Zhiyu, et al.
Published: (2026)
Beyond Visual Appearances: Privacy-sensitive Objects Identification via Hybrid Graph Reasoning
by: Jiang, Zhuohang, et al.
Published: (2024)
Animate Any Character in Any World
by: Wang, Yitong, et al.
Published: (2025)
OneActor: Consistent Character Generation via Cluster-Conditioned Guidance
by: Wang, Jiahao, et al.
Published: (2024)
TowerDataset: A Heterogeneous Benchmark for Transmission Corridor Segmentation with a Global-Local Fusion Framework
by: Cui, Xu, et al.
Published: (2026)
2K-Characters-10K-Stories: A Quality-Gated Stylized Narrative Dataset with Disentangled Control and Sequence Consistency
by: Yin, Xingxi, et al.
Published: (2025)
UniLat3D: Geometry-Appearance Unified Latents for Single-Stage 3D Generation
by: Wu, Guanjun, et al.
Published: (2025)
LOCR: Location-Guided Transformer for Optical Character Recognition
by: Sun, Yu, et al.
Published: (2024)
STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery
by: Li, Yansheng, et al.
Published: (2024)
OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations
by: Kang, Caixin, et al.
Published: (2024)
COMIC: Agentic Sketch Comedy Generation
by: Hong, Susung, et al.
Published: (2026)
Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation
by: Xu, Yijia, et al.
Published: (2026)
HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances
by: Narasimhaswamy, Supreeth, et al.
Published: (2024)
Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models
by: Go, Sooyeon, et al.
Published: (2024)
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
by: Lu, Sheng, et al.
Published: (2026)
Unified and Dynamic Graph for Temporal Character Grouping in Long Videos
by: Shu, Xiujun, et al.
Published: (2023)
Multi-Modal Character Localization and Extraction for Chinese Text Recognition
by: Li, Qilong, et al.
Published: (2026)
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery
by: Zhou, Hangyu, et al.
Published: (2024)
LocRef-Diffusion:Tuning-Free Layout and Appearance-Guided Generation
by: Deng, Fan, et al.
Published: (2024)
MOT FCG++: Enhanced Representation of Spatio-temporal Motion and Appearance Features
by: Fang, Yanzhao
Published: (2024)
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
by: Chen, Dongping, et al.
Published: (2024)
CGFformer: Cluster-Guidance Frequency Transformer for Pansharpening
by: Zhou, Zijian, et al.
Published: (2026)
PhyVLLM: Physics-Guided Video Language Model with Motion-Appearance Disentanglement
by: Zhan, Yu-Wei, et al.
Published: (2025)
SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis
by: Zhao, Shuxian, et al.
Published: (2026)
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
by: Ma, Zehong, et al.
Published: (2025)
Exploring Cross-Domain Few-Shot Classification via Frequency-Aware Prompting
by: Zhang, Tiange, et al.
Published: (2024)
AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance
by: Wang, Zhao, et al.
Published: (2025)
Deep Learning in Dental Image Analysis: A Systematic Review of Datasets, Methodologies, and Emerging Challenges
by: Zhou, Zhenhuan, et al.
Published: (2025)