| Main Authors: | Yang, Zhichao, Gu, Tianjiao, Wang, Jianjie, Lin, Feiyu, Sheng, Xiangfei, Chen, Pengfei, Li, Leida |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.09271 |
Similar Items
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks
by: Yang, Zhichao, et al.
Published: (2026)
Fine-grained Image Quality Assessment for Perceptual Image Restoration
by: Sheng, Xiangfei, et al.
Published: (2025)
AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception
by: Huang, Yipo, et al.
Published: (2024)
TuningIQA: Fine-Grained Blind Image Quality Assessment for Livestreaming Camera Tuning
by: Sheng, Xiangfei, et al.
Published: (2025)
AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception
by: Huang, Yipo, et al.
Published: (2024)
Language-Guided Visual Perception Disentanglement for Image Quality Assessment and Conditional Image Generation
by: Yang, Zhichao, et al.
Published: (2025)
M$^{3}$T2IBench: A Large-Scale Multi-Category, Multi-Instance, Multi-Relation Text-to-Image Benchmark
by: Zhang, Huixuan, et al.
Published: (2025)
The Devil is in Fine-tuning and Long-tailed Problems:A New Benchmark for Scene Text Detection
by: Cao, Tianjiao, et al.
Published: (2025)
FTII-Bench: A Comprehensive Multimodal Benchmark for Flow Text with Image Insertion
by: Ruan, Jiacheng, et al.
Published: (2024)
SLVMEval: Synthetic Meta Evaluation Benchmark for Text-to-Long Video Generation
by: Matsuda, Ryosuke, et al.
Published: (2026)
Skill-Aligned Annotation for Reliable Evaluation in Text-to-Image Generation
by: Eldesokey, Abdelrahman, et al.
Published: (2026)
Long-Text-to-Image Generation via Compositional Prompt Decomposition
by: Huang, Jen-Yuan, et al.
Published: (2026)
HyperAlign: Hyperbolic Entailment Cones for Adaptive Text-to-Image Alignment Assessment
by: Chen, Wenzhi, et al.
Published: (2026)
Content-Adaptive Image Retouching Guided by Attribute-Based Text Representation
by: Zhu, Hancheng, et al.
Published: (2025)
Text to Image Generation and Editing: A Survey
by: Yang, Pengfei, et al.
Published: (2025)
TIT-Score: Evaluating Long-Prompt Based Text-to-Image Alignment via Text-to-Image-to-Text Consistency
by: Wang, Juntong, et al.
Published: (2025)
Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models
by: Wang, Alex Jinpeng, et al.
Published: (2025)
LongInsightBench: A Comprehensive Benchmark for Evaluating Omni-Modal Models on Human-Centric Long-Video Understanding
by: Han, ZhaoYang, et al.
Published: (2025)
SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance
by: Shen, Guibao, et al.
Published: (2024)
MTV-Inpaint: Multi-Task Long Video Inpainting
by: Yang, Shiyuan, et al.
Published: (2025)
TextVidBench: A Benchmark for Long Video Scene Text Understanding
by: Zhong, Yangyang, et al.
Published: (2025)
Knowledge Visualization: A Benchmark and Method for Knowledge-Intensive Text-to-Image Generation
by: Zhao, Ran, et al.
Published: (2026)
NYC-Indoor-VPR: A Long-Term Indoor Visual Place Recognition Dataset with Semi-Automatic Annotation
by: Sheng, Diwei, et al.
Published: (2024)
Long Context Tuning for Video Generation
by: Guo, Yuwei, et al.
Published: (2025)
Deep Shape-Texture Statistics for Completely Blind Image Quality Evaluation
by: Li, Yixuan, et al.
Published: (2024)
Text-guided Foundation Model Adaptation for Long-Tailed Medical Image Classification
by: Li, Sirui, et al.
Published: (2024)
VISTAR:A User-Centric and Role-Driven Benchmark for Text-to-Image Evaluation
by: Jiang, Kaiyuan, et al.
Published: (2025)
EvalMuse-40K: A Reliable and Fine-Grained Benchmark with Comprehensive Human Annotations for Text-to-Image Generation Model Evaluation
by: Han, Shuhao, et al.
Published: (2024)
AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity
by: Xia, Jili, et al.
Published: (2024)
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
by: Lei, Jiayi, et al.
Published: (2025)
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
by: Zhou, Pengfei, et al.
Published: (2024)
SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild
by: Liu, Jiawei, et al.
Published: (2025)
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
by: Wang, Jiarui, et al.
Published: (2025)
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
by: Tudosiu, Petru-Daniel, et al.
Published: (2024)
Long-CLIP: Unlocking the Long-Text Capability of CLIP
by: Zhang, Beichen, et al.
Published: (2024)
Text-Guided Mixup Towards Long-Tailed Image Categorization
by: Franklin, Richard, et al.
Published: (2024)
Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method
by: Song, Xinshuai, et al.
Published: (2024)
MONICA: Benchmarking on Long-tailed Medical Image Classification
by: Ju, Lie, et al.
Published: (2024)
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation
by: Wang, Yibin, et al.
Published: (2025)
GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation Models
by: Butt, Muhammad Atif, et al.
Published: (2025)