| Main Authors: | Liu, Ziqian, Alaniz, Stephan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.05180 |
Similar Items
MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence
by: Liu, Chonghan, et al.
Published: (2025)
A Large Scale Analysis of Gender Biases in Text-to-Image Generative Models
by: Girrbach, Leander, et al.
Published: (2025)
GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing
by: Liu, Mingxin, et al.
Published: (2026)
PRIMEdit: Probability Redistribution for Instance-aware Multi-object Video Editing with Benchmark Dataset
by: Teodoro, Samuel, et al.
Published: (2024)
MIRAGE: Retrieval and Generation of Multimodal Images and Texts for Medical Education
by: Benito, Miguel Diaz, et al.
Published: (2026)
MIRAGE: Runtime Scheduling for Multi-Vector Image Retrieval with Hierarchical Decomposition
by: Li, Maoliang, et al.
Published: (2025)
Training-Free Multi-Concept Image Editing
by: Foteinopoulou, Niki, et al.
Published: (2026)
Large-scale EM Benchmark for Multi-Organelle Instance Segmentation in the Wild
by: Lu, Yanrui, et al.
Published: (2026)
CAMEO: A Conditional and Quality-Aware Multi-Agent Image Editing Orchestrator
by: Pu, Yuhan, et al.
Published: (2026)
SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions
by: Bader, Jessica, et al.
Published: (2025)
GEditBench v2: A Human-Aligned Benchmark for General Image Editing
by: Jiang, Zhangqi, et al.
Published: (2026)
MIRAGE: Towards AI-Generated Image Detection in the Wild
by: Xia, Cheng, et al.
Published: (2025)
Shifting the Breaking Point of Flow Matching for Multi-Instance Editing
by: Zaccagnino, Carmine, et al.
Published: (2026)
SeedEdit: Align Image Re-Generation to Image Editing
by: Shi, Yichun, et al.
Published: (2024)
Human-Aligned MLLM Judges for Fine-Grained Image Editing Evaluation: A Benchmark, Framework, and Analysis
by: Liu, Runzhou, et al.
Published: (2026)
LoFT: LoRA-fused Training Dataset Generation with Few-shot Guidance
by: Kim, Jae Myung, et al.
Published: (2025)
FLAIR: VLM with Fine-grained Language-informed Image Representations
by: Xiao, Rui, et al.
Published: (2024)
TabletopGen: Instance-Level Interactive 3D Tabletop Scene Generation from Text or Single Image
by: Wang, Ziqian, et al.
Published: (2025)
Explaining CLIP Zero-shot Predictions Through Concepts
by: Ozdemir, Onat, et al.
Published: (2026)
Training-free Uncertainty Guidance for Complex Visual Tasks with MLLMs
by: Kim, Sanghwan, et al.
Published: (2025)
VTEdit-Bench: A Comprehensive Benchmark for Multi-Reference Image Editing Models in Virtual Try-On
by: Liang, Xiaoye, et al.
Published: (2026)
Instance-Aligned Captions for Explainable Video Anomaly Detection
by: Song, Inpyo, et al.
Published: (2026)
2D Instance Editing in 3D Space
by: Xie, Yuhuan, et al.
Published: (2025)
VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment
by: Sun, Shangkun, et al.
Published: (2024)
Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing
by: He, Runze, et al.
Published: (2026)
EditRefiner: A Human-Aligned Agentic Framework for Image Editing Refinement
by: Xu, Zitong, et al.
Published: (2026)
MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM
by: Dong, Bowen, et al.
Published: (2025)
MIRAGE: Multimodal foundation model and benchmark for comprehensive retinal OCT image analysis
by: Morano, José, et al.
Published: (2025)
Cluster-Level Sparse Multi-Instance Learning for Whole-Slide Images
by: Zhang, Yuedi, et al.
Published: (2025)
Para-Lane: Multi-Lane Dataset Registering Parallel Scans for Benchmarking Novel View Synthesis
by: Ni, Ziqian, et al.
Published: (2025)
ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing
by: Pan, Yulin, et al.
Published: (2025)
Reasoning to Align: Implicit Reasoning in Diffusion Transformers for Video Editing
by: Li, Yan, et al.
Published: (2026)
MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations
by: Dongre, Vardhan, et al.
Published: (2025)
M$^{3}$T2IBench: A Large-Scale Multi-Category, Multi-Instance, Multi-Relation Text-to-Image Benchmark
by: Zhang, Huixuan, et al.
Published: (2025)
MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models
by: Zhu, Hongyang, et al.
Published: (2025)
Multi-turn Consistent Image Editing
by: Zhou, Zijun, et al.
Published: (2025)
MIRAGE: Model-agnostic Industrial Realistic Anomaly Generation and Evaluation for Visual Anomaly Detection
by: Hu, Jinwei, et al.
Published: (2026)
SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images
by: Feng, Pengming, et al.
Published: (2024)
MIRAGE: Multimodal Identification and Recognition of Annotations in Indian General Prescriptions
by: Mankash, Tavish, et al.
Published: (2024)
FINER: MLLMs Hallucinate under Fine-grained Negative Queries
by: Xiao, Rui, et al.
Published: (2026)