| Main Authors: | Zhao, Xiangyu; Ding, Shengyuan; Zhang, Zicheng; Huang, Haian; Cao, Maosong; Wang, Weiyun; Wang, Jiaqi; Fang, Xinyu; Wang, Wenhai; Zhai, Guangtao; Duan, Haodong; Yang, Hua; Chen, Kai |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.18411 |
Similar Items
Redundancy Principles for MLLMs Benchmarks
by: Zhang, Zicheng, et al.
Published: (2025)
GOBench: Benchmarking Geometric Optics Generation and Understanding of MLLMs
by: Zhu, Xiaorong, et al.
Published: (2025)
Affordance Benchmark for MLLMs
by: Wang, Junying, et al.
Published: (2025)
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning
by: Zhao, Xiangyu, et al.
Published: (2024)
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning
by: Ren, Yiming, et al.
Published: (2025)
Image Quality Assessment: From Human to Machine Preference
by: Li, Chunyi, et al.
Published: (2025)
Fine-Grained GRPO for Precise Preference Alignment in Flow Models
by: Zhou, Yujie, et al.
Published: (2025)
From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities
by: Jiang, Shixin, et al.
Published: (2024)
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
by: Wang, Yi, et al.
Published: (2025)
Explore the Hallucination on Low-level Perception for MLLMs
by: Sun, Yinan, et al.
Published: (2024)
MirrorBench: Evaluating Self-centric Intelligence in MLLMs by Introducing a Mirror
by: Guo, Shengyu, et al.
Published: (2026)
AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
by: Li, Yunhao, et al.
Published: (2025)
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM
by: Fang, Xinyu, et al.
Published: (2025)
OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems
by: Li, Xiaozhe, et al.
Published: (2025)
NP-Engine: Empowering Optimization Reasoning in Large Language Models with Verifiable Synthetic NP Problems
by: Li, Xiaozhe, et al.
Published: (2025)
OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces
by: Li, Xiaozhe, et al.
Published: (2026)
DHQA-4D: Perceptual Quality Assessment of Dynamic 4D Digital Human
by: Li, Yunhao, et al.
Published: (2025)
MM-IFEngine: Towards Multimodal Instruction Following
by: Ding, Shengyuan, et al.
Published: (2025)
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning
by: Ding, Shengyuan, et al.
Published: (2025)
Find Them All: Unveiling MLLMs for Versatile Person Re-identification
by: Li, Jinhao, et al.
Published: (2025)
MedOmni-45°: A Safety-Performance Benchmark for Reasoning-Oriented LLMs in Medicine
by: Ji, Kaiyuan, et al.
Published: (2025)
SPARK: Synergistic Policy And Reward Co-Evolving Framework
by: Liu, Ziyu, et al.
Published: (2025)
Information Density Principle for MLLM Benchmarks
by: Li, Chunyi, et al.
Published: (2025)
Forge: Quality-Aware Reinforcement Learning for NP-Hard Optimization in LLMs
by: Li, Xiaozhe, et al.
Published: (2026)
LM Fight Arena: Benchmarking Large Multimodal Models via Game Competition
by: Zheng, Yushuo, et al.
Published: (2025)
AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference
by: Han, Yang, et al.
Published: (2024)
TIT-Score: Evaluating Long-Prompt Based Text-to-Image Alignment via Text-to-Image-to-Text Consistency
by: Wang, Juntong, et al.
Published: (2025)
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
by: Zhao, Xiangyu, et al.
Published: (2025)
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
by: Cao, Maosong, et al.
Published: (2025)
Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model
by: Fu, Kang, et al.
Published: (2025)
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
by: Li, Caorui, et al.
Published: (2025)
Preference-Guided Debiasing for No-Reference Enhancement Image Quality Assessment
by: Gao, Shiqi, et al.
Published: (2026)
Omni$^2$: Unifying Omnidirectional Image Generation and Editing in an Omni Model
by: Yang, Liu, et al.
Published: (2025)
Human-Centric Evaluation for Foundation Models
by: Guo, Yijin, et al.
Published: (2025)
Human Cognitive Benchmarks Reveal Foundational Visual Gaps in MLLMs
by: Huang, Jen-Tse, et al.
Published: (2025)
Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment
by: Kou, Tengchuan, et al.
Published: (2024)
Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition
by: Zheng, Yushuo, et al.
Published: (2026)
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
by: Wang, Weiyun, et al.
Published: (2025)
Omni-RRM: Advancing Omni Reward Modeling via Automatic Rubric-Grounded Preference Synthesis
by: Kong, Zicheng, et al.
Published: (2026)