| Main Authors: | Hosseinian, Amir, Zahedani, Ashkan Dehghani, Mansoor, Umer, Hashemi, Noosheen, Woodward, Mark |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.09966 |
Similar Items
Wake Vision: A Tailored Dataset and Benchmark Suite for TinyML Computer Vision Applications
by: Banbury, Colby, et al.
Published: (2024)
CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark
by: Wang, Wei, et al.
Published: (2026)
FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images
by: Ishraq, Sabab, et al.
Published: (2026)
A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
by: Liu, Xiang, et al.
Published: (2025)
SFOOD: A Multimodal Benchmark for Comprehensive Food Attribute Analysis Beyond RGB with Spectral Insights
by: Xu, Zhenbo, et al.
Published: (2025)
Benchmarking Large Vision-Language Models on CFMME: A Comprehensive Chinese Financial Multimodal Evaluation Dataset
by: Chen, Qian, et al.
Published: (2026)
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
by: Li, Runjia, et al.
Published: (2025)
Seizure-Semiology-Suite (S3): A Clinically Multimodal Dataset, Benchmark, and Models for Seizure Semiology Understanding
by: Zhang, Lina, et al.
Published: (2026)
MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance
by: Zhao, Kaikai, et al.
Published: (2025)
AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform
by: Hashemi, Amirreza
Published: (2023)
Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
by: Lu, Sheng, et al.
Published: (2026)
A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
by: Jiang, Siyang, et al.
Published: (2025)
Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline
by: Li, Haiyang, et al.
Published: (2025)
Thought-For-Food: Reasoning Chain Induced Food Visual Question Answering
by: Jain, Riddhi, et al.
Published: (2025)
Tangram: Benchmark for Evaluating Geometric Element Recognition in Large Multimodal Models
by: Zhang, Chao, et al.
Published: (2024)
BenchSeg: A Large-Scale Dataset and Benchmark for Multi-View Food Video Segmentation
by: AlMughrabi, Ahmad, et al.
Published: (2026)
Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms
by: Chauvin, Lucian, et al.
Published: (2025)
An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models
by: Guruprasad, Pranav, et al.
Published: (2025)
UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite
by: Guo, Sicen, et al.
Published: (2023)
BadmintonGRF: A Multimodal Dataset and Benchmark for Markerless Ground Reaction Force Estimation in Badminton
by: Niu, Kuoye, et al.
Published: (2026)
SurgMLLMBench: A Multimodal Large Language Model Benchmark Dataset for Surgical Scene Understanding
by: Choi, Tae-Min, et al.
Published: (2025)
MOSABench: Multi-Object Sentiment Analysis Benchmark for Evaluating Multimodal Large Language Models Understanding of Complex Image
by: Song, Shezheng, et al.
Published: (2024)
MMR-AD: A Large-Scale Multimodal Dataset for Benchmarking General Anomaly Detection with Multimodal Large Language Models
by: Yao, Xincheng, et al.
Published: (2026)
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
by: Rawte, Vipula, et al.
Published: (2024)
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs
by: Jian, Ai, et al.
Published: (2025)
MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation
by: Yao, Jihan, et al.
Published: (2025)
From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios
by: Liu, Guoshan, et al.
Published: (2024)
TRIP-Evaluate: An Open Multimodal Benchmark for Evaluating Large Models in Transportation
by: Gong, Han, et al.
Published: (2026)
MM-Food-100K: A 100,000-Sample Multimodal Food Intelligence Dataset with Verifiable Provenance
by: Dong, Yi, et al.
Published: (2025)
Leveraging Automatic Personalised Nutrition: Food Image Recognition Benchmark and Dataset based on Nutrition Taxonomy
by: Romero-Tapiador, Sergio, et al.
Published: (2022)
MM-NeuroOnco: A Multimodal Benchmark and Instruction Dataset for MRI-Based Brain Tumor Diagnosis
by: Guo, Feng, et al.
Published: (2026)
Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration
by: Zhou, Yue, et al.
Published: (2025)
GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
by: Chen, Dongping, et al.
Published: (2024)
MMS-VPR: Multimodal Street-Level Visual Place Recognition Dataset and Benchmark
by: Ou, Yiwei, et al.
Published: (2025)
Benchmarking Post-Hoc Unknown-Category Detection in Food Recognition
by: Rahman, Lubnaa Abdur, et al.
Published: (2025)
MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
by: Zhang, Fan, et al.
Published: (2025)
Real-Time Feedback and Benchmark Dataset for Isometric Pose Evaluation
by: Jaiswal, Abhishek, et al.
Published: (2025)
NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
by: Song, Ziyang, et al.
Published: (2025)
MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
by: Huang, Jinsheng, et al.
Published: (2024)
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
by: Patil, Vaidehi, et al.
Published: (2025)