Library Catalog

Bibliographic Details
Main Authors:	Hosseinian, Amir; Zahedani, Ashkan Dehghani; Mansoor, Umer; Hashemi, Noosheen; Woodward, Mark
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2508.09966

Similar Items

Wake Vision: A Tailored Dataset and Benchmark Suite for TinyML Computer Vision Applications
by: Banbury, Colby, et al.
Published: (2024)

CrossView Suite: Harnessing Cross-view Spatial Intelligence of MLLMs with Dataset, Model and Benchmark
by: Wang, Wei, et al.
Published: (2026)

FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images
by: Ishraq, Sabab, et al.
Published: (2026)

A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
by: Liu, Xiang, et al.
Published: (2025)

SFOOD: A Multimodal Benchmark for Comprehensive Food Attribute Analysis Beyond RGB with Spectral Insights
by: Xu, Zhenbo, et al.
Published: (2025)

Benchmarking Large Vision-Language Models on CFMME: A Comprehensive Chinese Financial Multimodal Evaluation Dataset
by: Chen, Qian, et al.
Published: (2026)

EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
by: Li, Runjia, et al.
Published: (2025)

Seizure-Semiology-Suite (S3): A Clinically Multimodal Dataset, Benchmark, and Models for Seizure Semiology Understanding
by: Zhang, Lina, et al.
Published: (2026)

MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance
by: Zhao, Kaikai, et al.
Published: (2025)

AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform
by: Hashemi, Amirreza
Published: (2023)

Gastric-X: A Multimodal Multi-Phase Benchmark Dataset for Advancing Vision-Language Models in Gastric Cancer Analysis
by: Lu, Sheng, et al.
Published: (2026)

A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning
by: Jiang, Siyang, et al.
Published: (2025)

Towards Unified Multimodal Misinformation Detection in Social Media: A Benchmark Dataset and Baseline
by: Li, Haiyang, et al.
Published: (2025)

Thought-For-Food: Reasoning Chain Induced Food Visual Question Answering
by: Jain, Riddhi, et al.
Published: (2025)

Tangram: Benchmark for Evaluating Geometric Element Recognition in Large Multimodal Models
by: Zhang, Chao, et al.
Published: (2024)

BenchSeg: A Large-Scale Dataset and Benchmark for Multi-View Food Video Segmentation
by: AlMughrabi, Ahmad, et al.
Published: (2026)

Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms
by: Chauvin, Lucian, et al.
Published: (2025)

An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models
by: Guruprasad, Pranav, et al.
Published: (2025)

UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite
by: Guo, Sicen, et al.
Published: (2023)

BadmintonGRF: A Multimodal Dataset and Benchmark for Markerless Ground Reaction Force Estimation in Badminton
by: Niu, Kuoye, et al.
Published: (2026)

SurgMLLMBench: A Multimodal Large Language Model Benchmark Dataset for Surgical Scene Understanding
by: Choi, Tae-Min, et al.
Published: (2025)

MOSABench: Multi-Object Sentiment Analysis Benchmark for Evaluating Multimodal Large Language Models Understanding of Complex Image
by: Song, Shezheng, et al.
Published: (2024)

MMR-AD: A Large-Scale Multimodal Dataset for Benchmarking General Anomaly Detection with Multimodal Large Language Models
by: Yao, Xincheng, et al.
Published: (2026)

ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
by: Rawte, Vipula, et al.
Published: (2024)

CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs
by: Jian, Ai, et al.
Published: (2025)

MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation
by: Yao, Jihan, et al.
Published: (2025)

From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios
by: Liu, Guoshan, et al.
Published: (2024)

TRIP-Evaluate: An Open Multimodal Benchmark for Evaluating Large Models in Transportation
by: Gong, Han, et al.
Published: (2026)

MM-Food-100K: A 100,000-Sample Multimodal Food Intelligence Dataset with Verifiable Provenance
by: Dong, Yi, et al.
Published: (2025)

Leveraging Automatic Personalised Nutrition: Food Image Recognition Benchmark and Dataset based on Nutrition Taxonomy
by: Romero-Tapiador, Sergio, et al.
Published: (2022)

MM-NeuroOnco: A Multimodal Benchmark and Instruction Dataset for MRI-Based Brain Tumor Diagnosis
by: Guo, Feng, et al.
Published: (2026)

Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration
by: Zhou, Yue, et al.
Published: (2025)

GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding
by: Chen, Dongping, et al.
Published: (2024)

MMS-VPR: Multimodal Street-Level Visual Place Recognition Dataset and Benchmark
by: Ou, Yiwei, et al.
Published: (2025)

Benchmarking Post-Hoc Unknown-Category Detection in Food Recognition
by: Rahman, Lubnaa Abdur, et al.
Published: (2025)

MME-Emotion: A Holistic Evaluation Benchmark for Emotional Intelligence in Multimodal Large Language Models
by: Zhang, Fan, et al.
Published: (2025)

Real-Time Feedback and Benchmark Dataset for Isometric Pose Evaluation
by: Jaiswal, Abhishek, et al.
Published: (2025)

NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification
by: Song, Ziyang, et al.
Published: (2025)

MMEvalPro: Calibrating Multimodal Benchmarks Towards Trustworthy and Efficient Evaluation
by: Huang, Jinsheng, et al.
Published: (2024)

Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
by: Patil, Vaidehi, et al.
Published: (2025)