:: Library Catalog

Bibliographic Details
Main authors:	Heap, Thomas; Aitchison, Laurence; Cahill, Emma; Rodriguez, Adriana Casado
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online access:	https://arxiv.org/abs/2602.18540

Similar items

APML: Adaptive Probabilistic Matching Loss for Robust 3D Point Cloud Reconstruction
by: Sharifipour, Sasan, et al.
Published: (2025)

PushupBench: Your VLM is not good at counting pushups
by: Li, Shengzhi, et al.
Published: (2026)

Video-Bench: Human-Aligned Video Generation Benchmark
by: Han, Hui, et al.
Published: (2025)

VEU-Bench: Towards Comprehensive Understanding of Video Editing
by: Li, Bozheng, et al.
Published: (2025)

PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
by: Ouyang, Kun, et al.
Published: (2024)

Seeing the Big Picture: Evaluating Multimodal LLMs' Ability to Interpret and Grade Handwritten Student Work
by: Henkel, Owen, et al.
Published: (2025)

JourneyBench: A Challenging One-Stop Vision-Language Understanding Benchmark of Generated Images
by: Wang, Zhecan, et al.
Published: (2024)

AssemblyBench: Physics-Aware Assembly of Complex Industrial Objects
by: Li, Danrui, et al.
Published: (2026)

HY3D-Bench: Generation of 3D Assets
by: Hunyuan3D, Team, et al.
Published: (2026)

A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
by: Zhang, Zicheng, et al.
Published: (2024)

TurtleBench: A Visual Programming Benchmark in Turtle Geometry
by: Rismanchian, Sina, et al.
Published: (2024)

μ-Bench: A Vision-Language Benchmark for Microscopy Understanding
by: Lozano, Alejandro, et al.
Published: (2024)

ViLCo-Bench: VIdeo Language COntinual learning Benchmark
by: Tang, Tianqi, et al.
Published: (2024)

LocateBench: Evaluating the Locating Ability of Vision Language Models
by: Chiang, Ting-Rui, et al.
Published: (2024)

VideoGameBench: Can Vision-Language Models complete popular video games?
by: Zhang, Alex L., et al.
Published: (2025)

VLRS-Bench: A Vision-Language Reasoning Benchmark for Remote Sensing
by: Luo, Zhiming, et al.
Published: (2026)

Omni IIE Bench: Benchmarking the Practical Capabilities of Image Editing Models
by: Yang, Yujia, et al.
Published: (2026)

CT-Bench: A Benchmark for Multimodal Lesion Understanding in Computed Tomography
by: Zhu, Qingqing, et al.
Published: (2026)

SeqBench: Benchmarking Sequential Narrative Generation in Text-to-Video Models
by: Tang, Zhengxu, et al.
Published: (2025)

Waste-Bench: A Comprehensive Benchmark for Evaluating VLLMs in Cluttered Environments
by: Ali, Muhammad, et al.
Published: (2025)

EchoBench: Benchmarking Sycophancy in Medical Large Vision-Language Models
by: Yuan, Botai, et al.
Published: (2025)

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding
by: Lin, Junming, et al.
Published: (2024)

SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning
by: Kong, Fanqi, et al.
Published: (2025)

Hydra-Bench: A Benchmark for Multi-Modal Leaf Wetness Sensing
by: Liu, Yimeng, et al.
Published: (2025)

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
by: Chowdhury, Sanjoy, et al.
Published: (2025)

WorldModelBench: Judging Video Generation Models As World Models
by: Li, Dacheng, et al.
Published: (2025)

PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions
by: Ma, Sihan, et al.
Published: (2024)

SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs
by: Zhang, Yuyou, et al.
Published: (2025)

GlazyBench: A Benchmark for Ceramic Glaze Property Prediction and Image Generation
by: Zhai, Ziyu, et al.
Published: (2026)

VisBrowse-Bench: Benchmarking Visual-Native Search for Multimodal Browsing Agents
by: Zhang, Zhengbo, et al.
Published: (2026)

VT-Bench: A Unified Benchmark for Visual-Tabular Multi-Modal Learning
by: Jia, Zi-Yi, et al.
Published: (2026)

LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation
by: Wang, Jun, et al.
Published: (2026)

MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence
by: Chen, Yifan, et al.
Published: (2026)

EgoPro-Bench: Benchmarking Personalized Proactive Interaction in Egocentric Video Streams
by: Ran, Dongchuan, et al.
Published: (2026)

VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models
by: Saxena, Rohit, et al.
Published: (2026)

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models
by: Wang, JiYang, et al.
Published: (2026)

SurgBench: A Unified Large-Scale Benchmark for Surgical Video Analysis
by: Wei, Jianhui, et al.
Published: (2025)

VideoRewardBench: Comprehensive Evaluation of Multimodal Reward Models for Video Understanding
by: Zhang, Zhihong, et al.
Published: (2025)

GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI
by: Simumba, Naomi, et al.
Published: (2025)

RefBench-PRO: Perceptual and Reasoning Oriented Benchmark for Referring Expression Comprehension
by: Gao, Tianyi, et al.
Published: (2025)