| Main Authors: | Ballout, Mohamad, Jassim, Serwan, Bruni, Elia |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.16572 |
Similar Items
iVISPAR -- An Interactive Visual-Spatial Reasoning Benchmark for VLMs
by: Mayer, Julius, et al.
Published: (2025)
GRASP: A novel benchmark for evaluating language GRounding And Situated Physics understanding in multimodal language models
by: Jassim, Serwan, et al.
Published: (2023)
Transformer Tafsir at QIAS 2025 Shared Task: Hybrid Retrieval-Augmented Generation for Islamic Knowledge Question Answering
by: Ahmad, Muhammad Abu, et al.
Published: (2025)
From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
by: Ohmer, Xenia, et al.
Published: (2024)
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
by: Ballout, Mohamad, et al.
Published: (2024)
Can you SPLICE it together? A Human Curated Benchmark for Probing Visual Reasoning in VLMs
by: Ballout, Mohamad, et al.
Published: (2025)
Show Me How It's Done: The Role of Explanations in Fine-Tuning Language Models
by: Ballout, Mohamad, et al.
Published: (2024)
Interpretability of Language Models via Task Spaces
by: Weber, Lucas, et al.
Published: (2024)
Enhancing SLM via ChatGPT and Dataset Augmentation
by: Pieper, Tom, et al.
Published: (2024)
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
by: Tatariya, Kushal, et al.
Published: (2024)
Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics
by: Ryan, Yuriel, et al.
Published: (2025)
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs
by: Wang, Haochen, et al.
Published: (2025)
Probing Language Models' Gesture Understanding for Enhanced Human-AI Interaction
by: Wicke, Philipp
Published: (2024)
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs
by: Sun, Kaiser, et al.
Published: (2026)
Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?
by: Li, Xiujun, et al.
Published: (2023)
Multilingual Pretraining for Pixel Language Models
by: Kesen, Ilker, et al.
Published: (2025)
Do Multimodal Large Language Models Understand Welding?
by: Khvatskii, Grigorii, et al.
Published: (2025)
PIXAR: Auto-Regressive Language Modeling in Pixel Space
by: Tai, Yintao, et al.
Published: (2024)
From Reasoning to Pixels: Benchmarking the Alignment Gap in Unified Multimodal Models
by: Yang, Cheng, et al.
Published: (2026)
Probing Multimodal Large Language Models for Global and Local Semantic Representations
by: Tao, Mingxu, et al.
Published: (2024)
Evaluating Multimodal Large Language Models on Spoken Sarcasm Understanding
by: Li, Zhu, et al.
Published: (2025)
Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation
by: Huang, Jen-tse, et al.
Published: (2026)
MIND Your Reasoning: A Meta-Cognitive Intuitive-Reflective Network for Dual-Reasoning in Multimodal Stance Detection
by: Wang, Bingbing, et al.
Published: (2025)
Can Large Vision-Language Models Understand Multimodal Sarcasm?
by: Wang, Xinyu, et al.
Published: (2025)
Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models
by: Hwang, Hyeontaek, et al.
Published: (2026)
MIXAR: Scaling Autoregressive Pixel-based Language Models to Multiple Languages and Scripts
by: Hu, Chen, et al.
Published: (2026)
SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models
by: Hyun, Lee, et al.
Published: (2023)
EmoVerse: Exploring Multimodal Large Language Models for Sentiment and Emotion Understanding
by: Li, Ao, et al.
Published: (2024)
CUE-M: Contextual Understanding and Enhanced Search with Multimodal Large Language Model
by: Go, Dongyoung, et al.
Published: (2024)
Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains
by: Paniv, Yurii, et al.
Published: (2024)
MINED: Probing and Updating with Multimodal Time-Sensitive Knowledge for Large Multimodal Models
by: Jiang, Kailin, et al.
Published: (2025)
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts
by: Ying, Jiahao, et al.
Published: (2023)
Understanding Moral Reasoning Trajectories in Large Language Models: Toward Probing-Based Explainability
by: Huang, Fan, et al.
Published: (2026)
Multimodal Fact-Checking with Vision Language Models: A Probing Classifier based Solution with Embedding Strategies
by: Cekinel, Recep Firat, et al.
Published: (2024)
On Pre-training of Multimodal Language Models Customized for Chart Understanding
by: Fan, Wan-Cyuan, et al.
Published: (2024)
AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness
by: Chen, Zixin, et al.
Published: (2025)
From Text to Pixel: Advancing Long-Context Understanding in MLLMs
by: Lu, Yujie, et al.
Published: (2024)
Evaluating Pixel Language Models on Non-Standardized Languages
by: Muñoz-Ortiz, Alberto, et al.
Published: (2024)
Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding
by: Li, Yun, et al.
Published: (2025)
Tolerance Principle and Small Language Model Learning
by: Friedman, Adam E., et al.
Published: (2026)