| Main Authors: | Ali, Abid, Molla-Aliod, Diego, Naseem, Usman |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.11753 |
Similar Items
Measuring What Matters Beyond Text: Evaluating Multimodal Summaries by Quality, Alignment, and Diversity
by: Ali, Abid, et al.
Published: (2026)
Synthetic Dialogue Dataset Generation using LLM Agents
by: Abdullin, Yelaman, et al.
Published: (2024)
How Can Multimodal Remote Sensing Datasets Transform Classification via SpatialNet-ViT?
by: Kashyap, Gautam Siddharth, et al.
Published: (2025)
Steering Towards Fairness: Mitigating Political Bias in LLMs
by: Nadeem, Afrozah, et al.
Published: (2025)
Beyond Retrieval: Joint Supervision and Multimodal Document Ranking for Textbook Question Answering
by: Alawwad, Hessa, et al.
Published: (2025)
Medical Question Summarization with Entity-driven Contrastive Learning
by: Lu, Wenpeng, et al.
Published: (2023)
LLM Ensemble for RAG: Role of Context Length in Zero-Shot Question Answering for BioASQ Challenge
by: Galat, Dima, et al.
Published: (2025)
Towards Unified Multimodal Financial Forecasting: Integrating Sentiment Embeddings and Market Indicators via Cross-Modal Attention
by: Khanna, Sarthak, et al.
Published: (2025)
Agentic Moderation: Multi-Agent Design for Safer Vision-Language Models
by: Ren, Juan, et al.
Published: (2025)
Multimodal Generative AI with Autoregressive LLMs for Human Motion Understanding and Generation: A Way Forward
by: Islam, Muhammad, et al.
Published: (2025)
Evaluating Multimodal Large Language Models on Educational Textbook Question Answering
by: Alawwad, Hessa A., et al.
Published: (2025)
Fairness Evaluation and Inference Level Mitigation in LLMs
by: Nadeem, Afrozah, et al.
Published: (2025)
Framing Political Bias in Multilingual LLMs Across Pakistani Languages
by: Nadeem, Afrozah, et al.
Published: (2025)
Modality Selection and Skill Segmentation via Cross-Modality Attention
by: Jiang, Jiawei, et al.
Published: (2025)
2D_3D Feature Fusion via Cross-Modal Latent Synthesis and Attention Guided Restoration for Industrial Anomaly Detection
by: Ali, Usman, et al.
Published: (2025)
GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models
by: Liao, Haicheng, et al.
Published: (2023)
GIA-MIC: Multimodal Emotion Recognition with Gated Interactive Attention and Modality-Invariant Learning Constraints
by: He, Jiajun, et al.
Published: (2025)
Evaluating Hierarchical Clinical Document Classification Using Reasoning-Based LLMs
by: Mustafa, Akram, et al.
Published: (2025)
Can Reasoning LLMs Enhance Clinical Document Classification?
by: Mustafa, Akram, et al.
Published: (2025)
Cross-Modal Attention Network with Dual Graph Learning in Multimodal Recommendation
by: Dai, Ji, et al.
Published: (2026)
Enhancing textual textbook question answering with large language models and retrieval augmented generation
by: Alawwad, Hessa Abdulrahman, et al.
Published: (2024)
Enhancing Multimodal Sentiment Analysis for Missing Modality through Self-Distillation and Unified Modality Cross-Attention
by: Weng, Yuzhe, et al.
Published: (2024)
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation
by: Wang, Jiaxi, et al.
Published: (2023)
Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs
by: Nadeem, Afrozah, et al.
Published: (2026)
MRGAgents: A Multi-Agent Framework for Improved Medical Report Generation with Med-LVLMs
by: Wang, Pengyu, et al.
Published: (2025)
MRG-R1: Reinforcement Learning for Clinically Aligned Medical Report Generation
by: Wang, Pengyu, et al.
Published: (2025)
CogniAlign: Word-Level Multimodal Speech Alignment with Gated Cross-Attention for Alzheimer's Detection
by: Ortiz-Perez, David, et al.
Published: (2025)
Flick: Few Labels Text Classification using K-Aware Intermediate Learning in Multi-Task Low-Resource Languages
by: Almutairi, Ali, et al.
Published: (2025)
Do Personality Traits Interfere? Geometric Limitations of Steering in Large Language Models
by: Bhandari, Pranav, et al.
Published: (2026)
Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification
by: Chen, Shijing, et al.
Published: (2025)
Preserving Cross-Modal Stability for Visual Unlearning in Multimodal Scenarios
by: Xu, Jinghan, et al.
Published: (2025)
Multimodal Contrastive Learning via Uni-Modal Coding and Cross-Modal Prediction for Multimodal Sentiment Analysis
by: Lin, Ronghao, et al.
Published: (2022)
Competing LLM Agents in a Non-Cooperative Game of Opinion Polarisation
by: Qasmi, Amin, et al.
Published: (2025)
ReflectDiffu:Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework
by: Yuan, Jiahao, et al.
Published: (2024)
Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation
by: Ayllón, Elena Mulero, et al.
Published: (2025)
Stock Movement Prediction with Multimodal Stable Fusion via Gated Cross-Attention Mechanism
by: Zong, Chang, et al.
Published: (2024)
Alignment-Aware and Reliability-Gated Multimodal Fusion for Unmanned Aerial Vehicle Detection Across Heterogeneous Thermal-Visual Sensors
by: Jahan, Ishrat, et al.
Published: (2026)
COSEE: Consistency-Oriented Signal-Based Early Exiting via Calibrated Sample Weighting Mechanism
by: He, Jianing, et al.
Published: (2024)
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare
by: Shetty, Anudeex, et al.
Published: (2025)
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection
by: Aggarwal, Sajal, et al.
Published: (2024)