| Main Authors: | Chen, Yuhan, Su, Lumei, Chen, Lihua, Lin, Zhiwei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.15842 |
Similar Items
Location-Aware Pretraining for Medical Difference Visual Question Answering
by: Musinguzi, Denis, et al.
Published: (2026)
Free Form Medical Visual Question Answering in Radiology
by: Narayanan, Abhishek, et al.
Published: (2024)
CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes
by: Parmar, Paritosh, et al.
Published: (2024)
Object Attribute Matters in Visual Question Answering
by: Li, Peize, et al.
Published: (2023)
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering
by: Chen, Pingyi, et al.
Published: (2024)
VQA$^2$: Visual Question Answering for Video Quality Assessment
by: Jia, Ziheng, et al.
Published: (2024)
Can I Trust Your Answer? Visually Grounded Video Question Answering
by: Xiao, Junbin, et al.
Published: (2023)
WikiVQABench: A Knowledge-Grounded Visual Question Answering Benchmark from Wikipedia and Wikidata
by: Shbita, Basel, et al.
Published: (2026)
A Knowledge Noise Mitigation Framework for Knowledge-based Visual Question Answering
by: Liu, Zhiyue, et al.
Published: (2025)
VoQA: Visual-only Question Answering
by: An, Jianing, et al.
Published: (2025)
Knowledge Detection by Relevant Question and Image Attributes in Visual Question Answering
by: Ahir, Param, et al.
Published: (2023)
LingoQA: Visual Question Answering for Autonomous Driving
by: Marcu, Ana-Maria, et al.
Published: (2023)
Ego-Grounding for Personalized Question-Answering in Egocentric Videos
by: Xiao, Junbin, et al.
Published: (2026)
SciEGQA: A Dataset for Scientific Evidence-Grounded Question Answering and Reasoning
by: Yu, Wenhan, et al.
Published: (2025)
MaS-VQA: A Mask-and-Select Framework for Knowledge-Based Visual Question Answering
by: Mao, Xianwei, et al.
Published: (2026)
CLARIFY: A Specialist-Generalist Framework for Accurate and Lightweight Dermatological Visual Question Answering
by: Saha, Aranya, et al.
Published: (2025)
Cause-Effect Driven Optimization for Robust Medical Visual Question Answering with Language Biases
by: Zhu, Huanjia, et al.
Published: (2025)
Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering
by: Fu, Xingyu, et al.
Published: (2023)
Saliency Guided Longitudinal Medical Visual Question Answering
by: Wu, Jialin, et al.
Published: (2025)
Multi-Sourced Compositional Generalization in Visual Question Answering
by: Li, Chuanhao, et al.
Published: (2025)
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts
by: Özdemir, Övgü, et al.
Published: (2024)
TM-PATHVQA:90000+ Textless Multilingual Questions for Medical Visual Question Answering
by: Rajkhowa, Tonmoy, et al.
Published: (2024)
Navigating the Mirage: A Dual-Path Agentic Framework for Robust Misleading Chart Question Answering
by: Zhang, Yanjie, et al.
Published: (2026)
Are Large Vision Language Models Truly Grounded in Medical Images? Evidence from Italian Clinical Visual Question Answering
by: Felizzi, Federico, et al.
Published: (2025)
Parameter-Efficient VLMs for Gastrointestinal Endoscopy: Medical Image Generation and Clinical Visual Question Answering
by: Peter, Ojonugwa Oluwafemi Ejiga, et al.
Published: (2026)
Spatially Grounded Explanations in Vision Language Models for Document Visual Question Answering
by: Lagos, Maximiliano Hormazábal, et al.
Published: (2025)
Hallucination Benchmark in Medical Visual Question Answering
by: Wu, Jinge, et al.
Published: (2024)
FaithSCAN: Model-Driven Single-Pass Hallucination Detection for Faithful Visual Question Answering
by: Tong, Chaodong, et al.
Published: (2026)
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA
by: Chen, Jiawei, et al.
Published: (2024)
Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering
by: Marouf, Imad Eddine, et al.
Published: (2025)
MUPA: Towards Multi-Path Agentic Reasoning for Grounded Video Question Answering
by: Dang, Jisheng, et al.
Published: (2025)
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning
by: Liu, Huabin, et al.
Published: (2025)
MedLVR: Latent Visual Reasoning for Reliable Medical Visual Question Answering
by: Xi, Suyang, et al.
Published: (2026)
Variational Visual Question Answering for Uncertainty-Aware Selective Prediction
by: Wieczorek, Tobias Jan, et al.
Published: (2025)
Fine-Grained Knowledge Structuring and Retrieval for Visual Question Answering
by: Zhang, Zhengxuan, et al.
Published: (2025)
Wasserstein Equilibrium Decoding for Reliable Medical Visual Question Answering
by: Hagen, Luca, et al.
Published: (2026)
Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat
by: Xu, Pusheng, et al.
Published: (2025)
Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning
by: Zeng, Xingchen, et al.
Published: (2024)
Warehouse Spatial Question Answering with LLM Agent
by: Huang, Hsiang-Wei, et al.
Published: (2025)
Visual Question Answering in Ophthalmology: A Progressive and Practical Perspective
by: Chen, Xiaolan, et al.
Published: (2024)