| Main Authors: | Ji, Zhengyang, Gao, Shang, Liu, Li, Jia, Yifan, Yue, Yutao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.02476 |
Similar Items
Adaptive H&E-IHC information fusion staining framework based on feature extra
by: Jia, Yifan, et al.
Published: (2025)
DRIVE: Dual-Robustness via Information Variability and Entropic Consistency in Source-Free Unsupervised Domain Adaptation
by: Xiao, Ruiqiang, et al.
Published: (2024)
Dual Consistent Constraint via Disentangled Consistency and Complementarity for Multi-view Clustering
by: Li, Bo, et al.
Published: (2025)
Improving VQA Reliability: A Dual-Assessment Approach with Self-Reflection and Cross-Model Verification
by: Wu, Xixian, et al.
Published: (2025)
CHRIS: Clothed Human Reconstruction with Side View Consistency
by: Liu, Dong, et al.
Published: (2025)
Harnessing Group-Oriented Consistency Constraints for Semi-Supervised Semantic Segmentation in CdZnTe Semiconductors
by: Li, Peihao, et al.
Published: (2025)
Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation
by: Ji, Zhengyang, et al.
Published: (2025)
CoEmoGen: Towards Semantically-Coherent and Scalable Emotional Image Content Generation
by: Yuan, Kaishen, et al.
Published: (2025)
ARIAL: An Agentic Framework for Document VQA with Precise Answer Localization
by: Mohammadshirazi, Ahmad, et al.
Published: (2025)
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA
by: Chen, Jiawei, et al.
Published: (2024)
VQA$^2$: Visual Question Answering for Video Quality Assessment
by: Jia, Ziheng, et al.
Published: (2024)
Dual Causal Inference: Integrating Backdoor Adjustment and Instrumental Variable Learning for Medical VQA
by: Xu, Zibo, et al.
Published: (2026)
When to Trust the Answer: Question-Aligned Semantic Nearest Neighbor Entropy for Safer Surgical VQA
by: Carlini, Luca, et al.
Published: (2025)
TSVC: Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval
by: Lyu, Shuai, et al.
Published: (2025)
Biomed-DPT: Dual Modality Prompt Tuning for Biomedical Vision-Language Models
by: Peng, Wei, et al.
Published: (2025)
An Evaluation of GPT-4V and Gemini in Online VQA
by: Liu, Mengchen, et al.
Published: (2023)
Semantic and Visual Evidence for Efficient Long-Video Reasoning: A Solution for the HD-EPIC VQA Challenge
by: Xu, Yinsong, et al.
Published: (2026)
LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models
by: Ge, Qihang, et al.
Published: (2024)
Diffusion-Guided Semantic Consistency for Multimodal Heterogeneity
by: Liu, Jing, et al.
Published: (2026)
BioVFM-21M: Benchmarking and Scaling Self-Supervised Vision Foundation Models for Biomedical Image Analysis
by: Liu, Jiarun, et al.
Published: (2025)
VQA-Levels: A Hierarchical Approach for Classifying Questions in VQA
by: Madaka, Madhuri Latha, et al.
Published: (2025)
Look-Closer-Then-Diagnose: Confidence-Aware Ultrasound VQA via Active Zooming
by: Zhou, Yue, et al.
Published: (2026)
Spectral Discrepancy and Cross-modal Semantic Consistency Learning for Object Detection in Hyperspectral Image
by: He, Xiao, et al.
Published: (2025)
MaS-VQA: A Mask-and-Select Framework for Knowledge-Based Visual Question Answering
by: Mao, Xianwei, et al.
Published: (2026)
PathNavigate: A Training-Free Pathology Agent with Surprise-Guided Scan and Shared Slide Memory for Whole-Slide Image VQA
by: Yang, Chunze, et al.
Published: (2026)
Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization
by: Jiang, Yuanyuan, et al.
Published: (2022)
Improving the Classification Effect of Clinical Images of Diseases for Multi-Source Privacy Protection
by: Bowen, Tian, et al.
Published: (2024)
SCRA-VQA: Summarized Caption-Rerank for Augmented Large Language Models in Visual Question Answering
by: Zhang, Yan, et al.
Published: (2025)
UniBEVFusion: Unified Radar-Vision BEVFusion for 3D Object Detection
by: Zhao, Haocheng, et al.
Published: (2024)
Fence Theorem: Towards Dual-Objective Semantic-Structure Isolation in Preprocessing Phase for 3D Anomaly Detection
by: Liang, Hanzhe, et al.
Published: (2025)
Cycle Inverse-Consistent TransMorph: A Balanced Deep Learning Framework for Brain MRI Registration
by: Shang, Jiaqi, et al.
Published: (2026)
Is ChatGPT-5 Ready for Mammogram VQA?
by: Li, Qiang, et al.
Published: (2025)
Advancing Surgical VQA with Scene Graph Knowledge
by: Yuan, Kun, et al.
Published: (2023)
Towards Clinically Interpretable Ophthalmic VQA via Spatially-Grounded Lesion Evidence
by: Wang, Xingyue, et al.
Published: (2026)
EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion
by: Liu, Shang, et al.
Published: (2025)
PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting
by: Song, Yixiao, et al.
Published: (2026)
Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane
by: Yan, Han, et al.
Published: (2024)
R-LLaVA: Improving Med-VQA Understanding through Visual Region of Interest
by: Chen, Xupeng, et al.
Published: (2024)
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
by: Dai, Ming, et al.
Published: (2025)
KNVQA: A Benchmark for evaluation knowledge-based VQA
by: Cheng, Sirui, et al.
Published: (2023)