:: Library Catalog

Bibliographic Details
Main Authors:	Ghosh, Akash; Ashraf, Tajamul; Singh, Rishu Kumar; Saeed, Numan; Saha, Sriparna; Chen, Xiuying; Khan, Salman
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.24157

Similar Items

When Background Matters: Breaking Medical Vision Language Models by Transferable Attack
by: Ghosh, Akash, et al.
Published: (2026)

Talk, Snap, Complain: Validation-Aware Multimodal Expert Framework for Fine-Grained Customer Grievances
by: Singh, Rishu Kumar, et al.
Published: (2025)

CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning
by: Onyame, Eric, et al.
Published: (2026)

SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models' Knowledge of Indian Culture
by: Maji, Arijit, et al.
Published: (2025)

HF-Fed: Hierarchical based customized Federated Learning Framework for X-Ray Imaging
by: Ashraf, Tajamul, et al.
Published: (2024)

CLINIC: Evaluating Multilingual Trustworthiness in Language Models for Healthcare
by: Ghosh, Akash, et al.
Published: (2025)

A Survey of Multilingual Reasoning in Language Models
by: Ghosh, Akash, et al.
Published: (2025)

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning
by: Ashraf, Tajamul, et al.
Published: (2025)

SUMMIR: A Hallucination-Aware Framework for Ranking Sports Insights from LLMs
by: Kumar, Nitish, et al.
Published: (2026)

TITAN: Query-Token based Domain Adaptive Adversarial Learning
by: Ashraf, Tajamul, et al.
Published: (2025)

FATE: Focal-modulated Attention Encoder for Multivariate Time-series Forecasting
by: Ashraf, Tajamul, et al.
Published: (2024)

M3Retrieve: Benchmarking Multimodal Retrieval for Medicine
by: Acharya, Arkadeep, et al.
Published: (2025)

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions
by: Ghosh, Akash, et al.
Published: (2024)

Infogen: Generating Complex Statistical Infographics from Documents
by: Ghosh, Akash, et al.
Published: (2025)

A Survey on Medical Document Summarization: From Machine Learning Techniques to Large Language Models
by: Ghosh, Akash, et al.
Published: (2025)

BhashaSutra: A Task-Centric Unified Survey of Indian NLP Datasets, Corpora, and Resources
by: Kumar, Raghvendra, et al.
Published: (2026)

Generalizable Federated Learning using Client Adaptive Focal Modulation
by: Ashraf, Tajamul, et al.
Published: (2025)

A Comprehensive Survey of Hallucination in Large Language, Image, Video and Audio Foundation Models
by: Sahoo, Pranab, et al.
Published: (2024)

Towards Knowledge-Infused Automated Disease Diagnosis Assistant
by: Tomar, Mohit, et al.
Published: (2024)

Relic: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples
by: Ghosal, Soumya Suvra, et al.
Published: (2025)

Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
by: Ashraf, Tajamul, et al.
Published: (2025)

VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning
by: Surana, Harshul Raj, et al.
Published: (2026)

Some factorization results for formal power series
by: Garg, Rishu, et al.
Published: (2025)

Some factorization results on polynomials having integer coefficients
by: Singh, Jitender, et al.
Published: (2023)

A generalized Dumas irreducibility criterion
by: Garg, Rishu, et al.
Published: (2025)

On irreducible factors of polynomials over integers
by: Garg, Rishu, et al.
Published: (2025)

An EcoSage Assistant: Towards Building A Multimodal Plant Care Dialogue Assistant
by: Tomar, Mohit, et al.
Published: (2024)

MIRA: A Novel Framework for Fusing Modalities in Medical RAG
by: Wang, Jinhong, et al.
Published: (2025)

DRISHTIKON: A Multimodal Multilingual Benchmark for Testing Language Models' Understanding on Indian Culture
by: Maji, Arijit, et al.
Published: (2025)

From Fragments to Facts: A Curriculum-Driven DPO Approach for Generating Hindi News Veracity Explanations
by: Bansal, Pulkit, et al.
Published: (2025)

On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models
by: Malik, Hashmat Shadab, et al.
Published: (2024)

LLM Post-Training: A Deep Dive into Reasoning Large Language Models
by: Kumar, Komal, et al.
Published: (2025)

BMW Agents -- A Framework For Task Automation Through Multi-Agent Collaboration
by: Crawford, Noel, et al.
Published: (2024)

Demystifying ChatGPT: How It Masters Genre Recognition
by: Raj, Subham, et al.
Published: (2025)

GTA: Generating Long-Horizon Tasks for Web Agents at Scale
by: Huang, Tenghao, et al.
Published: (2026)

OS-Marathon: Benchmarking Computer-Use Agents on Long-Horizon Repetitive Tasks
by: Wu, Jing, et al.
Published: (2026)

Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development
by: Sahoo, Pranab, et al.
Published: (2024)

Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty
by: Singh, Joykirat, et al.
Published: (2026)

Yes, this is what I was looking for! Towards Multi-modal Medical Consultation Concern Summary Generation
by: Tiwari, Abhisek, et al.
Published: (2024)

Searching for Dark Matter with MeVCube
by: Saha, Akash Kumar
Published: (2025)