| Main authors: | Rajput, Krishna Singh, Anvekar, Tejas, Baral, Chitta, Gupta, Vivek |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2505.20816 |
Similar items
ViTaB-A: Evaluating Multimodal Large Language Models on Visual Table Attribution
by: Alqurnawi, Yahia, et al.
Published: (2026)
The Perceptual Observatory Characterizing Robustness and Grounding in MLLMs
by: Anvekar, Tejas, et al.
Published: (2025)
GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning
by: Siingh, Shikhhar, et al.
Published: (2025)
TabReX : Tabular Referenceless eXplainable Evaluation
by: Anvekar, Tejas, et al.
Published: (2025)
TraceBack: Multi-Agent Decomposition for Fine-Grained Table Attribution
by: Anvekar, Tejas, et al.
Published: (2026)
TabXEval: Why this is a Bad Table? An eXhaustive Rubric for Table Evaluation
by: Pancholi, Vihang, et al.
Published: (2025)
Is Architectural Complexity Overrated? Competitive and Interpretable Knowledge Graph Completion with RelatE
by: Chakraborty, Abhijit, et al.
Published: (2025)
Map&Make: Schema Guided Text to Table Generation
by: Ahuja, Naman, et al.
Published: (2025)
Integrity Shield A System for Ethical AI Use & Authorship Transparency in Assessments
by: Shekhar, Ashish Raj, et al.
Published: (2026)
DoPE: Decoy Oriented Perturbation Encapsulation Human-Readable, AI-Hostile Documents for Academic Integrity
by: Shekhar, Ashish Raj, et al.
Published: (2026)
ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
by: Patel, Maitreya, et al.
Published: (2023)
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts
by: Singh, Shubhankar, et al.
Published: (2024)
Answering Questions in Stages: Prompt Chaining for Contract QA
by: Roegiest, Adam, et al.
Published: (2024)
UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization
by: Uddin, Md Nayem, et al.
Published: (2024)
MMTABREAL: Real-World Benchmark for Multimodal Table Understanding
by: Titiya, Prasham, et al.
Published: (2025)
MapIQ: Evaluating Multimodal Large Language Models for Map Question Answering
by: Srivastava, Varun, et al.
Published: (2025)
Enhancing Question Answering on Charts Through Effective Pre-training Tasks
by: Gupta, Ashim, et al.
Published: (2024)
TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
by: Patel, Maitreya, et al.
Published: (2024)
Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
by: Srivastava, Pragya, et al.
Published: (2024)
$λ$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
by: Patel, Maitreya, et al.
Published: (2024)
MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting
by: Anand, Avinash, et al.
Published: (2024)
Polymath: A Challenging Multi-modal Mathematical Reasoning Benchmark
by: Gupta, Himanshu, et al.
Published: (2024)
From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents
by: Uddin, Md Nayem, et al.
Published: (2026)
The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness
by: Varshney, Neeraj, et al.
Published: (2023)
Efficient Multimodal Planning Agent for Visual Question-Answering
by: Chen, Zhuo, et al.
Published: (2026)
Cutting Through the Noise: Boosting LLM Performance on Math Word Problems
by: Anantheswaran, Ujjwala, et al.
Published: (2024)
VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
by: Yilmaz, Nilay, et al.
Published: (2025)
SCOPE: Planning for Hybrid Querying over Clinical Trial Data
by: Chowdhury, Suparno Roy, et al.
Published: (2026)
FD-NL2SQL: Feedback-Driven Clinical NL2SQL that Improves with Use
by: Chowdhury, Suparno Roy, et al.
Published: (2026)
Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents
by: Kumbhar, Shrinidhi, et al.
Published: (2025)
Investigating VLM Hallucination from a Cognitive Psychology Perspective: A First Step Toward Interpretation with Intriguing Observations
by: Liu, Xiangrui, et al.
Published: (2025)
FIND: Toward Multimodal Financial Reasoning and Question Answering for Indic Languages
by: Das, Sarmistha, et al.
Published: (2026)
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
by: Saeidi, Amir, et al.
Published: (2024)
DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards
by: Kartha, Aaryaman, et al.
Published: (2025)
Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations
by: Anvekar, Tejas, et al.
Published: (2024)
Leveraging Synthetic Data for Question Answering with Multilingual LLMs in the Agricultural Domain
by: Kaur, Rishemjit, et al.
Published: (2025)
Instant Answering in E-Commerce Buyer-Seller Messaging using Message-to-Question Reformulation
by: Fetahu, Besnik, et al.
Published: (2024)
Triple Preference Optimization: Achieving Better Alignment using a Single Step Optimization
by: Saeidi, Amir, et al.
Published: (2024)
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
by: Chatterjee, Agneet, et al.
Published: (2024)
VLMT: Vision-Language Multimodal Transformer for Multimodal Multi-hop Question Answering
by: Lim, Qi Zhi, et al.
Published: (2025)