| Main Authors: | Gautam, Sushant, Storås, Andrea, Midoglu, Cise, Hicks, Steven A., Thambawita, Vajira, Halvorsen, Pål, Riegler, Michael A. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.01437 |
Similar Items
VideoHEDGE: Entropy-Based Hallucination Detection for Video-VLMs via Semantic Clustering and Spatiotemporal Perturbations
by: Gautam, Sushant, et al.
Published: (2026)
SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding
by: Gautam, Sushant, et al.
Published: (2025)
Medico 2025: Visual Question Answering for Gastrointestinal Imaging
by: Gautam, Sushant, et al.
Published: (2025)
Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy
by: Gautam, Sushant, et al.
Published: (2025)
HEDGE: Hallucination Estimation via Dense Geometric Entropy for VQA with Vision-Language Models
by: Gautam, Sushant, et al.
Published: (2025)
Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study
by: Sepasdar, Zahra, et al.
Published: (2024)
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset
by: Gautam, Sushant, et al.
Published: (2024)
Point, Detect, Count: Multi-Task Medical Image Understanding with Instruction-Tuned Vision-Language Models
by: Gautam, Sushant, et al.
Published: (2025)
PLayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips
by: Solberg, Håkon Maric, et al.
Published: (2024)
Multimodal Integration Challenges in Emotionally Expressive Child Avatars for Training Applications
by: Salehi, Pegah, et al.
Published: (2025)
SoccerRAG: Multimodal Soccer Information Retrieval via Natural Queries
by: Strand, Aleksander Theo, et al.
Published: (2024)
Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG
by: Strand, Aleksander Theo, et al.
Published: (2024)
ECG-IMN: Interpretable Mesomorphic Neural Networks for 12-Lead Electrocardiogram Interpretation
by: Thambawita, Vajira, et al.
Published: (2026)
Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
by: Salehi, Pegah, et al.
Published: (2024)
Advancing sleep detection by modelling weak label sets: A novel weakly supervised learning approach
by: Boeker, Matthias, et al.
Published: (2024)
ExposureEngine: Oriented Logo Detection and Sponsor Visibility Analytics in Sports Broadcasts
by: Sarkhoosh, Mehdi Houshmand, et al.
Published: (2025)
Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models
by: Chaichuk, Mikhail, et al.
Published: (2025)
Calliope: A TTS-based Narrated E-book Creator Ensuring Exact Synchronization, Privacy, and Layout Fidelity
by: Hammer, Hugo L., et al.
Published: (2026)
Merging synthetic and real embryo data for advanced AI predictions
by: Presacan, Oriana, et al.
Published: (2024)
Knowledge-Guided Retrieval-Augmented Generation for Zero-Shot Psychiatric Data: Privacy Preserving Synthetic Data Generation
by: Jakobsen, Adam, et al.
Published: (2026)
Synthetic Cardiac MRI Image Generation using Deep Generative Models
by: Kumarasinghe, Ishan, et al.
Published: (2026)
A Comparative Study of Decoding Strategies in Medical Text Generation
by: Presacan, Oriana, et al.
Published: (2025)
Open Set Recognition for Endoscopic Image Classification: A Deep Learning Approach on the Kvasir Dataset
by: Moazzami, Kasra, et al.
Published: (2025)
Querying GI Endoscopy Images: A VQA Approach
by: Parajuli, Gaurav
Published: (2025)
Looking into Concept Explanation Methods for Diabetic Retinopathy Classification
by: Storås, Andrea M., et al.
Published: (2024)
SoccerGuard: Investigating Injury Risk Factors for Professional Soccer Players with Machine Learning
by: Bartels, Finn, et al.
Published: (2024)
Using Large Language Models to Suggest Informative Prior Distributions in Bayesian Statistics
by: Riegler, Michael A., et al.
Published: (2025)
X-DECODE: EXtreme Deblurring with Curriculum Optimization and Domain Equalization
by: Gautam, Sushant, et al.
Published: (2025)
Balancing Fidelity, Utility, and Privacy in Synthetic Cardiac MRI Generation: A Comparative Study
by: Edirisooriya, Madhura, et al.
Published: (2026)
Medical Imaging AI Competitions Lack Fairness
by: Reinke, Annika, et al.
Published: (2025)
Extracting Player Speed from Football Videos
by: Rustebakke, Ole Kristian, et al.
Published: (2025)
Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption
by: Nik, Alireza, et al.
Published: (2025)
DeepGI: An Automated Approach for Gastrointestinal Tract Segmentation in MRI Scans
by: Zhang, Ye, et al.
Published: (2024)
Smart Video Capsule Endoscopy: Raw Image-Based Localization for Enhanced GI Tract Investigation
by: Bause, Oliver, et al.
Published: (2025)
Time-to-Injury Forecasting in Elite Female Football: A DeepHit Survival Approach
by: Catterall, Victoria, et al.
Published: (2026)
A Simple Data Augmentation Strategy for Text-in-Image Scientific VQA
by: Shoer, Belal, et al.
Published: (2025)
Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges
by: Jha, Debesh, et al.
Published: (2023)
Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
by: Zhang, Yan, et al.
Published: (2024)
PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery
by: He, Runlong, et al.
Published: (2024)
Cross-Stage Coherence in Hierarchical Driving VQA: Explicit Baselines and Learned Gated Context Projectors
by: Jain, Gautam Kumar, et al.
Published: (2026)