:: Library Catalog

Bibliographic Details
Main Authors:	Gautam, Sushant; Storås, Andrea; Midoglu, Cise; Hicks, Steven A.; Thambawita, Vajira; Halvorsen, Pål; Riegler, Michael A.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.01437

Similar Items

VideoHEDGE: Entropy-Based Hallucination Detection for Video-VLMs via Semantic Clustering and Spatiotemporal Perturbations
by: Gautam, Sushant, et al.
Published: (2026)

SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding
by: Gautam, Sushant, et al.
Published: (2025)

Medico 2025: Visual Question Answering for Gastrointestinal Imaging
by: Gautam, Sushant, et al.
Published: (2025)

Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy
by: Gautam, Sushant, et al.
Published: (2025)

HEDGE: Hallucination Estimation via Dense Geometric Entropy for VQA with Vision-Language Models
by: Gautam, Sushant, et al.
Published: (2025)

Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study
by: Sepasdar, Zahra, et al.
Published: (2024)

SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset
by: Gautam, Sushant, et al.
Published: (2024)

Point, Detect, Count: Multi-Task Medical Image Understanding with Instruction-Tuned Vision-Language Models
by: Gautam, Sushant, et al.
Published: (2025)

PLayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips
by: Solberg, Håkon Maric, et al.
Published: (2024)

Multimodal Integration Challenges in Emotionally Expressive Child Avatars for Training Applications
by: Salehi, Pegah, et al.
Published: (2025)

SoccerRAG: Multimodal Soccer Information Retrieval via Natural Queries
by: Strand, Aleksander Theo, et al.
Published: (2024)

Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG
by: Strand, Aleksander Theo, et al.
Published: (2024)

ECG-IMN: Interpretable Mesomorphic Neural Networks for 12-Lead Electrocardiogram Interpretation
by: Thambawita, Vajira, et al.
Published: (2026)

Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
by: Salehi, Pegah, et al.
Published: (2024)

Advancing sleep detection by modelling weak label sets: A novel weakly supervised learning approach
by: Boeker, Matthias, et al.
Published: (2024)

ExposureEngine: Oriented Logo Detection and Sponsor Visibility Analytics in Sports Broadcasts
by: Sarkhoosh, Mehdi Houshmand, et al.
Published: (2025)

Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models
by: Chaichuk, Mikhail, et al.
Published: (2025)

Calliope: A TTS-based Narrated E-book Creator Ensuring Exact Synchronization, Privacy, and Layout Fidelity
by: Hammer, Hugo L., et al.
Published: (2026)

Merging synthetic and real embryo data for advanced AI predictions
by: Presacan, Oriana, et al.
Published: (2024)

Knowledge-Guided Retrieval-Augmented Generation for Zero-Shot Psychiatric Data: Privacy Preserving Synthetic Data Generation
by: Jakobsen, Adam, et al.
Published: (2026)

Synthetic Cardiac MRI Image Generation using Deep Generative Models
by: Kumarasinghe, Ishan, et al.
Published: (2026)

A Comparative Study of Decoding Strategies in Medical Text Generation
by: Presacan, Oriana, et al.
Published: (2025)

Open Set Recognition for Endoscopic Image Classification: A Deep Learning Approach on the Kvasir Dataset
by: Moazzami, Kasra, et al.
Published: (2025)

Querying GI Endoscopy Images: A VQA Approach
by: Parajuli, Gaurav
Published: (2025)

Looking into Concept Explanation Methods for Diabetic Retinopathy Classification
by: Storås, Andrea M., et al.
Published: (2024)

SoccerGuard: Investigating Injury Risk Factors for Professional Soccer Players with Machine Learning
by: Bartels, Finn, et al.
Published: (2024)

Using Large Language Models to Suggest Informative Prior Distributions in Bayesian Statistics
by: Riegler, Michael A., et al.
Published: (2025)

X-DECODE: EXtreme Deblurring with Curriculum Optimization and Domain Equalization
by: Gautam, Sushant, et al.
Published: (2025)

Balancing Fidelity, Utility, and Privacy in Synthetic Cardiac MRI Generation: A Comparative Study
by: Edirisooriya, Madhura, et al.
Published: (2026)

Medical Imaging AI Competitions Lack Fairness
by: Reinke, Annika, et al.
Published: (2025)

Extracting Player Speed from Football Videos
by: Rustebakke, Ole Kristian, et al.
Published: (2025)

Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption
by: Nik, Alireza, et al.
Published: (2025)

DeepGI: An Automated Approach for Gastrointestinal Tract Segmentation in MRI Scans
by: Zhang, Ye, et al.
Published: (2024)

Smart Video Capsule Endoscopy: Raw Image-Based Localization for Enhanced GI Tract Investigation
by: Bause, Oliver, et al.
Published: (2025)

Time-to-Injury Forecasting in Elite Female Football: A DeepHit Survival Approach
by: Catterall, Victoria, et al.
Published: (2026)

A Simple Data Augmentation Strategy for Text-in-Image Scientific VQA
by: Shoer, Belal, et al.
Published: (2025)

Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges
by: Jha, Debesh, et al.
Published: (2023)

Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues
by: Zhang, Yan, et al.
Published: (2024)

PitVQA: Image-grounded Text Embedding LLM for Visual Question Answering in Pituitary Surgery
by: He, Runlong, et al.
Published: (2024)

Cross-Stage Coherence in Hierarchical Driving VQA: Explicit Baselines and Learned Gated Context Projectors
by: Jain, Gautam Kumar, et al.
Published: (2026)