:: Library Catalog

Bibliographic Details
Main Authors:	Gwon, Hansle; Ahn, Imjin; Jung, Hyoje; Kim, Byeolhee; Kim, Young-Hak; Jun, Tae Joon
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2402.11883

Similar Items

NOTE: Notable generation Of patient Text summaries through Efficient approach based on direct preference optimization
by: Ahn, Imjin, et al.
Published: (2024)

Multi-Response Preference Optimization with Augmented Ranking Dataset
by: Gwon, Hansle, et al.
Published: (2024)

Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation
by: Kim, Minkyoung, et al.
Published: (2024)

Enhancing Clinical Efficiency through LLM: Discharge Note Generation for Cardiac Patients
by: Jung, HyoJe, et al.
Published: (2024)

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models
by: Sung-Bin, Kim, et al.
Published: (2024)

SurgX: Neuron-Concept Association for Explainable Surgical Phase Recognition
by: Kim, Ka Young, et al.
Published: (2025)

Ruling Out to Rule In: Contrastive Hypothesis Retrieval for Medical Question Answering
by: Kim, Byeolhee, et al.
Published: (2026)

Is 'Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning
by: Jung, Ji Hyeok, et al.
Published: (2024)

Normal and Abnormal Pathology Knowledge-Augmented Vision-Language Model for Anomaly Detection in Pathology Images
by: Song, Jinsol, et al.
Published: (2025)

SurgCheck: Do Vision-Language Models Really Look at Images in Surgical VQA?
by: Shin, Jongmin, et al.
Published: (2026)

WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts
by: Ahn, Yong Hyun, et al.
Published: (2024)

Mask-Free Neuron Concept Annotation for Interpreting Neural Networks in Medical Domain
by: Kim, Hyeon Bae, et al.
Published: (2024)

MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
by: Oh, Youngmin, et al.
Published: (2024)

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
by: Jung, Chaeyoung, et al.
Published: (2025)

Clinical-grade Multi-Organ Pathology Report Generation for Multi-scale Whole Slide Images via a Semantically Guided Medical Text Foundation Model
by: Tan, Jing Wei, et al.
Published: (2024)

Erase Persona, Forget Lore: Benchmarking Multimodal Copyright Unlearning in Large Vision Language Models
by: Kwon, JuneHyoung, et al.
Published: (2026)

Towards Holistic Surgical Scene Graph
by: Shin, Jongmin, et al.
Published: (2025)

Exploring Temporally-Aware Features for Point Tracking
by: Kim, Inès Hyeonsu, et al.
Published: (2025)

GuidNoise: Single-Pair Guided Diffusion for Generalized Noise Synthesis
by: Kim, Changjin, et al.
Published: (2025)

LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies
by: Hamza, Ameer, et al.
Published: (2024)

Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models
by: Jung, Chaeyoung, et al.
Published: (2025)

Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation
by: Cho, Yooshin, et al.
Published: (2025)

Resource-Efficient Medical Report Generation using Large Language Models
by: Abdullah, et al.
Published: (2024)

Learning Phonetic Context-Dependent Viseme for Enhancing Speech-Driven 3D Facial Animation
by: Kim, Hyung Kyu, et al.
Published: (2025)

PCEvE: Part Contribution Evaluation Based Model Explanation for Human Figure Drawing Assessment and Beyond
by: Lee, Jongseo, et al.
Published: (2024)

Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder
by: Kim, Jinseok, et al.
Published: (2024)

Toward Robust Canine Cardiac Diagnosis: Deep Prototype Alignment Network-Based Few-Shot Segmentation in Veterinary Medicine
by: Oh, Jun-Young, et al.
Published: (2024)

Visual Representation Alignment for Multimodal Large Language Models
by: Yoon, Heeji, et al.
Published: (2025)

Intriguing Properties of Large Language and Vision Models
by: Lee, Young-Jun, et al.
Published: (2024)

ArcVQ-VAE: A Spherical Vector Quantization Framework with ArcCosine Additive Margin
by: Kim, Jaeyung, et al.
Published: (2026)

Balancing Efficiency and Quality: MoEISR for Arbitrary-Scale Image Super-Resolution
by: Oh, Young Jae, et al.
Published: (2023)

Do We Need Perfect Data? Leveraging Noise for Domain Generalized Segmentation
by: Kim, Taeyeong, et al.
Published: (2025)

PDF-GS: Progressive Distractor Filtering for Robust 3D Gaussian Splatting
by: Seo, Kangmin, et al.
Published: (2026)

KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
by: Kim, Yoonshik, et al.
Published: (2025)

DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models
by: Min, Jaewon, et al.
Published: (2026)

IRASNet: Improved Feature-Level Clutter Reduction for Domain Generalized SAR-ATR
by: Jang, Oh-Tae, et al.
Published: (2024)

MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection
by: Kim, Taeheon, et al.
Published: (2024)

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models
by: Kim, Hyungjin, et al.
Published: (2025)

Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering
by: Lim, Su Hyeon, et al.
Published: (2024)

Seurat: From Moving Points to Depth
by: Cho, Seokju, et al.
Published: (2025)