:: Library Catalog

Bibliographic Details
Main Authors:	Bayramli, Zahra; Suleymanzade, Ayhan; An, Na Min; Ahmad, Huzama; Kim, Eunsu; Park, Junyeong; Thorne, James; Oh, Alice
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.08914

Similar Items

When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts
by: Kim, Jun Seong, et al.
Published: (2025)

World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models
by: Kim, Eunsu, et al.
Published: (2025)

Survey of Cultural Awareness in Language Models: Text and Beyond
by: Pawar, Siddhesh, et al.
Published: (2024)

I0T: Embedding Standardization Method Towards Zero Modality Gap
by: An, Na Min, et al.
Published: (2024)

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
by: Kim, Eunsu, et al.
Published: (2024)

Are Large Vision-Language Models Ready to Guide Blind and Low-Vision Individuals?
by: Kim, Eunki, et al.
Published: (2025)

Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
by: Baek, Eunsu, et al.
Published: (2024)

How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions
by: An, Na Min, et al.
Published: (2025)

QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering
by: Jung, Woojun, et al.
Published: (2026)

Adaptive Camera Sensor for Vision Models
by: Baek, Eunsu, et al.
Published: (2025)

Task Indicating Transformer for Task-conditional Dense Predictions
by: Lu, Yuxiang, et al.
Published: (2024)

CharDiff-LP: A Diffusion Model with Character-Level Guidance for License Plate Image Restoration
by: Na, Kihyun, et al.
Published: (2025)

Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
by: Kim, Eunsu, et al.
Published: (2025)

Improving Cone-Beam CT Image Quality with Knowledge Distillation-Enhanced Diffusion Model in Imbalanced Data Settings
by: Hwang, Joonil, et al.
Published: (2024)

Language-Grounded Multi-Domain Image Translation via Semantic Difference Guidance
by: Ryu, Jongwon, et al.
Published: (2026)

Machine learning approach to brain tumor detection and classification
by: Oh, Alice, et al.
Published: (2024)

Visual Funnel: Resolving Contextual Blindness in Multimodal Large Language Models
by: Jung, Woojun, et al.
Published: (2025)

PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models
by: Marsocci, Valerio, et al.
Published: (2024)

Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
by: Kang, Wan Ju, et al.
Published: (2025)

Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
by: Hong, Jiwoo, et al.
Published: (2024)

Key-point Guided Deformable Image Manipulation Using Diffusion Model
by: Oh, Seok-Hwan, et al.
Published: (2024)

IdenBAT: Disentangled Representation Learning for Identity-Preserved Brain Age Transformation
by: Maeng, Junyeong, et al.
Published: (2024)

Training Unbiased Diffusion Models From Biased Dataset
by: Kim, Yeongmin, et al.
Published: (2024)

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure
by: Ahn, Junyeong, et al.
Published: (2026)

Clinical-grade Multi-Organ Pathology Report Generation for Multi-scale Whole Slide Images via a Semantically Guided Medical Text Foundation Model
by: Tan, Jing Wei, et al.
Published: (2024)

LVMark: Robust Watermark for Latent Video Diffusion Models
by: Jang, MinHyuk, et al.
Published: (2024)

See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval
by: Jeon, Mingyu, et al.
Published: (2026)

Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)

DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
by: Ryu, Hyogon, et al.
Published: (2025)

Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
by: Oh, Seunghun, et al.
Published: (2026)

Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents
by: Oh, Juhyun, et al.
Published: (2025)

Vision-Language Models under Cultural and Inclusive Considerations
by: Karamolegkou, Antonia, et al.
Published: (2024)

Diffusion Model Compression for Image-to-Image Translation
by: Kim, Geonung, et al.
Published: (2024)

MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
by: Oh, Youngmin, et al.
Published: (2024)

Video Understanding: Through A Temporal Lens
by: Nguyen, Thong Thanh
Published: (2026)

Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
by: Min, Seonghui, et al.
Published: (2024)

Leveraging Prior Knowledge of Diffusion Model for Person Search
by: Kim, Giyeol, et al.
Published: (2025)

Preserve and Personalize: Personalized Text-to-Image Diffusion Models without Distributional Drift
by: Kim, Gihoon, et al.
Published: (2025)

Through the Lens of Character: Resolving Modality-Role Interference in Multimodal Role-Playing Agent
by: Tang, Yihong, et al.
Published: (2026)

Global Context-aware Representation Learning for Spatially Resolved Transcriptomics
by: Oh, Yunhak, et al.
Published: (2025)