| Main Authors: | Bayramli, Zahra; Suleymanzade, Ayhan; An, Na Min; Ahmad, Huzama; Kim, Eunsu; Park, Junyeong; Thorne, James; Oh, Alice |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.08914 |
Similar Items
When Tom Eats Kimchi: Evaluating Cultural Bias of Multimodal Large Language Models in Cultural Mixture Contexts
by: Kim, Jun Seong, et al.
Published: (2025)
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models
by: Kim, Eunsu, et al.
Published: (2025)
Survey of Cultural Awareness in Language Models: Text and Beyond
by: Pawar, Siddhesh, et al.
Published: (2024)
I0T: Embedding Standardization Method Towards Zero Modality Gap
by: An, Na Min, et al.
Published: (2024)
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
by: Kim, Eunsu, et al.
Published: (2024)
Are Large Vision-Language Models Ready to Guide Blind and Low-Vision Individuals?
by: Kim, Eunki, et al.
Published: (2025)
Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
by: Baek, Eunsu, et al.
Published: (2024)
How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions
by: An, Na Min, et al.
Published: (2025)
QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering
by: Jung, Woojun, et al.
Published: (2026)
Adaptive Camera Sensor for Vision Models
by: Baek, Eunsu, et al.
Published: (2025)
Task Indicating Transformer for Task-conditional Dense Predictions
by: Lu, Yuxiang, et al.
Published: (2024)
CharDiff-LP: A Diffusion Model with Character-Level Guidance for License Plate Image Restoration
by: Na, Kihyun, et al.
Published: (2025)
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
by: Kim, Eunsu, et al.
Published: (2025)
Improving Cone-Beam CT Image Quality with Knowledge Distillation-Enhanced Diffusion Model in Imbalanced Data Settings
by: Hwang, Joonil, et al.
Published: (2024)
Language-Grounded Multi-Domain Image Translation via Semantic Difference Guidance
by: Ryu, Jongwon, et al.
Published: (2026)
Machine learning approach to brain tumor detection and classification
by: Oh, Alice, et al.
Published: (2024)
Visual Funnel: Resolving Contextual Blindness in Multimodal Large Language Models
by: Jung, Woojun, et al.
Published: (2025)
PANGAEA: A Global and Inclusive Benchmark for Geospatial Foundation Models
by: Marsocci, Valerio, et al.
Published: (2024)
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
by: Kang, Wan Ju, et al.
Published: (2025)
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
by: Hong, Jiwoo, et al.
Published: (2024)
Key-point Guided Deformable Image Manipulation Using Diffusion Model
by: Oh, Seok-Hwan, et al.
Published: (2024)
IdenBAT: Disentangled Representation Learning for Identity-Preserved Brain Age Transformation
by: Maeng, Junyeong, et al.
Published: (2024)
Training Unbiased Diffusion Models From Biased Dataset
by: Kim, Yeongmin, et al.
Published: (2024)
EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure
by: Ahn, Junyeong, et al.
Published: (2026)
Clinical-grade Multi-Organ Pathology Report Generation for Multi-scale Whole Slide Images via a Semantically Guided Medical Text Foundation Model
by: Tan, Jing Wei, et al.
Published: (2024)
LVMark: Robust Watermark for Latent Video Diffusion Models
by: Jang, MinHyuk, et al.
Published: (2024)
See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval
by: Jeon, Mingyu, et al.
Published: (2026)
Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)
DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models
by: Ryu, Hyogon, et al.
Published: (2025)
Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
by: Oh, Seunghun, et al.
Published: (2026)
Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents
by: Oh, Juhyun, et al.
Published: (2025)
Vision-Language Models under Cultural and Inclusive Considerations
by: Karamolegkou, Antonia, et al.
Published: (2024)
Diffusion Model Compression for Image-to-Image Translation
by: Kim, Geonung, et al.
Published: (2024)
MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection
by: Oh, Youngmin, et al.
Published: (2024)
Video Understanding: Through A Temporal Lens
by: Nguyen, Thong Thanh
Published: (2026)
Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model
by: Min, Seonghui, et al.
Published: (2024)
Leveraging Prior Knowledge of Diffusion Model for Person Search
by: Kim, Giyeol, et al.
Published: (2025)
Preserve and Personalize: Personalized Text-to-Image Diffusion Models without Distributional Drift
by: Kim, Gihoon, et al.
Published: (2025)
Through the Lens of Character: Resolving Modality-Role Interference in Multimodal Role-Playing Agent
by: Tang, Yihong, et al.
Published: (2026)
Global Context-aware Representation Learning for Spatially Resolved Transcriptomics
by: Oh, Yunhak, et al.
Published: (2025)