| Main Authors: | Kim, Jun Seong, Thu, Kyaw Ye, Ismayilzada, Javad, Park, Junyeong, Kim, Eunsu, Ahmad, Huzama, An, Na Min, Thorne, James, Oh, Alice |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.16826 |
Similar Items
Diffusion Models Through a Global Lens: Are They Culturally Inclusive?
by: Bayramli, Zahra, et al.
Published: (2025)
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
by: Kim, Eunsu, et al.
Published: (2024)
World in a Frame: Understanding Culture Mixing as a New Challenge for Vision-Language Models
by: Kim, Eunsu, et al.
Published: (2025)
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
by: Kim, Eunsu, et al.
Published: (2025)
Flex-TravelPlanner: A Benchmark for Flexible Planning with Language Agents
by: Oh, Juhyun, et al.
Published: (2025)
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
by: Park, Junyeong, et al.
Published: (2025)
Multi-FAct: Assessing Factuality of Multilingual LLMs using FActScore
by: Shafayat, Sheikh, et al.
Published: (2024)
The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate
by: Oh, Juhyun, et al.
Published: (2024)
BLUCK: A Benchmark Dataset for Bengali Linguistic Understanding and Cultural Knowledge
by: Kabir, Daeen, et al.
Published: (2025)
Designing “Korean” Kimchi: Speculative Configuration of Distance and Commodity Value in the Chinese Kimchi Industry
by: Heangjin Park
Published: (2025)
JuICE: A Benchmark for Evaluating LLM-Judge in Identifying Cultural Errors
by: Jin, Jiho, et al.
Published: (2026)
Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation
by: Shin, Jisu, et al.
Published: (2025)
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
by: Kim, Eunsu, et al.
Published: (2024)
Survey of Cultural Awareness in Language Models: Text and Beyond
by: Pawar, Siddhesh, et al.
Published: (2024)
LoCar: Localization-Aware Evaluation of In-Vehicle Assistants through Fine-Grained Sociolinguistic Control
by: Jeong, Seogyeong, et al.
Published: (2026)
QEVA: A Reference-Free Evaluation Metric for Narrative Video Summarization with Multimodal Question Answering
by: Jung, Woojun, et al.
Published: (2026)
Uncovering Factor Level Preferences to Improve Human-Model Alignment
by: Oh, Juhyun, et al.
Published: (2024)
Physicochemical Property Analyses of Deep‐Frozen Kimchi Cabbage during Long‐Term Storage
by: Dong Hyeon Park, et al.
Published: (2024)
RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity
by: Shin, Jisu, et al.
Published: (2025)
MUG-Eval: A Proxy Evaluation Framework for Multilingual Generation Capabilities in Any Language
by: Song, Seyoung, et al.
Published: (2025)
BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation
by: Kim, Eunsu, et al.
Published: (2025)
Culture is Everywhere: A Call for Intentionally Cultural Evaluation
by: Oh, Juhyun, et al.
Published: (2025)
Understanding EFL Learners' Code-Switching and Teachers' Pedagogical Approaches in LLM-Supported Speaking Practice
by: Park, Junyeong, et al.
Published: (2025)
Exposing Blindspots: Cultural Bias Evaluation in Generative Image Models
by: Seo, Huichan, et al.
Published: (2025)
A Dual-Layered Evaluation of Geopolitical and Cultural Bias in LLMs
by: Kim, Sean, et al.
Published: (2025)
Spicy or Not? Exploring Kimchi's Spiciness Perception Across Spicy Food Tolerant and Sensitive Groups
by: Seo‐yeong Chon, et al.
Published: (2025)
Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
by: Baek, Eunsu, et al.
Published: (2024)
MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
by: Park, Junyeong, et al.
Published: (2024)
On properness of moduli stacks of $D^{\times}$-shtukas over ramified legs
by: Choi, Yong-Gyu, et al.
Published: (2025)
Influence of carcass mass on decomposition rate: A medico‐legal entomology perspective
by: Hyeon‐Seok Oh, et al.
Published: (2024)
68‐2: A New PWM Micro‐LED Pixel Circuit Using LTPO TFTs with Threshold Voltage and IR‐Drop Compensations
by: Junyeong Kim, et al.
Published: (2024)
What, When, and Where America Eats
by: Sloan, E
Published: (2010)
What, When, and Where America Eats
by: Sloan, E. A
Published: (2012)
Context Filtering with Reward Modeling in Question Answering
by: Kim, Sangryul, et al.
Published: (2024)
One-Topic-Doesn't-Fit-All: Transcreating Reading Comprehension Test for Personalized Learning
by: Han, Jieun, et al.
Published: (2025)
To Eat or Not to Eat
by: Altmann, Peter, et al.
Published: (2024)
Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning
by: Park, Seong-Il, et al.
Published: (2024)
Energy-Efficient Wireless LLM Inference via Uncertainty and Importance-Aware Speculative Decoding
by: Park, Jihoon, et al.
Published: (2025)
AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift
by: Baek, Eunsu, et al.
Published: (2025)
Arithmetic BF theory and the Cassels-Tate pairing
by: Park, Jeehoon, et al.
Published: (2026)