| Main Authors: | Kim, Seungyoon, Kim, Seungone |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.17022 |
Similar Items
KMMLU: Measuring Massive Multitask Language Understanding in Korean
by: Son, Guijin, et al.
Published: (2024)
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
by: Son, Guijin, et al.
Published: (2024)
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
by: Lee, Seongyun, et al.
Published: (2024)
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching
by: Kim, Seoyeon, et al.
Published: (2024)
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models
by: Son, Guijin, et al.
Published: (2023)
Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards
by: Hwang, Hyeonbin, et al.
Published: (2024)
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
by: Ye, Seonghyeon, et al.
Published: (2023)
LLM-as-a-tutor in EFL Writing Education: Focusing on Evaluation of Student-LLM Interaction
by: Han, Jieun, et al.
Published: (2023)
Evaluating Multimodal Generative AI with Korean Educational Standards
by: Park, Sanghee, et al.
Published: (2025)
LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?
by: Zhang, Qihui, et al.
Published: (2024)
Evaluating Multimodal Large Language Models on Vertically Written Japanese Text
by: Sasagawa, Keito, et al.
Published: (2025)
Measuring Sycophancy of Language Models in Multi-turn Dialogues
by: Hong, Jiseung, et al.
Published: (2025)
Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean
by: Choi, ChangSu, et al.
Published: (2024)
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
by: Kim, Eunsu, et al.
Published: (2024)
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts
by: Lee, Nahyun, et al.
Published: (2026)
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models
by: Jung, Dahyun, et al.
Published: (2025)
RefineBench: Evaluating Refinement Capability of Language Models via Checklists
by: Lee, Young-Jun, et al.
Published: (2025)
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
by: Park, Chanjun, et al.
Published: (2024)
Evaluating Language Models as Synthetic Data Generators
by: Kim, Seungone, et al.
Published: (2024)
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
by: Kim, Seungone, et al.
Published: (2024)
KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
by: Kim, Yoonshik, et al.
Published: (2025)
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
by: Kim, Seungone, et al.
Published: (2023)
VARCO-VISION: Expanding Frontiers in Korean Vision-Language Models
by: Ju, Jeongho, et al.
Published: (2024)
UKTA: Unified Korean Text Analyzer
by: Ahn, Seokho, et al.
Published: (2025)
GECKO: Generative Language Model for English, Code and Korean
by: Oh, Sungwoo, et al.
Published: (2024)
Expanding Foundational Language Capabilities in Open-Source LLMs through a Korean Case Study
by: Lim, Junghwan, et al.
Published: (2025)
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
by: Chae, Hyungjoo, et al.
Published: (2024)
KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models
by: Kim, Dongjun, et al.
Published: (2025)
Building Resource-Constrained Language Agents: A Korean Case Study on Chemical Toxicity Information
by: Cho, Hojun, et al.
Published: (2025)
Can Large Language Models Automatically Score Proficiency of Written Essays?
by: Mansour, Watheq, et al.
Published: (2024)
Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers
by: Kim, Jong Myoung, et al.
Published: (2024)
ChEDDAR: Student-ChatGPT Dialogue in EFL Writing Education
by: Han, Jieun, et al.
Published: (2023)
Chain-of-MetaWriting: Linguistic and Textual Analysis of How Small Language Models Write Young Students Texts
by: Buhnila, Ioana, et al.
Published: (2024)
Human-AI Collaborative Taxonomy Construction: A Case Study in Profession-Specific Writing Assistants
by: Lee, Minhwa, et al.
Published: (2024)
Theme-Explanation Structure for Table Summarization using Large Language Models: A Case Study on Korean Tabular Data
by: Kwack, TaeYoon, et al.
Published: (2025)
Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer
by: Lee, Seungyoon, et al.
Published: (2025)
Aligning to Thousands of Preferences via System Message Generalization
by: Lee, Seongyun, et al.
Published: (2024)
Linguistically Informed Graph Model and Semantic Contrastive Learning for Korean Short Text Classification
by: Yoo, JaeGeon, et al.
Published: (2026)
Reasoning Models Better Express Their Confidence
by: Yoon, Dongkeun, et al.
Published: (2025)
Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model
by: Kim, Taehee, et al.
Published: (2024)