| Main Authors: | Song, Sangjun, Oh, Minjae, Lee, Seungkyu, Jo, Sungmin, Jo, Yohan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.00546 |
Similar Items
KL for a KL: On-Policy Distillation with Control Variate Baseline
by: Oh, Minjae, et al.
Published: (2026)
In-N-Out: A Parameter-Level API Graph Dataset for Tool Agents
by: Lee, Seungkyu, et al.
Published: (2025)
Future Policy Approximation for Offline Reinforcement Learning Improves Mathematical Reasoning
by: Oh, Minjae, et al.
Published: (2025)
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States
by: Choi, Yunho, et al.
Published: (2026)
Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement
by: Kong, Injin, et al.
Published: (2026)
Thinking Like a Doctor: Conversational Diagnosis through the Exploration of Diagnostic Knowledge Graphs
by: Won, Jeongmoon, et al.
Published: (2026)
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction
by: Lee, Yooseop, et al.
Published: (2025)
Quantifying Data Contamination in Psychometric Evaluations of LLMs
by: Han, Jongwook, et al.
Published: (2025)
Mechanism Shift During Post-training from Autoregressive to Masked Diffusion Language Models
by: Kong, Injin, et al.
Published: (2026)
Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators
by: Lim, Sungjib, et al.
Published: (2025)
Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models
by: Lee, Jonggeun, et al.
Published: (2025)
SpeakerSleuth: Can Large Audio-Language Models Judge Speaker Consistency across Multi-turn Dialogues?
by: Lee, Jonggeun, et al.
Published: (2026)
SpokenUS: A Spoken User Simulator for Task-Oriented Dialogue
by: Lee, Jonggeun, et al.
Published: (2026)
Bridging the Knowledge-Prediction Gap in LLMs on Multiple-Choice Questions
by: Park, Yoonah, et al.
Published: (2025)
SUIT: Knowledge Editing with Subspace-Aware Key-Value Mappings
by: Park, Haewon, et al.
Published: (2025)
Stress-Testing Emotional Support Models: Moving from Homogeneous to Diverse Help Seekers
by: Heo, Chaewon, et al.
Published: (2026)
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning
by: Song, Jiwon, et al.
Published: (2025)
Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples
by: Pyun, Haesung, et al.
Published: (2025)
ToolDial: Multi-turn Dialogue Generation Method for Tool-Augmented Language Models
by: Shim, Jeonghoon, et al.
Published: (2025)
Dialogue Systems for Emotional Support via Value Reinforcement
by: Kim, Juhee, et al.
Published: (2025)
Rewarding How Models Think Pedagogically: Integrating Pedagogical Reasoning and Thinking Rewards for LLMs in Education
by: Lee, Unggi, et al.
Published: (2026)
Value Portrait: Assessing Language Models' Values through Psychometrically and Ecologically Valid Items
by: Han, Jongwook, et al.
Published: (2025)
Non-Collaborative User Simulators for Tool Agents
by: Shim, Jeonghoon, et al.
Published: (2025)
Model-based Preference Optimization in Abstractive Summarization without Human Feedback
by: Choi, Jaepill, et al.
Published: (2024)
Sparse and Dense Retrievers Learn Better Together: Joint Sparse-Dense Optimization for Text-Image Retrieval
by: Song, Jonghyun, et al.
Published: (2025)
Human Psychometric Questionnaires Mischaracterize LLM Behavior
by: Song, Woojung, et al.
Published: (2025)
Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model
by: Kim, Taehee, et al.
Published: (2024)
Context-Robust Knowledge Editing for Language Models
by: Park, Haewon, et al.
Published: (2025)
Mitigating Hallucination in Abstractive Summarization with Domain-Conditional Mutual Information
by: Chae, Kyubyung, et al.
Published: (2024)
Learning to Retrieve User History and Generate User Profiles for Personalized Persuasiveness Prediction
by: Park, Sejun, et al.
Published: (2026)
Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem
by: Jo, Heejin
Published: (2026)
SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents
by: Seo, Gyuhyeon, et al.
Published: (2025)
Personalized LLM Decoding via Contrasting Personal Preference
by: Bu, Hyungjune, et al.
Published: (2025)
FastKV: Decoupling of Context Reduction and KV Cache Compression for Prefill-Decoding Acceleration
by: Jo, Dongwon, et al.
Published: (2025)
PVP: An Image Dataset for Personalized Visual Persuasion with Persuasion Strategies, Viewer Characteristics, and Persuasiveness Ratings
by: Kim, Junseo, et al.
Published: (2025)
ReGUIDE: Data Efficient GUI Grounding via Spatial Reasoning and Search
by: Lee, Hyunseok, et al.
Published: (2025)
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think
by: Lee, Seongyun, et al.
Published: (2025)
A2SF: Accumulative Attention Scoring with Forgetting Factor for Token Pruning in Transformer Decoder
by: Jo, Hyun-rae, et al.
Published: (2024)
TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs
by: Lee, Minjae, et al.
Published: (2026)
LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
by: Park, Junyeong, et al.
Published: (2025)