| Main Authors: | Han, Hojae, Jung, Heeyun, Kim, Jongyoon, Hwang, Seung-won |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.21699 |
Similar Items
Benchmarking Testing in Automated Theorem Proving
by: Kim, Jongyoon, et al.
Published: (2026)
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
by: Han, Hojae, et al.
Published: (2025)
DuET: Dual Execution for Test Output Prediction with Generated Code and Pseudocode
by: Han, Hojae, et al.
Published: (2026)
PERC: Plan-As-Query Example Retrieval for Underrepresented Code Generation
by: Yoo, Jaeseok, et al.
Published: (2024)
ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models
by: Han, Hojae, et al.
Published: (2024)
Dual-Scale World Models for LLM Agents Towards Hard-Exploration Problems
by: Kim, Minsoo, et al.
Published: (2025)
R$^3$-SQL: Ranking Reward and Resampling for Text-to-SQL
by: Han, Hojae, et al.
Published: (2026)
SAFE: Stepwise Atomic Feedback for Error correction in Multi-hop Reasoning
by: Kwon, Daeyong, et al.
Published: (2026)
CREFT: Sequential Multi-Agent LLM for Character Relation Extraction
by: Chun, Ye Eun, et al.
Published: (2025)
Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models
by: Kim, Jongho, et al.
Published: (2025)
CoEx -- Co-evolving World-model and Exploration
by: Kim, Minsoo, et al.
Published: (2025)
UnIte: Uncertainty-based Iterative Document Sampling for Domain Adaptation in Information Retrieval
by: Kim, Jongyoon, et al.
Published: (2026)
Agent-as-Judge for Factual Summarization of Long Narratives
by: Jeong, Yeonseok, et al.
Published: (2025)
Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context
by: Yu, Sangwon, et al.
Published: (2024)
Intended Target Identification for Anomia Patients with Gradient-based Selective Augmentation
by: Kim, Jongho, et al.
Published: (2025)
Disentangling Questions from Query Generation for Task-Adaptive Retrieval
by: Lee, Yoonsang, et al.
Published: (2024)
Analyzing the Effectiveness of Listwise Reranking with Positional Invariance on Temporal Generalizability
by: Yoon, Soyoung, et al.
Published: (2024)
ComposeRAG: A Modular and Composable RAG for Corpus-Grounded Multi-Hop Question Answering
by: Wu, Ruofan, et al.
Published: (2025)
Chaining Event Spans for Temporal Relation Grounding
by: Kim, Jongho, et al.
Published: (2025)
HARP: Hesitation-Aware Reframing in Transformer Inference Pass
by: Storaï, Romain, et al.
Published: (2024)
Chain of Grounded Objectives: Bridging Process and Goal-oriented Prompting for Code Generation
by: Yeo, Sangyeop, et al.
Published: (2025)
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
by: Lee, Hojae, et al.
Published: (2024)
ECoRAG: Evidentiality-guided Compression for Long Context RAG
by: Jeong, Yeonseok, et al.
Published: (2025)
Relevance to Utility: Process-Supervised Rewrite for RAG
by: Kim, Jaeyoung, et al.
Published: (2025)
Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding
by: Jung, Yeonjoon, et al.
Published: (2024)
David helps Goliath: Inference-Time Collaboration Between Small Specialized and Large General Diffusion LMs
by: Han, Xiaochuang, et al.
Published: (2023)
DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections
by: Park, Jiwon, et al.
Published: (2025)
AcuRank: Uncertainty-Aware Adaptive Computation for Listwise Reranking
by: Yoon, Soyoung, et al.
Published: (2025)
BioHopR: A Benchmark for Multi-Hop, Multi-Answer Reasoning in Biomedical Domain
by: Kim, Yunsoo, et al.
Published: (2025)
RoToR: Towards More Reliable Responses for Order-Invariant Inputs
by: Yoon, Soyoung, et al.
Published: (2025)
LLM Agents at the Roundtable: A Multi-Perspective and Dialectical Reasoning Framework for Essay Scoring
by: Jang, Jinhee, et al.
Published: (2025)
FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning
by: Kim, Seunghee, et al.
Published: (2024)
Towards Lifelong Dialogue Agents via Timeline-based Memory Management
by: Ong, Kai Tzu-iunn, et al.
Published: (2024)
Ever-Evolving Memory by Blending and Refining the Past
by: Kim, Seo Hyun, et al.
Published: (2024)
Inference Scaling for Bridging Retrieval and Augmented Generation
by: Lee, Youngwon, et al.
Published: (2024)
CORD: Balancing COnsistency and Rank Distillation for Robust Retrieval-Augmented Generation
by: Lee, Youngwon, et al.
Published: (2024)
OMHBench: Benchmarking Balanced and Grounded Omni-Modal Multi-Hop Reasoning
by: Kim, Seunghee, et al.
Published: (2025)
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining
by: Wei, Yuxiang, et al.
Published: (2024)
TRUEBench: Can LLM Response Meet Real-world Constraints as Productivity Assistant?
by: Park, Jiho, et al.
Published: (2025)
A Multi-Agent Framework for Feature-Constrained Difficulty Control in Reading Comprehension Item Generation
by: Hwang, Seonjeong, et al.
Published: (2026)