| Main Authors: | Xiaohu, Xie; Xiaohu, Liu; Benjamin, Yao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.06604 |
Similar Items
CaRT: Teaching LLM Agents to Know When They Know Enough
by: Liu, Grace, et al.
Published: (2025)
Confidence Estimation for Error Detection in Text-to-SQL Systems
by: Somov, Oleg, et al.
Published: (2025)
Your LLM Knows the Future: Uncovering Its Multi-Token Prediction Potential
by: Samragh, Mohammad, et al.
Published: (2025)
Show Your Work with Confidence: Confidence Bands for Tuning Curves
by: Lourie, Nicholas, et al.
Published: (2023)
Do Androids Know They're Only Dreaming of Electric Sheep?
by: CH-Wang, Sky, et al.
Published: (2023)
Knowing When to Ask: Segment-Level Credit Assignment for LLM Tool Use
by: Kumar, Abhijit, et al.
Published: (2026)
Process Supervision of Confidence Margin for Calibrated LLM Reasoning
by: Wang, Liaoyaqi, et al.
Published: (2026)
Divide-or-Conquer? Which Part Should You Distill Your LLM?
by: Wu, Zhuofeng, et al.
Published: (2024)
Know Your Limits: Entropy Estimation Modeling for Compression and Generalization
by: Badger, Benjamin L., et al.
Published: (2025)
Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework
by: Bhatt, Neel P., et al.
Published: (2024)
NanoKnow: How to Know What Your Language Model Knows
by: Gu, Lingwei, et al.
Published: (2026)
Your Simulation Runs but Solves the Wrong Physics: PDE-Grounded Intent Verification for LLM-Generated Multiphysics Simulation Code
by: Song, Zhenghan, et al.
Published: (2026)
Knowing When to Defer: Selective Prediction for Responsible Knowledge Tracing
by: Mitton, Joshua, et al.
Published: (2025)
Confidence Geometry Reveals Trace-Level Correctness in Large Language Model Reasoning
by: Liu, Shuo, et al.
Published: (2026)
Do You Know What You Are Talking About? Characterizing Query-Knowledge Relevance For Reliable Retrieval Augmented Generation
by: Li, Zhuohang, et al.
Published: (2024)
To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity
by: Sedova, Anastasiia, et al.
Published: (2024)
Do Small Language Models Know When They're Wrong? Confidence-Based Cascade Scoring for Educational Assessment
by: Burleigh, Tyler
Published: (2026)
Are LLM Decisions Faithful to Verbal Confidence?
by: Wang, Jiawei, et al.
Published: (2026)
A Confidence-based Acquisition Model for Self-supervised Active Learning and Label Correction
by: van Niekerk, Carel, et al.
Published: (2023)
DRA-GRPO: Your GRPO Needs to Know Diverse Reasoning Paths for Mathematical Reasoning
by: Chen, Xiwen, et al.
Published: (2025)
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
by: Lee, Hyunseok, et al.
Published: (2024)
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
by: Li, Pengyi, et al.
Published: (2025)
What is Wrong with Perplexity for Long-context Language Modeling?
by: Fang, Lizhe, et al.
Published: (2024)
Knowing What You Know Is Not Enough: Large Language Model Confidences Don't Align With Their Actions
by: Pal, Arka, et al.
Published: (2025)
Neural Grammatical Error Correction for Romanian
by: Cotet, Teodor-Mihai, et al.
Published: (2026)
Text Knows What, Tables Know When: Clinical Timeline Reconstruction via Retrieval-Augmented Multimodal Alignment
by: Kumar, Sayantan, et al.
Published: (2026)
Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents
by: Wang, Hao, et al.
Published: (2026)
Let the Code LLM Edit Itself When You Edit the Code
by: He, Zhenyu, et al.
Published: (2024)
Large AI Model Empowered Multimodal Semantic Communications
by: Jiang, Feibo, et al.
Published: (2023)
Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection
by: Wei, Zhipeng, et al.
Published: (2024)
Cycles of Thought: Measuring LLM Confidence through Stable Explanations
by: Becker, Evan, et al.
Published: (2024)
MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes
by: Abacha, Asma Ben, et al.
Published: (2024)
Assessing LLM Text Detection in Educational Contexts: Does Human Contribution Affect Detection?
by: Gehring, Lukas, et al.
Published: (2025)
Align-then-Unlearn: Embedding Alignment for LLM Unlearning
by: Spohn, Philipp, et al.
Published: (2025)
What Would You Ask When You First Saw $a^2+b^2=c^2$? Evaluating LLM on Curiosity-Driven Questioning
by: Javaji, Shashidhar Reddy, et al.
Published: (2024)
Forget What You Know about LLMs Evaluations -- LLMs are Like a Chameleon
by: Cohen-Inger, Nurit, et al.
Published: (2025)
MAGE: All-[MASK] Block Already Knows Where to Look in Diffusion LLM
by: Kwon, Omin, et al.
Published: (2026)
When Models Know More Than They Say: Probing Analogical Reasoning in LLMs
by: McGovern, Hope, et al.
Published: (2026)
Where Did It Go Wrong? Attributing Undesirable LLM Behaviors via Representation Gradient Tracing
by: Li, Zhe, et al.
Published: (2025)
To Believe or Not to Believe Your LLM
by: Yadkori, Yasin Abbasi, et al.
Published: (2024)