| Main Authors: | Ma, Huan, Chen, Jingdong, Zhou, Joey Tianyi, Wang, Guangyu, Zhang, Changqing |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.00290 |
Similar Items
Can LLMs Detect Their Confabulations? Estimating Reliability in Uncertainty-Aware Language Models
by: Zhou, Tianyi, et al.
Published: (2025)
Identifying and Mitigating Social Bias Knowledge in Language Models
by: Chen, Ruizhe, et al.
Published: (2024)
Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation
by: Cui, Huizi, et al.
Published: (2026)
Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents
by: Ma, Tianmi, et al.
Published: (2025)
PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations
by: Wu, Yuhe, et al.
Published: (2026)
TS-Reasoner: Aligning Time Series Foundation Models with LLM Reasoning
by: Yu, Fangxu, et al.
Published: (2025)
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
by: Zhang, Tunyu, et al.
Published: (2025)
Improve LLM-based Automatic Essay Scoring with Linguistic Features
by: Hou, Zhaoyi Joey, et al.
Published: (2025)
Evaluating the Relevance of Uncertainty Estimators for LLM Hallucination
by: Agnimo, Yedidia, et al.
Published: (2026)
Multiple LLM Agents Debate for Equitable Cultural Alignment
by: Ki, Dayeon, et al.
Published: (2025)
From LLM-anation to LLM-orchestrator: Coordinating Small Models for Data Labeling
by: Lu, Yao, et al.
Published: (2025)
Estimating the Error of Large Language Models at Pairwise Text Comparison
by: Li, Tianyi
Published: (2025)
Language Models Resist Alignment: Evidence From Data Compression
by: Ji, Jiaming, et al.
Published: (2024)
Uncertainty Estimation of Large Language Models in Medical Question Answering
by: Wu, Jiaxin, et al.
Published: (2024)
Label-Confidence-Aware Uncertainty Estimation in Natural Language Generation
by: Lin, Qinhong, et al.
Published: (2024)
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction
by: Li, Ming, et al.
Published: (2025)
RuleR: Improving LLM Controllability by Rule-based Data Recycling
by: Li, Ming, et al.
Published: (2024)
Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection
by: Niu, Tianyi, et al.
Published: (2026)
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
by: Xie, Tian, et al.
Published: (2025)
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
by: Li, Ming, et al.
Published: (2024)
Towards Real-Time Fake News Detection under Evidence Scarcity
by: Wei, Guangyu, et al.
Published: (2025)
Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning
by: Wu, Tianyi, et al.
Published: (2025)
Where to show Demos in Your Prompt: A Positional Bias of In-Context Learning
by: Cobbina, Kwesi, et al.
Published: (2025)
Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
by: Sun, Yifan, et al.
Published: (2025)
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach
by: Tan, Zhen, et al.
Published: (2024)
LLM Inference Unveiled: Survey and Roofline Model Insights
by: Yuan, Zhihang, et al.
Published: (2024)
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
by: Li, Xirui, et al.
Published: (2024)
Unlocking the Power of LLM Uncertainty for Active In-Context Example Selection
by: Huang, Hsiu-Yuan, et al.
Published: (2024)
AdaThink-Med: Medical Adaptive Thinking with Uncertainty-Guided Length Calibration
by: Rui, Shaohao, et al.
Published: (2025)
EviLink: Multi-Path Schema Linking with Uncertainty-Guided Evidence Acquisition for Large-Scale Text-to-SQL
by: Zheng, Huawei, et al.
Published: (2026)
Structured Uncertainty guided Clarification for LLM Agents
by: Suri, Manan, et al.
Published: (2025)
From Documents to Spans: Scalable Supervision for Evidence-Based ICD Coding with LLMs
by: Zhang, Xu, et al.
Published: (2026)
Entropic Claim Resolution: Uncertainty-Driven Evidence Selection for RAG
by: Di Gioia, Davide
Published: (2026)
Optimizing Length Compression in Large Reasoning Models
by: Cheng, Zhengxiang, et al.
Published: (2025)
SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding
by: Sun, Ryan, et al.
Published: (2024)
DOTA: Distributional Test-Time Adaptation of Vision-Language Models
by: Han, Zongbo, et al.
Published: (2024)
Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL
by: Hong, Joey, et al.
Published: (2025)
Uncertainty Estimation for the Open-Set Text Classification systems
by: Erlygin, Leonid, et al.
Published: (2026)
From Scores to Steps: Diagnosing and Improving LLM Performance in Evidence-Based Medical Calculations
by: Wang, Benlu, et al.
Published: (2025)
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
by: Shen, Han, et al.
Published: (2024)