| Main Authors: | Chen, Zixuan; Lin, Hao; Chen, Zizhe; Tian, Yizhou; Yang, Garry; Wang, Depeng; Guo, Ya; Zhu, Huijia; Cheng, James |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.05957 |
Similar Items
Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning
by: Chen, Zizhe, et al.
Published: (2026)
EchoingPixels: Cross-Modal Adaptive Token Reduction for Efficient Audio-Visual LLMs
by: Gong, Chao, et al.
Published: (2025)
AVID: A Benchmark for Omni-Modal Audio-Visual Inconsistency Understanding via Agent-Driven Construction
by: Chen, Zixuan, et al.
Published: (2026)
Reinforcement Learning from Denoising Feedback
by: He, Qi, et al.
Published: (2026)
RW-TTT: Batched Serving for Request-Owned Test-Time Training State
by: Yang, Jian, et al.
Published: (2026)
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
by: Lin, Zicheng, et al.
Published: (2024)
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
by: Ren, Baochang, et al.
Published: (2025)
Low-Rank Correction for Quantized LLMs
by: Scetbon, Meyer, et al.
Published: (2024)
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
by: Mao, Yixiu, et al.
Published: (2024)
RAC: Efficient LLM Factuality Correction with Retrieval Augmentation
by: Li, Changmao, et al.
Published: (2024)
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction
by: Li, Zixuan, et al.
Published: (2024)
Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction
by: Lu, Kangkang, et al.
Published: (2024)
Can LLMs Refuse Questions They Do Not Know? Measuring Knowledge-Aware Refusal in Factual Tasks
by: Pan, Wenbo, et al.
Published: (2025)
Beyond Output Correctness: Benchmarking and Evaluating Large Language Model Reasoning in Coding Tasks
by: Li, Yuangang, et al.
Published: (2026)
Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall
by: Yuan, Jiaqing, et al.
Published: (2024)
Mechanistic Interpretability of Code Correctness in LLMs via Sparse Autoencoders
by: Tahimic, Kriz, et al.
Published: (2025)
MESH -- Understanding Videos Like Human: Measuring Hallucinations in Large Video Models
by: Yang, Garry, et al.
Published: (2025)
Efficient Multi-Task Reinforcement Learning via Task-Specific Action Correction
by: Feng, Jinyuan, et al.
Published: (2024)
Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression
by: Bareeva, Dilyara, et al.
Published: (2024)
ORGEval: Graph-Theoretic Evaluation of LLMs in Optimization Modeling
by: Wang, Zhuohan, et al.
Published: (2025)
Teaching LLMs for Step-Level Automatic Math Correction via Reinforcement Learning
by: Li, Junsong, et al.
Published: (2025)
MathlibPR: Pull Request Merge-Readiness Benchmark for Formal Mathematical Libraries
by: Xie, Zixuan, et al.
Published: (2026)
Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights
by: Chen, Yi, et al.
Published: (2026)
Hide in Plain Sight: Clean-Label Backdoor for Auditing Membership Inference
by: Chen, Depeng, et al.
Published: (2024)
Learning to Correct for QA Reasoning with Black-box LLMs
by: Kim, Jaehyung, et al.
Published: (2024)
Know When You're Wrong: Aligning Confidence with Correctness for LLM Error Detection
by: Xiaohu, Xie, et al.
Published: (2026)
KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking
by: Zhang, Jiawei, et al.
Published: (2024)
Persuasion Tokens for Editing Factual Knowledge in LLMs
by: Youssef, Paul, et al.
Published: (2026)
InsightVision: A Comprehensive, Multi-Level Chinese-based Benchmark for Evaluating Implicit Visual Semantics in Large Vision Language Models
by: Yin, Xiaofei, et al.
Published: (2025)
Correcting Suppressed Log-Probabilities in Language Models with Post-Transformer Adapters
by: Sanchez, Bryan
Published: (2026)
On LLMs' Internal Representation of Code Correctness
by: Ribeiro, Francisco, et al.
Published: (2025)
Evaluating the Correctness of Inference Patterns Used by LLMs for Judgment
by: Chen, Lu, et al.
Published: (2024)
When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning
by: Hao, Chenjie, et al.
Published: (2026)
Unforgeable Watermarks for Language Models via Robust Signatures
by: Lin, Huijia, et al.
Published: (2026)
KnowCoder-X: Boosting Multilingual Information Extraction via Code
by: Zuo, Yuxin, et al.
Published: (2024)
Partial Domain Adaptation via Importance Sampling-based Shift Correction
by: Guo, Cheng-Jun, et al.
Published: (2025)
Causality-Inspired Safe Residual Correction for Multivariate Time Series
by: Xie, Jianxiang, et al.
Published: (2025)
Adaptive Requesting in Decentralized Edge Networks via Non-Stationary Bandits
by: Zhuang, Yi, et al.
Published: (2026)
Attention Head Entropy of LLMs Predicts Answer Correctness
by: Ostmeier, Sophie, et al.
Published: (2026)
Dynamically Anchored Prompting for Task-Imbalanced Continual Learning
by: Hong, Chenxing, et al.
Published: (2024)