| Main Authors: | Shen, Chenglei, Sun, Zhongxiang, Shi, Teng, Zhang, Xiao, Xu, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.04530 |
Similar Items
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
by: Sun, Zhongxiang, et al.
Published: (2026)
On the Decision-Making Abilities in Role-Playing using Large Language Models
by: Shen, Chenglei, et al.
Published: (2024)
LLaDA-Rec: Discrete Diffusion for Parallel Semantic ID Generation in Generative Recommendation
by: Shi, Teng, et al.
Published: (2025)
Effective In-Context Example Selection through Data Compression
by: Sun, Zhongxiang, et al.
Published: (2024)
SteerX: Disentangled Steering for LLM Personalization
by: Zhao, Xiaoyan, et al.
Published: (2025)
MAPS: Motivation-Aware Personalized Search via LLM-Driven Consultation Alignment
by: Qin, Weicong, et al.
Published: (2025)
Trigger$^3$: Refining Query Correction via Adaptive Model Selector
by: Zhang, Kepu, et al.
Published: (2024)
Detection and Mitigation of Hallucination in Large Reasoning Models: A Mechanistic Perspective
by: Sun, Zhongxiang, et al.
Published: (2025)
An Explicit Syllogistic Legal Reasoning Framework for Large Language Models
by: Zhang, Kepu, et al.
Published: (2025)
Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong
by: Si, Chenglei, et al.
Published: (2023)
DRIFT: Detecting Representational Inconsistencies for Factual Truthfulness
by: Bhatnagar, Rohan, et al.
Published: (2026)
MoRE: A Mixture of Reflectors Framework for Large Language Model-Based Sequential Recommendation
by: Qin, Weicong, et al.
Published: (2024)
Towards Understanding Continual Factual Knowledge Acquisition of Language Models: From Theory to Algorithm
by: Wang, Haoyu, et al.
Published: (2026)
Interpretable LLM Guardrails via Sparse Representation Steering
by: He, Zeqing, et al.
Published: (2025)
LargePiG: Your Large Language Model is Secretly a Pointer Generator
by: Sun, Zhongxiang, et al.
Published: (2024)
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
by: Sun, Zhongxiang, et al.
Published: (2024)
Logic Rules as Explanations for Legal Case Retrieval
by: Sun, Zhongxiang, et al.
Published: (2024)
TruthFlow: Truthful LLM Generation via Representation Flow Correction
by: Wang, Hanyu, et al.
Published: (2025)
Exploring the Nexus of Large Language Models and Legal Systems: A Short Survey
by: Qin, Weicong, et al.
Published: (2024)
Discern Truth from Falsehood: Reducing Over-Refusal via Contrastive Refinement
by: Lu, Yuxiao, et al.
Published: (2026)
PrLM: Learning Explicit Reasoning for Personalized RAG via Contrastive Reward Optimization
by: Zhang, Kepu, et al.
Published: (2025)
Disentangled VAD Representations via a Variational Framework for Political Stance Detection
by: Xu, Beiyu, et al.
Published: (2025)
Deep Search with Hierarchical Meta-Cognitive Monitoring Inspired by Cognitive Neuroscience
by: Sun, Zhongxiang, et al.
Published: (2026)
Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning
by: Wu, Tianyi, et al.
Published: (2025)
SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models
by: He, Zirui, et al.
Published: (2025)
ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding
by: Sun, Zhongxiang, et al.
Published: (2025)
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
by: Wei, Zhepei, et al.
Published: (2025)
CoT is Not the Chain of Truth: An Empirical Internal Analysis of Reasoning LLMs for Fake News Generation
by: Tong, Zhao, et al.
Published: (2026)
Interpretable Discriminative Text Representations via Agreement and Label Disentanglement
by: Wang, Tong, et al.
Published: (2026)
Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories
by: Wang, Tianlong, et al.
Published: (2024)
The Cylindrical Representation Hypothesis for Language Model Steering
by: Gao, Lang, et al.
Published: (2026)
Improved Representation Steering for Language Models
by: Wu, Zhengxuan, et al.
Published: (2025)
Training-free Truthfulness Detection via Value Vectors in LLMs
by: Liu, Runheng, et al.
Published: (2025)
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
by: Qi, Tianhao, et al.
Published: (2024)
Representational and Behavioral Stability of Truth in Large Language Models
by: Dies, Samantha, et al.
Published: (2025)
In-Distribution Steering: Balancing Control and Coherence in Language Model Generation
by: Vogels, Arthur, et al.
Published: (2025)
How Context Shapes Truth: Geometric Transformations of Statement-level Truth Representations in LLMs
by: Adarsh, Shivam, et al.
Published: (2026)
Breaking the Generator Barrier: Disentangled Representation for Generalizable AI-Text Detection
by: Pu, Xiao, et al.
Published: (2026)
LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering
by: Bi, Jinhe, et al.
Published: (2024)
SteerRM: Debiasing Reward Models via Sparse Autoencoders
by: Sun, Mengyuan, et al.
Published: (2026)