| Main Authors: | Pan, Wenbo, Liu, Zhichao, Wang, Xianlong, Yu, Haining, Jia, Xiaohua |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.01914 |
Similar Items
The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions
by: Pan, Wenbo, et al.
Published: (2025)
WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation
by: Liu, Zhichao, et al.
Published: (2026)
Prune-OPD: Efficient and Reliable On-Policy Distillation for Long-Horizon Reasoning
by: Yang, Zhicheng, et al.
Published: (2026)
Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning
by: Jia, Jinghan, et al.
Published: (2026)
Faithful or Just Plausible? Evaluating the Faithfulness of Closed-Source LLMs in Medical Reasoning
by: Afolabi, Halimat, et al.
Published: (2026)
TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs
by: Zhang, Yuxiang, et al.
Published: (2025)
The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL
by: Li, Yingru, et al.
Published: (2026)
Faithful Interpretation for Graph Neural Networks
by: Hu, Lijie, et al.
Published: (2024)
AttributionLab: Faithfulness of Feature Attribution Under Controllable Environments
by: Zhang, Yang, et al.
Published: (2023)
MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)
Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models
by: Engel, Andrew, et al.
Published: (2023)
LEGO: A Lightweight and Efficient Multiple-Attribute Unlearning Framework for Recommender Systems
by: Yu, Fengyuan, et al.
Published: (2025)
TracLLM: A Generic Framework for Attributing Long Context LLMs
by: Wang, Yanting, et al.
Published: (2025)
Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
by: Hong, Ilgee, et al.
Published: (2025)
Recursive Models for Long-Horizon Reasoning
by: Yang, Chenxiao, et al.
Published: (2026)
Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning
by: Liu, Hanbing, et al.
Published: (2025)
Improving Interpretation Faithfulness for Vision Transformers
by: Hu, Lijie, et al.
Published: (2023)
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
by: Liu, Gabrielle Kaili-May, et al.
Published: (2025)
Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL
by: Wu, Ian, et al.
Published: (2026)
LouisKV: Efficient KV Cache Retrieval for Long Input-Output Sequences
by: Wu, Wenbo, et al.
Published: (2025)
Ideal Attribution and Faithful Watermarks for Language Models
by: Song, Min Jae, et al.
Published: (2025)
LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning
by: Motwani, Sumeet Ramesh, et al.
Published: (2026)
RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners
by: Gajjar, Jugal, et al.
Published: (2026)
Long Input Sequence Network for Long Time Series Forecasting
by: Ma, Chao, et al.
Published: (2024)
Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
by: Oldfield, James, et al.
Published: (2025)
FaithLM: Towards Faithful Explanations for Large Language Models
by: Chuang, Yu-Neng, et al.
Published: (2024)
DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning
by: Zarch, Hossein Entezari, et al.
Published: (2025)
Incorporating Attribution Importance for Improving Faithfulness Metrics
by: Zhao, Zhixue, et al.
Published: (2023)
Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach
by: Deng, Guilin, et al.
Published: (2026)
Do Contemporary Causal Inference Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark
by: Yu, Haining, et al.
Published: (2024)
Toward a Theory of Tokenization in LLMs
by: Rajaraman, Nived, et al.
Published: (2024)
Planning Transformer: Long-Horizon Offline Reinforcement Learning with Planning Tokens
by: Clinton, Joseph, et al.
Published: (2024)
CRAFT: Calibrated Reasoning with Answer-Faithful Traces via Reinforcement Learning for Multi-Hop Question Answering
by: Liu, Yu, et al.
Published: (2026)
Towards Interpretable and Trustworthy Time Series Reasoning: A BlueSky Vision
by: Ning, Kanghui, et al.
Published: (2025)
How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines
by: Deng, Junwei, et al.
Published: (2026)
TokenShapley: Token Level Context Attribution with Shapley Value
by: Xiao, Yingtai, et al.
Published: (2025)
Beyond Output Faithfulness: Learning Attributions that Preserve Computational Pathways
by: Zhang, Siyu, et al.
Published: (2025)
Faithful and Robust Local Interpretability for Textual Predictions
by: Lopardo, Gianluigi, et al.
Published: (2023)
Adaptive Discovery of Interpretable Audio Attributes with Multimodal LLMs for Low-Resource Classification
by: Yoshimura, Kosuke, et al.
Published: (2026)