:: Library Catalog

Bibliographic Details
Main Authors:	Pan, Wenbo, Liu, Zhichao, Wang, Xianlong, Yu, Haining, Jia, Xiaohua
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.01914

Similar Items

The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions
by: Pan, Wenbo, et al.
Published: (2025)

WebTrap: Stealthy Mid-Task Hijacking of Browser Agents During Navigation
by: Liu, Zhichao, et al.
Published: (2026)

Prune-OPD: Efficient and Reliable On-Policy Distillation for Long-Horizon Reasoning
by: Yang, Zhicheng, et al.
Published: (2026)

Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning
by: Jia, Jinghan, et al.
Published: (2026)

Faithful or Just Plausible? Evaluating the Faithfulness of Closed-Source LLMs in Medical Reasoning
by: Afolabi, Halimat, et al.
Published: (2026)

TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs
by: Zhang, Yuxiang, et al.
Published: (2025)

The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL
by: Li, Yingru, et al.
Published: (2026)

Faithful Interpretation for Graph Neural Networks
by: Hu, Lijie, et al.
Published: (2024)

AttributionLab: Faithfulness of Feature Attribution Under Controllable Environments
by: Zhang, Yang, et al.
Published: (2023)

MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)

TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)

Faithful and Efficient Explanations for Neural Networks via Neural Tangent Kernel Surrogate Models
by: Engel, Andrew, et al.
Published: (2023)

LEGO: A Lightweight and Efficient Multiple-Attribute Unlearning Framework for Recommender Systems
by: Yu, Fengyuan, et al.
Published: (2025)

TracLLM: A Generic Framework for Attributing Long Context LLMs
by: Wang, Yanting, et al.
Published: (2025)

Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models
by: Hong, Ilgee, et al.
Published: (2025)

Recursive Models for Long-Horizon Reasoning
by: Yang, Chenxiao, et al.
Published: (2026)

Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning
by: Liu, Hanbing, et al.
Published: (2025)

Improving Interpretation Faithfulness for Vision Transformers
by: Hu, Lijie, et al.
Published: (2023)

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
by: Liu, Gabrielle Kaili-May, et al.
Published: (2025)

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL
by: Wu, Ian, et al.
Published: (2026)

LouisKV: Efficient KV Cache Retrieval for Long Input-Output Sequences
by: Wu, Wenbo, et al.
Published: (2025)

Ideal Attribution and Faithful Watermarks for Language Models
by: Song, Min Jae, et al.
Published: (2025)

LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning
by: Motwani, Sumeet Ramesh, et al.
Published: (2026)

RSAT: Structured Attribution Makes Small Language Models Faithful Table Reasoners
by: Gajjar, Jugal, et al.
Published: (2026)

Long Input Sequence Network for Long Time Series Forecasting
by: Ma, Chao, et al.
Published: (2024)

Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders
by: Oldfield, James, et al.
Published: (2025)

FaithLM: Towards Faithful Explanations for Large Language Models
by: Chuang, Yu-Neng, et al.
Published: (2024)

DELTA: Dynamic Layer-Aware Token Attention for Efficient Long-Context Reasoning
by: Zarch, Hossein Entezari, et al.
Published: (2025)

Incorporating Attribution Importance for Improving Faithfulness Metrics
by: Zhao, Zhixue, et al.
Published: (2023)

Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach
by: Deng, Guilin, et al.
Published: (2026)

Do Contemporary Causal Inference Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark
by: Yu, Haining, et al.
Published: (2024)

Toward a Theory of Tokenization in LLMs
by: Rajaraman, Nived, et al.
Published: (2024)

Planning Transformer: Long-Horizon Offline Reinforcement Learning with Planning Tokens
by: Clinton, Joseph, et al.
Published: (2024)

CRAFT: Calibrated Reasoning with Answer-Faithful Traces via Reinforcement Learning for Multi-Hop Question Answering
by: Liu, Yu, et al.
Published: (2026)

Towards Interpretable and Trustworthy Time Series Reasoning: A BlueSky Vision
by: Ning, Kanghui, et al.
Published: (2025)

How Faithful Is Trajectory-Based Data Attribution? Error Sources, Remedies, and Practical Guidelines
by: Deng, Junwei, et al.
Published: (2026)

TokenShapley: Token Level Context Attribution with Shapley Value
by: Xiao, Yingtai, et al.
Published: (2025)

Beyond Output Faithfulness: Learning Attributions that Preserve Computational Pathways
by: Zhang, Siyu, et al.
Published: (2025)

Faithful and Robust Local Interpretability for Textual Predictions
by: Lopardo, Gianluigi, et al.
Published: (2023)

Adaptive Discovery of Interpretable Audio Attributes with Multimodal LLMs for Low-Resource Classification
by: Yoshimura, Kosuke, et al.
Published: (2026)