| Main Authors: | Zhou, Guancheng, Luo, Yisi, He, Zhengfu, Jin, Zhenyu, Ge, Xuyang, Shu, Wentao, Meng, Deyu, Qiu, Xipeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.17504 |
Similar Items
Dimensional Collapse in Transformer Attention Outputs: A Challenge for Sparse Dictionary Learning
by: Wang, Junxuan, et al.
Published: (2025)
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures
by: Wang, Junxuan, et al.
Published: (2024)
Tracing the Thought of a Grandmaster-level Chess-Playing Transformer
by: Lin, Rui, et al.
Published: (2026)
Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT
by: He, Zhengfu, et al.
Published: (2024)
Automatically Identifying Local and Global Circuits with Linear Computation Graphs
by: Ge, Xuyang, et al.
Published: (2024)
Evolution of Concepts in Language Model Pre-Training
by: Ge, Xuyang, et al.
Published: (2025)
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition
by: He, Zhengfu, et al.
Published: (2025)
Compressive Imaging Reconstruction via Tensor Decomposed Multi-Resolution Grid Encoding
by: Jin, Zhenyu, et al.
Published: (2025)
Revisiting Nonlocal Self-Similarity from Continuous Representation
by: Luo, Yisi, et al.
Published: (2024)
Continuous Representation Methods, Theories, and Applications: An Overview and Perspectives
by: Luo, Yisi, et al.
Published: (2025)
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders
by: He, Zhengfu, et al.
Published: (2024)
Unveiling the Mechanism of Continuous Representation Full-Waveform Inversion: A Wave Based Neural Tangent Kernel Framework
by: Chen, Ruihua, et al.
Published: (2026)
NeurTV: Total Variation on the Neural Domain
by: Luo, Yisi, et al.
Published: (2024)
Cross-Frequency Implicit Neural Representation with Self-Evolving Parameters
by: Yu, Chang, et al.
Published: (2025)
Beyond Low-rankness: Guaranteed Matrix Recovery via Modified Nuclear Norm
by: Peng, Jiangjun, et al.
Published: (2025)
Deciphering Neural Reparameterized Full-Waveform Inversion with Neural Sensitivity Kernel and Wave Tangent Kernel
by: Chen, Ruihua, et al.
Published: (2026)
Neural Approximation and Its Applications
by: Wu, Wei-Hao, et al.
Published: (2026)
Spatial Information Bottleneck for Interpretable Visual Recognition
by: Shu, Kaixiang, et al.
Published: (2025)
Neural Uncertainty Principle: A Unified View of Adversarial Fragility and LLM Hallucination
by: Zhang, Dong-Xiao, et al.
Published: (2026)
Generate Point Clouds with Multiscale Details from Graph-Represented Structures
by: Yang, Ximing, et al.
Published: (2021)
Can AI Assistants Know What They Don't Know?
by: Cheng, Qinyuan, et al.
Published: (2024)
Making Large Language Models Better Reasoners with Orchestrated Streaming Experiences
by: Liu, Xiangyang, et al.
Published: (2025)
Faithful and Stable Neuron Explanations for Trustworthy Mechanistic Interpretability
by: Yan, Ge, et al.
Published: (2025)
Simultaneous Swap Regret Minimization via KL-Calibration
by: Luo, Haipeng, et al.
Published: (2025)
Principled Out-of-Distribution Generalization via Simplicity
by: Ge, Jiawei, et al.
Published: (2025)
Do Agents Think Deeper? A Mechanistic Investigation of Layer-Wise Dynamics in Sequential Planning
by: Cui, Zhenyu, et al.
Published: (2026)
Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers
by: Cao, Jin, et al.
Published: (2024)
TRG-Net: An Interpretable and Controllable Rain Generator
by: Pang, Zhiqiang, et al.
Published: (2024)
GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation
by: Zong, Yi, et al.
Published: (2024)
A Novel Control Method of Sugar Boiling Based on Model‐Free Adaptive Control and Neural Networks
by: Guancheng Lu, et al.
Published: (2025)
Soft-Evidence Fused Graph Neural Network for Cancer Driver Gene Identification across Multi-View Biological Graphs
by: Chen, Bang, et al.
Published: (2025)
A Unified Dual Consensus Approach to Distributed Optimization with Globally-Coupled Constraints
by: Liu, Zixuan, et al.
Published: (2025)
Dual Conic Proxy for Semidefinite Relaxation of AC Optimal Power Flow
by: Qiu, Guancheng, et al.
Published: (2025)
Dual Conic Proxies for AC Optimal Power Flow
by: Qiu, Guancheng, et al.
Published: (2023)
Knockdown of PRDX2 Inhibits the Proliferation, Growth, Migration, Invasion, and MMP9 Activity of Ewing's Sarcoma Cells Cultured In Vitro
by: Ruifeng Xue, et al.
Published: (2024)
Improving Memory Efficiency for Training KANs via Meta Learning
by: Zhao, Zhangchi, et al.
Published: (2025)
Understanding the Generalization of Bilevel Programming in Hyperparameter Optimization: A Tale of Bias-Variance Decomposition
by: Zhou, Yubo, et al.
Published: (2026)
A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws
by: Shu, Jun, et al.
Published: (2026)
How Attention Sinks Emerge in Large Language Models: An Interpretability Perspective
by: Peng, Runyu, et al.
Published: (2026)
Interpretability in Parameter Space: Minimizing Mechanistic Description Length with Attribution-based Parameter Decomposition
by: Braun, Dan, et al.
Published: (2025)