| Main Authors: | Lee, Andrew, Viégas, Fernanda, Wattenberg, Martin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.09967 |
Similar Items
Shared Global and Local Geometry of Language Model Embeddings
by: Lee, Andrew, et al.
Published: (2025)
Decomposing Query-Key Feature Interactions Using Contrastive Covariances
by: Lee, Andrew, et al.
Published: (2026)
Relational Composition in Neural Networks: A Survey and Call to Action
by: Wattenberg, Martin, et al.
Published: (2024)
Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
by: Li, Kenneth, et al.
Published: (2024)
The Geometry of Self-Verification in a Task-Specific Reasoning Model
by: Lee, Andrew, et al.
Published: (2025)
Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls
by: Bai, Xiaoyan, et al.
Published: (2025)
When Bad Data Leads to Good Models
by: Li, Kenneth, et al.
Published: (2025)
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
by: Li, Kenneth, et al.
Published: (2022)
Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
by: Li, Kenneth, et al.
Published: (2023)
Chronotome: Real-Time Topic Modeling for Streaming Embedding Spaces
by: Lim, Matte, et al.
Published: (2025)
Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
by: Li, Kenneth, et al.
Published: (2024)
Story Ribbons: Reimagining Storyline Visualizations with Large Language Models
by: Yeh, Catherine, et al.
Published: (2025)
What Does it Mean for a Neural Network to Learn a "World Model"?
by: Li, Kenneth, et al.
Published: (2025)
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
by: Li, Kenneth, et al.
Published: (2024)
ICLR: In-Context Learning of Representations
by: Park, Core Francisco, et al.
Published: (2024)
Attention-based Iterative Decomposition for Tensor Product Representation
by: Park, Taewon, et al.
Published: (2024)
Optimal Representation Size: High-Dimensional Analysis of Pretraining and Linear Probing
by: Njaradi, Valentina, et al.
Published: (2026)
Learning Shared Representations for Multi-Task Linear Bandits
by: Lin, Jiabin, et al.
Published: (2026)
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior
by: Wurgaft, Daniel, et al.
Published: (2026)
Graph Edge Representation via Tensor Product Graph Convolutional Representation
by: Jiang, Bo, et al.
Published: (2024)
Does visualization help AI understand data?
by: Li, Victoria R., et al.
Published: (2025)
Semantic Convergence: Investigating Shared Representations Across Scaled LLMs
by: Son, Daniel, et al.
Published: (2025)
On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations
by: Xiong, Guojun, et al.
Published: (2024)
Towards Sampling Data Structures for Tensor Products in Turnstile Streams
by: Song, Zhao, et al.
Published: (2025)
Representation Alignment Rests on Linear Structure
by: Bangachev, Kiril, et al.
Published: (2026)
Contrastive Learning of Shared Spatiotemporal EEG Representations Across Individuals for Naturalistic Neuroscience
by: Shen, Xinke, et al.
Published: (2024)
Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories
by: Xu, Yongzhong
Published: (2026)
Tensor Completion with Nearly Linear Samples Given Weak Side Information
by: Yu, Christina Lee, et al.
Published: (2020)
Priors in Time: Missing Inductive Biases for Language Model Interpretability
by: Lubana, Ekdeep Singh, et al.
Published: (2025)
Rhetorical Questions in LLM Representations: A Linear Probing Study
by: Yao, Louie Hong, et al.
Published: (2026)
E2Former: An Efficient and Equivariant Transformer with Linear-Scaling Tensor Products
by: Li, Yunyang, et al.
Published: (2025)
Fully Distributed, Flexible Compositional Visual Representations via Soft Tensor Products
by: Sun, Bethia, et al.
Published: (2024)
A Single Direction of Truth: An Observer Model's Linear Residual Probe Exposes and Steers Contextual Hallucinations
by: O'Neill, Charles, et al.
Published: (2025)
Dynamics Reveals Structure: Challenging the Linear Propagation Assumption
by: Chang, Hoyeon, et al.
Published: (2026)
Learning Representations for Reasoning: Generalizing Across Diverse Structures
by: Zhu, Zhaocheng
Published: (2024)
Structured Unitary Tensor Network Representations for Circuit-Efficient Quantum Data Encoding
by: Lin, Guang, et al.
Published: (2026)
Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation
by: Ding, Shihong, et al.
Published: (2026)
Learning Fine-grained Parameter Sharing via Sparse Tensor Decomposition
by: Üyük, Cem, et al.
Published: (2024)
Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behavior
by: Arturi, Daniel Aarao Reis, et al.
Published: (2025)
Can Interpretation Predict Behavior on Unseen Data?
by: Li, Victoria R., et al.
Published: (2025)