| Main Authors: | Sawmya, Shashata, Adler, Micah, Shavit, Nir |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.19440 |
Similar Items
Wasserstein Distances, Neuronal Entanglement, and Sparsity
by: Sawmya, Shashata, et al.
Published: (2024)
Cascade Detector Analysis and Application to Biomedical Microscopy
by: Athey, Thomas L., et al.
Published: (2025)
NeuroADDA: Active Discriminative Domain Adaptation in Connectomic
by: Sawmya, Shashata, et al.
Published: (2025)
Towards Combinatorial Interpretability of Neural Computation
by: Adler, Micah, et al.
Published: (2025)
Negative Pre-activations Differentiate Syntax
by: Kong, Linghao, et al.
Published: (2025)
Understanding Empirical Unlearning with Combinatorial Interpretability
by: Kodama, Shingo, et al.
Published: (2026)
Learning to Interpret Weight Differences in Language Models
by: Goel, Avichal, et al.
Published: (2025)
On the Complexity of Neural Computation in Superposition
by: Adler, Micah, et al.
Published: (2024)
Expand Neurons, Not Parameters
by: Kong, Linghao, et al.
Published: (2025)
Toy Combinatorial Interpretability Models Reveal Lottery Tickets in Early Feature Space
by: Bebchuk, Alon, et al.
Published: (2026)
Forbidden Facts: An Investigation of Competing Objectives in Llama-2
by: Wang, Tony T., et al.
Published: (2023)
Semantic Structure of Feature Space in Large Language Models
by: Kozlowski, Austin C., et al.
Published: (2026)
Emergent Abilities in Reduced-Scale Generative Language Models
by: Muckatira, Sherin, et al.
Published: (2024)
Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models
by: Yang, Junjie, et al.
Published: (2025)
Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models
by: Lee, Joseph, et al.
Published: (2024)
Emergent Abilities in Large Language Models: A Survey
by: Berti, Leonardo, et al.
Published: (2025)
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
by: Liu, Xiaoze, et al.
Published: (2024)
Rotary Offset Features in Large Language Models
by: Jonasson, André
Published: (2025)
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
by: Jain, Neel, et al.
Published: (2024)
A Dual-Space Framework for General Knowledge Distillation of Large Language Models
by: Zhang, Xue, et al.
Published: (2025)
Quantifying Feature Space Universality Across Large Language Models via Sparse Autoencoders
by: Lan, Michael, et al.
Published: (2024)
On the Reliability of Watermarks for Large Language Models
by: Kirchenbauer, John, et al.
Published: (2023)
Uncovering Emergent Physics Representations Learned In-Context by Large Language Models
by: Song, Yeongwoo, et al.
Published: (2025)
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models
by: Liang, Kaiqu, et al.
Published: (2025)
Automatically Interpreting Millions of Features in Large Language Models
by: Paulo, Gonçalo, et al.
Published: (2024)
Persistent Topological Features in Large Language Models
by: Gardinazzi, Yuri, et al.
Published: (2024)
RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance
by: Chen, Tianlang, et al.
Published: (2025)
Neuron-Level Knowledge Attribution in Large Language Models
by: Yu, Zeping, et al.
Published: (2023)
Provable Scaling Laws for the Test-Time Compute of Large Language Models
by: Chen, Yanxi, et al.
Published: (2024)
Task-Stratified Knowledge Scaling Laws for Post-Training Quantized Large Language Models
by: Zhou, Chenxi, et al.
Published: (2025)
Latent Feature Mining for Predictive Model Enhancement with Large Language Models
by: Li, Bingxuan, et al.
Published: (2024)
Knowledge Distillation from Large Language Models for Household Energy Modeling
by: Takrouri, Mohannad, et al.
Published: (2025)
Language Models Represent Space and Time
by: Gurnee, Wes, et al.
Published: (2023)
Emergent Stack Representations in Modeling Counter Languages Using Transformers
by: Tiwari, Utkarsh, et al.
Published: (2025)
Scaling Laws for Discriminative Classification in Large Language Models
by: Wyatte, Dean, et al.
Published: (2024)
A Survey on Symbolic Knowledge Distillation of Large Language Models
by: Acharya, Kamal, et al.
Published: (2024)
Testing Uncertainty of Large Language Models for Physics Knowledge and Reasoning
by: Reganova, Elizaveta, et al.
Published: (2024)
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models
by: Chua, Lynn, et al.
Published: (2024)
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
by: Ruis, Laura, et al.
Published: (2024)
Large Language Models Lack Temporal Awareness of Medical Knowledge
by: Guan, Zihan, et al.
Published: (2026)