Library Catalog

Bibliographic Details
Main Authors:	Costarelli, Anthony; Allen, Mat; Field, Severin
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.02472

Similar Items

Intrinsic-Energy Joint Embedding Predictive Architectures Induce Quasimetric Spaces
by: Kobanda, Anthony, et al.
Published: (2026)

ctELM: Decoding and Manipulating Embeddings of Clinical Trials with Embedding Language Models
by: Ondov, Brian, et al.
Published: (2026)

Beyond Isolated Clients: Integrating Graph-Based Embeddings into Event Sequence Models
by: Proshian, Harry, et al.
Published: (2026)

SelfIE: Self-Interpretation of Large Language Model Embeddings
by: Chen, Haozhe, et al.
Published: (2024)

The Dual-Stream Transformer: Channelized Architecture for Interpretable Language Modeling
by: Kerce, J. Clayton, et al.
Published: (2026)

Concept Tokens: Learning Behavioral Embeddings Through Concept Definitions
by: Sastre, Ignacio, et al.
Published: (2026)

A General Framework for Producing Interpretable Semantic Text Embeddings
by: Sun, Yiqun, et al.
Published: (2024)

Interpreting Language Models Through Concept Descriptions: A Survey
by: Feldhus, Nils, et al.
Published: (2025)

Interpretable Robot Control via Structured Behavior Trees and Large Language Models
by: Chekam, Ingrid Maéva, et al.
Published: (2025)

GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents
by: Costarelli, Anthony, et al.
Published: (2024)

Hyperdimensional Probe: Decoding LLM Representations via Vector Symbolic Architectures
by: Bronzini, Marco, et al.
Published: (2025)

Mechanistic Interpretability of Fine-Tuned Vision Transformers on Distorted Images: Decoding Attention Head Behavior for Transparent and Trustworthy AI
by: Bahador, Nooshin
Published: (2025)

Meta Additive Model: Interpretable Sparse Learning With Auto Weighting
by: Zhang, Xuelin, et al.
Published: (2026)

Discovering Chunks in Neural Embeddings for Interpretability
by: Wu, Shuchen, et al.
Published: (2025)

Interpretable Perturbation Modeling Through Biomedical Knowledge Graphs
by: Passigan, Pascal, et al.
Published: (2025)

Integrating Meta-Features with Knowledge Graph Embeddings for Meta-Learning
by: Klironomos, Antonis, et al.
Published: (2026)

EmbedLLM: Learning Compact Representations of Large Language Models
by: Zhuang, Richard, et al.
Published: (2024)

Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model
by: Wang, Jin, et al.
Published: (2024)

BEACON: Behavioral Malware Classification with Large Language Model Embeddings and Deep Learning
by: Perera, Wadduwage Shanika, et al.
Published: (2025)

Decoding Latent Spaces: Assessing the Interpretability of Time Series Foundation Models for Visual Analytics
by: Santamaria-Valenzuela, Inmaculada, et al.
Published: (2025)

Nature Language Model: Deciphering the Language of Nature for Scientific Discovery
by: Xia, Yingce, et al.
Published: (2025)

AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents
by: Xu, Shuyuan, et al.
Published: (2024)

Interpreting Outliers in Time Series Data through Decoding Autoencoder
by: Knab, Patrick, et al.
Published: (2024)

Tokenized Bandit for LLM Decoding and Alignment
by: Shin, Suho, et al.
Published: (2025)

Beyond the Answer: Decoding the Behavior of LLMs as Scientific Reasoners
by: Pandey, Rohan, et al.
Published: (2026)

LLM4GNAS: A Large Language Model Based Toolkit for Graph Neural Architecture Search
by: Gao, Yang, et al.
Published: (2025)

Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability
by: Bhalla, Usha, et al.
Published: (2025)

BONSAI: Bayesian Optimization with Natural Simplicity and Interpretability
by: Daulton, Samuel, et al.
Published: (2026)

SemCSE-Multi: Multifaceted and Decodable Embeddings for Aspect-Specific and Interpretable Scientific Domain Mapping
by: Brinner, Marc, et al.
Published: (2025)

Why do Experts Disagree on Existential Risk and P(doom)? A Survey of AI Experts
by: Field, Severin
Published: (2025)

Interpretable-by-Design Transformers via Architectural Stream Independence
by: Kerce, Clayton, et al.
Published: (2026)

Distributed Interpretability and Control for Large Language Models
by: Desai, Dev Arpan, et al.
Published: (2026)

LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection
by: Bal-Ghaoui, Mohamed, et al.
Published: (2025)

Meta-cognitive Multi-scale Hierarchical Reasoning for Motor Imagery Decoding
by: Kim, Si-Hyun, et al.
Published: (2025)

Interpretable Embeddings with Sparse Autoencoders: A Data Analysis Toolkit
by: Jiang, Nick, et al.
Published: (2025)

GradMetaNet: An Equivariant Architecture for Learning on Gradients
by: Gelberg, Yoav, et al.
Published: (2025)

Embedded Quantum Machine Learning in Embedded Systems: Feasibility, Hybrid Architectures, and Quantum Co-Processors
by: Dey, Somdip, et al.
Published: (2026)

NEZHA: A Zero-sacrifice and Hyperspeed Decoding Architecture for Generative Recommendations
by: Wang, Yejing, et al.
Published: (2025)

CBMAS: Cognitive Behavioral Modeling via Activation Steering
by: Ismail, Ahmed H., et al.
Published: (2026)

Pragmatic Policy Development via Interpretable Behavior Cloning
by: Matsson, Anton, et al.
Published: (2025)