:: Library Catalog

Bibliographic Details
Main Authors:	Gopalani, Pulkit; Lubana, Ekdeep Singh; Hu, Wei
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2410.22244

Similar Items

What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
by: Gopalani, Pulkit, et al.
Published: (2025)

Global Convergence of SGD On Two Layer Neural Nets
by: Gopalani, Pulkit, et al.
Published: (2022)

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
by: Ramesh, Rahul, et al.
Published: (2023)

Representation Shattering in Transformers: A Synthetic Study with Knowledge Editing
by: Nishi, Kento, et al.
Published: (2024)

A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
by: Lubana, Ekdeep Singh, et al.
Published: (2024)

Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
by: Gopalani, Pulkit, et al.
Published: (2023)

How Do LLMs Persuade? Linear Probes Can Uncover Persuasion Dynamics in Multi-Turn Conversations
by: Jaipersaud, Brandon, et al.
Published: (2025)

Swing-by Dynamics in Concept Learning and Compositional Generalization
by: Yang, Yongyi, et al.
Published: (2024)

Analyzing (In)Abilities of SAEs via Formal Languages
by: Menon, Abhinav, et al.
Published: (2024)

Competition Dynamics Shape Algorithmic Phases of In-Context Learning
by: Park, Core Francisco, et al.
Published: (2024)

Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task
by: Okawa, Maya, et al.
Published: (2023)

Towards Size-Independent Generalization Bounds for Deep Operator Nets
by: Gopalani, Pulkit, et al.
Published: (2022)

Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space
by: Park, Core Francisco, et al.
Published: (2024)

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
by: Costa, Valérie, et al.
Published: (2025)

Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit
by: Costa, Valérie, et al.
Published: (2025)

Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
by: Hindupur, Sai Sumedh R., et al.
Published: (2025)

Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
by: Khona, Mikail, et al.
Published: (2024)

In-Context Learning Strategies Emerge Rationally
by: Wurgaft, Daniel, et al.
Published: (2025)

In-Context Learning Dynamics with Random Binary Sequences
by: Bigelow, Eric J., et al.
Published: (2023)

The Impact of Off-Policy Training Data on Probe Generalisation
by: Kirch, Nathalie, et al.
Published: (2025)

What Makes and Breaks Safety Fine-tuning? A Mechanistic Study
by: Jain, Samyak, et al.
Published: (2024)

From Isolation to Entanglement: When Do Interpretability Methods Identify and Disentangle Known Concepts?
by: Mueller, Aaron, et al.
Published: (2025)

Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering
by: Bigelow, Eric, et al.
Published: (2025)

Detecting High-Stakes Interactions with Activation Probes
by: McKenzie, Alex, et al.
Published: (2025)

ICLR: In-Context Learning of Representations
by: Park, Core Francisco, et al.
Published: (2024)

Features as Rewards: Scalable Supervision for Open-Ended Tasks via Interpretability
by: Prasad, Aaditya Vikram, et al.
Published: (2026)

Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
by: Bigelow, Eric, et al.
Published: (2026)

Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention
by: Huang, Jing, et al.
Published: (2026)

Emergence of Hierarchical Emotion Organization in Large Language Models
by: Zhao, Bo, et al.
Published: (2025)

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks
by: Jain, Samyak, et al.
Published: (2023)

Dac-Fake: A Divide and Conquer Framework for Detecting Fake News on Social Media
by: Jain, Mayank Kumar, et al.
Published: (2025)

Do Sparse Autoencoders Capture Concept Manifolds?
by: Bhalla, Usha, et al.
Published: (2026)

Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
by: Reuss, Moritz, et al.
Published: (2024)

Representational Transfer Learning for Matrix Completion
by: He, Yong, et al.
Published: (2024)

Vegetable Peeling: A Case Study in Constrained Dexterous Manipulation
by: Chen, Tao, et al.
Published: (2024)

Online Policy Learning and Inference by Matrix Completion
by: Duan, Congyuan, et al.
Published: (2024)

Truncated Matrix Completion - An Empirical Study
by: Naik, Rishhabh, et al.
Published: (2025)

TGRL: An Algorithm for Teacher Guided Reinforcement Learning
by: Shenfeld, Idan, et al.
Published: (2023)

Optimal Transfer Learning for Missing Not-at-Random Matrix Completion
by: Jalan, Akhil, et al.
Published: (2025)

RL's Razor: Why Online Reinforcement Learning Forgets Less
by: Shenfeld, Idan, et al.
Published: (2025)