Staff View: Compressed Computation: Dense Circuits in a Toy Model of the Universal-AND Problem :: Library Catalog

Bibliographic Details
Main Author:	Newgas, Adam
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2507.09816

_version_	1866918090986487808
author	Newgas, Adam
author_facet	Newgas, Adam
contents	Neural networks are capable of superposition -- representing more features than there are dimensions. Recent work considers the analogous concept for computation instead of storage, proposing theoretical constructions. But there has been little investigation into whether these circuits can be learned in practice. In this work, we investigate a toy model for the Universal-AND problem which computes the AND of all $m\choose 2$ pairs of $m$ sparse inputs. The hidden dimension that determines the number of non-linear activations is restricted to pressure the model to find a compute-efficient circuit, called compressed computation. We find that the training process finds a simple solution that does not correspond to theoretical constructions. It is fully dense -- every neuron contributes to every output. The solution circuit naturally scales with dimension, trading off error rates for neuron efficiency. It is similarly robust to changes in sparsity and other key parameters, and extends naturally to other boolean operations and boolean circuits. We explain the found solution in detail and compute why it is more efficient than the theoretical constructions at low sparsity. Our findings shed light on the types of circuits that models like to form and the flexibility of the superposition representation. This contributes to a broader understanding of network circuitry and interpretability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_09816
institution	arXiv
publishDate	2025
record_format	arxiv
title	Compressed Computation: Dense Circuits in a Toy Model of the Universal-AND Problem
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2507.09816
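
The Universal-AND target described in the contents field can be illustrated with a small sketch. This is not the paper's model or training code; it only shows the ground-truth function a network would be asked to approximate, the AND of every unordered pair of inputs. The function name `universal_and`, the lexicographic ordering of pairs, and the example input are assumptions made for illustration.

```python
from itertools import combinations


def universal_and(x):
    """Return the AND of every unordered pair of boolean inputs.

    For m inputs the output has m*(m-1)/2 entries. Pairs are taken in
    lexicographic order, (0,1), (0,2), ..., (m-2,m-1); this ordering is
    an assumption of the sketch, not something the record specifies.
    """
    return [int(a and b) for a, b in combinations(x, 2)]


# Hypothetical sparse input with m = 5 features, two of them active.
x = [1, 0, 0, 1, 0]
y = universal_and(x)

# Only the pair (0, 3) has both inputs active, so exactly one output is 1.
print(len(y), sum(y))
```

With k active inputs, exactly k*(k-1)/2 of the outputs are 1, so sparse inputs yield even sparser targets.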
