Saved in:
Bibliographic Details
Main Authors: Rodionov, Gleb, Prokhorenkova, Liudmila
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2402.11628
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866909734491127808
author Rodionov, Gleb
Prokhorenkova, Liudmila
author_facet Rodionov, Gleb
Prokhorenkova, Liudmila
contents Neural algorithmic reasoning aims to capture computations with neural networks by training models to imitate the execution of classical algorithms. While common architectures are expressive enough to contain the correct model in the weight space, current neural reasoners struggle to generalize well on out-of-distribution data. On the other hand, classical computations are not affected by distributional shifts as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. To achieve this, we separate discrete and continuous data flows and describe the interaction between them. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on multiple algorithmic problems and achieve perfect test scores both in single-task and multitask setups. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
format Preprint
id arxiv_https___arxiv_org_abs_2402_11628
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Discrete Neural Algorithmic Reasoning
Rodionov, Gleb
Prokhorenkova, Liudmila
Machine Learning
Computation and Language
Neural algorithmic reasoning aims to capture computations with neural networks by training models to imitate the execution of classical algorithms. While common architectures are expressive enough to contain the correct model in the weight space, current neural reasoners struggle to generalize well on out-of-distribution data. On the other hand, classical computations are not affected by distributional shifts as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. To achieve this, we separate discrete and continuous data flows and describe the interaction between them. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on multiple algorithmic problems and achieve perfect test scores both in single-task and multitask setups. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
title Discrete Neural Algorithmic Reasoning
topic Machine Learning
Computation and Language
url https://arxiv.org/abs/2402.11628