Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Rodionov, Gleb, Prokhorenkova, Liudmila
Format:	Preprint
Published:	2024
Subjects:	Machine Learning Computation and Language
Online Access:	https://arxiv.org/abs/2402.11628
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909734491127808
author	Rodionov, Gleb Prokhorenkova, Liudmila
author_facet	Rodionov, Gleb Prokhorenkova, Liudmila
contents	Neural algorithmic reasoning aims to capture computations with neural networks by training models to imitate the execution of classical algorithms. While common architectures are expressive enough to contain the correct model in the weight space, current neural reasoners struggle to generalize well on out-of-distribution data. On the other hand, classical computations are not affected by distributional shifts as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. To achieve this, we separate discrete and continuous data flows and describe the interaction between them. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on multiple algorithmic problems and achieve perfect test scores both in single-task and multitask setups. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_11628
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Discrete Neural Algorithmic Reasoning Rodionov, Gleb Prokhorenkova, Liudmila Machine Learning Computation and Language Neural algorithmic reasoning aims to capture computations with neural networks by training models to imitate the execution of classical algorithms. While common architectures are expressive enough to contain the correct model in the weight space, current neural reasoners struggle to generalize well on out-of-distribution data. On the other hand, classical computations are not affected by distributional shifts as they can be described as transitions between discrete computational states. In this work, we propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states. To achieve this, we separate discrete and continuous data flows and describe the interaction between them. Trained with supervision on the algorithm's state transitions, such models are able to perfectly align with the original algorithm. To show this, we evaluate our approach on multiple algorithmic problems and achieve perfect test scores both in single-task and multitask setups. Moreover, the proposed architectural choice allows us to prove the correctness of the learned algorithms for any test data.
title	Discrete Neural Algorithmic Reasoning
topic	Machine Learning Computation and Language
url	https://arxiv.org/abs/2402.11628

Similar Items