| Main Authors: | Zerihun, Liyu; Plashchinsky, Alexandr |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2602.00535 |
| _version_ | 1866917238431285248 |
|---|---|
| author | Zerihun, Liyu; Plashchinsky, Alexandr |
| author_facet | Zerihun, Liyu; Plashchinsky, Alexandr |
| contents | Long-sequence neural memory remains a challenging problem. RNNs and their variants suffer from vanishing gradients, and Transformers suffer from quadratic scaling. Furthermore, compressing long sequences into a finite fixed representation remains an intractable problem due to the difficult optimization landscape. Invertible Memory Flow Networks (IMFN) make long-sequence compression tractable through factorization: instead of learning end-to-end compression, we decompose the problem into pairwise merges using a binary tree of "sweeper" modules. Rather than learning to compress long sequences, each sweeper learns a much simpler 2-to-1 compression task, achieving O(log N) depth with sublinear error accumulation in sequence length. For online inference, we distill the model into a constant-cost recurrent student that achieves O(1) sequential steps. Empirical results validate IMFN on long MNIST sequences and UCF-101 videos, demonstrating compression of high-dimensional data over long sequences. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_00535 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Invertible Memory Flow Networks |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2602.00535 |
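
The abstract describes compressing a sequence through pairwise merges arranged as a binary tree, giving O(log N) depth. The following is a minimal sketch of that tree structure only, not the authors' implementation: `placeholder_sweeper` (an element-wise mean) is a hypothetical stand-in for a learned 2-to-1 sweeper, and `tree_compress` is an illustrative name that does not come from the record.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def placeholder_sweeper(left: Vector, right: Vector) -> Vector:
    """Hypothetical stand-in for a learned 2-to-1 sweeper: element-wise mean."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]


def tree_compress(
    items: Sequence[Vector],
    merge: Callable[[Vector, Vector], Vector] = placeholder_sweeper,
) -> Tuple[Vector, int]:
    """Reduce a sequence to a single vector by repeated pairwise merges.

    Returns the final vector and the number of merge levels (tree depth).
    """
    if not items:
        raise ValueError("need at least one item")
    level = list(items)
    depth = 0
    while len(level) > 1:
        # Merge neighbours (0,1), (2,3), ... at the current level.
        merged = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            merged.append(level[-1])  # an unpaired last item is carried up unchanged
        level = merged
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    sequence = [[float(i)] for i in range(1000)]
    summary, depth = tree_compress(sequence)
    print(len(summary), depth)
```

Each level halves the number of items (rounding up), so a sequence of N items needs ceil(log2 N) merge levels; a real sweeper would replace the mean with a trained module.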