Library Catalog

Bibliographic Details
Main Author:	Ou, Weinuo
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.11609

Similar Items

Exact Linear Attention
by: Ou, Weinuo
Published: (2026)

Compressed Context Memory For Online Language Model Interaction
by: Kim, Jang-Hyun, et al.
Published: (2023)

PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models
by: Gupta, Neelesh, et al.
Published: (2024)

Invertible Memory Flow Networks
by: Zerihun, Liyu, et al.
Published: (2026)

Trellis: Learning to Compress Key-Value Memory in Attention Models
by: Karami, Mahdi, et al.
Published: (2025)

Experimental Analysis of Large-scale Learnable Vector Storage Compression
by: Zhang, Hailin, et al.
Published: (2023)

Clustering-driven Memory Compression for On-device Large Language Models
by: Bohdal, Ondrej, et al.
Published: (2026)

Mathematical Formalism for Memory Compression in Selective State Space Models
by: Bhat, Siddhanth
Published: (2024)

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression
by: Li, Kunjun, et al.
Published: (2025)

Towards Compressive and Scalable Recurrent Memory
by: Song, Yunchong, et al.
Published: (2026)

An Efficient Compression of Deep Neural Network Checkpoints Based on Prediction and Context Modeling
by: Kim, Yuriy, et al.
Published: (2025)

Memory Bank Compression for Continual Adaptation of Large Language Models
by: Katraouras, Thomas, et al.
Published: (2026)

Lattice: Learning to Efficiently Compress the Memory
by: Karami, Mahdi, et al.
Published: (2025)

Neural Weight Compression for Language Models
by: Ryu, Jegwang, et al.
Published: (2025)

WorldPack: Compressed Memory Improves Spatial Consistency in Video World Modeling
by: Oshima, Yuta, et al.
Published: (2025)

Auto-FlexSwitch: Efficient Dynamic Model Merging via Learnable Task Vector Compression
by: Gao, Junqi, et al.
Published: (2026)

LoMA: Lossless Compressed Memory Attention
by: Wang, Yumeng, et al.
Published: (2024)

MELODI: Exploring Memory Compression for Long Contexts
by: Chen, Yinpeng, et al.
Published: (2024)

NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
by: Hao, Yongchang, et al.
Published: (2024)

MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation
by: Shen, Wei, et al.
Published: (2025)

Efficient Model Compression for Bayesian Neural Networks
by: Saha, Diptarka, et al.
Published: (2024)

Less Memory Means smaller GPUs: Backpropagation with Compressed Activations
by: Barley, Daniel, et al.
Published: (2024)

Chain-of-Thought and Compressed Looped Transformers: A Memory-Budget Separation
by: Zhang, Haozhou
Published: (2026)

Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation
by: Mo, Zihao, et al.
Published: (2024)

BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments
by: Wang, Xinghao, et al.
Published: (2024)

Adaptive Data Compression and Reconstruction for Memory-Bounded EEG Continual Learning
by: Xie, Chengcheng
Published: (2026)

CompAct: Compressed Activations for Memory-Efficient LLM Training
by: Shamshoum, Yara, et al.
Published: (2024)

On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee
by: Li, Chenyang, et al.
Published: (2023)

Robust Learnability of Sample-Compressible Distributions under Noisy or Adversarial Perturbations
by: Boushehrian, Arefe, et al.
Published: (2025)

Language Model Memory and Memory Models for Language
by: Badger, Benjamin L.
Published: (2026)

Goal-Directed Search Outperforms Goal-Agnostic Memory Compression in Long-Context Memory Tasks
by: Zheng, Yicong, et al.
Published: (2025)

Memory-Driven Self-Improvement for Decision Making with Large Language Models
by: Yan, Xue, et al.
Published: (2025)

Hyper-Compression: Model Compression via Hyperfunction
by: Fan, Fenglei, et al.
Published: (2024)

PRAC: Principal-Random Subspace for LLM Activation Compression and Memory-Efficient Training
by: Li, Yanyi, et al.
Published: (2026)

Memory-Efficient Fine-Tuning via Low-Rank Activation Compression
by: Shi, Jiang-Xin, et al.
Published: (2025)

Big2Small: A Unifying Neural Network Framework for Model Compression
by: Liao, Jing-Xiao, et al.
Published: (2026)

Neural Embedding Compression For Efficient Multi-Task Earth Observation Modelling
by: Gomes, Carlos, et al.
Published: (2024)

LightThinker++: From Reasoning Compression to Memory Management
by: Zhu, Yuqi, et al.
Published: (2026)

Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
by: Gorbett, Matt, et al.
Published: (2024)

Forget, Then Recall: Learnable Compression and Selective Unfolding via Gist Sparse Attention
by: Mao, Yuzhen, et al.
Published: (2026)