Saved in:
Bibliographic Details
Main Authors: Menezes, Michael, Kyrillidis, Anastasios
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2602.11408
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866915794137382912
author Menezes, Michael
Kyrillidis, Anastasios
author_facet Menezes, Michael
Kyrillidis, Anastasios
contents While Mamba2's expanded state dimension enhances temporal modeling, it incurs substantial inference overhead that saturates bandwidth during autoregressive generation. Standard pruning methods fail to address this bottleneck: unstructured sparsity leaves activations dense, magnitude-based selection ignores runtime dynamics, and gradient-based methods impose prohibitive costs. We introduce GHOST (Grouped Hidden-state Output-aware Selection and Truncation), a structured pruning framework that approximates control-theoretic balanced truncation using only forward-pass statistics. By jointly measuring controllability and observability, GHOST rivals the fidelity of gradient-based methods without requiring backpropagation. As a highlight, on models ranging from 130M to 2.7B parameters, our approach achieves a 50\% state-dimension reduction with approximately 1 perplexity point increase on WikiText-2. Code is available at https://anonymous.4open.science/r/mamba2_ghost-7BCB/.
format Preprint
id arxiv_https___arxiv_org_abs_2602_11408
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle GHOST: Unmasking Phantom States in Mamba2 via Grouped Hidden-state Output-aware Selection & Truncation
Menezes, Michael
Kyrillidis, Anastasios
Artificial Intelligence
Machine Learning
Systems and Control
While Mamba2's expanded state dimension enhances temporal modeling, it incurs substantial inference overhead that saturates bandwidth during autoregressive generation. Standard pruning methods fail to address this bottleneck: unstructured sparsity leaves activations dense, magnitude-based selection ignores runtime dynamics, and gradient-based methods impose prohibitive costs. We introduce GHOST (Grouped Hidden-state Output-aware Selection and Truncation), a structured pruning framework that approximates control-theoretic balanced truncation using only forward-pass statistics. By jointly measuring controllability and observability, GHOST rivals the fidelity of gradient-based methods without requiring backpropagation. As a highlight, on models ranging from 130M to 2.7B parameters, our approach achieves a 50\% state-dimension reduction with approximately 1 perplexity point increase on WikiText-2. Code is available at https://anonymous.4open.science/r/mamba2_ghost-7BCB/.
title GHOST: Unmasking Phantom States in Mamba2 via Grouped Hidden-state Output-aware Selection & Truncation
topic Artificial Intelligence
Machine Learning
Systems and Control
url https://arxiv.org/abs/2602.11408