Staff View: TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching :: Library Catalog

Bibliographic Details
Main Authors:	Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.19739

_version_	1866917225508634624
author	Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang
author_facet	Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang
contents	Fine tuning has been regarded as a de facto approach for adapting large language models (LLMs) to downstream tasks, but the high training memory consumption inherited from LLMs makes this process inefficient. Among existing memory-efficient approaches, activation-related optimization has proven particularly effective, as activations consistently dominate overall memory consumption. Although prior work offers various activation optimization strategies, their data-agnostic nature ultimately results in ineffective and unstable fine tuning. In this paper, we propose TokenSeek, a universal plugin solution for various transformer-based models through instance-aware token seeking and ditching, achieving significant fine-tuning memory savings (e.g., requiring only 14.8% of the memory on Llama3.2 1B) with on-par or even better performance. Furthermore, our interpretable token seeking process reveals the underlying reasons for its effectiveness, offering valuable insights for future research on token efficiency. Homepage: https://runjia.tech/iclr_tokenseek/
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_19739
institution	arXiv
publishDate	2026
record_format	arxiv
title	TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2601.19739

Similar Items