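| Main Authors: | Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang |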
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computation and Language; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2601.19739 |
| _version_ | 1866917225508634624 |
|---|---|
| author | Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang |
| author_facet | Zeng, Runjia; Wang, Qifan; Guan, Qiang; Tang, Ruixiang; Huang, Lifu; Wang, Zhenting; Zhang, Xueling; Han, Cheng; Liu, Dongfang |
| contents | Fine-tuning is regarded as the de facto approach for adapting large language models (LLMs) to downstream tasks, but the high training memory consumption inherited from LLMs makes this process inefficient. Among existing memory-efficient approaches, activation-related optimization has proven particularly effective, as activations consistently dominate overall memory consumption. Although prior work offers various activation optimization strategies, their data-agnostic nature ultimately results in ineffective and unstable fine-tuning. In this paper, we propose TokenSeek, a universal plugin solution for various transformer-based models through instance-aware token seeking and ditching, achieving significant fine-tuning memory savings (e.g., requiring only 14.8% of the memory on Llama3.2 1B) with on-par or even better performance. Furthermore, our interpretable token seeking process reveals the underlying reasons for its effectiveness, offering valuable insights for future research on token efficiency. Homepage: https://runjia.tech/iclr_tokenseek/ |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2601_19739 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching |
| topic | Computation and Language; Artificial Intelligence |
| url | https://arxiv.org/abs/2601.19739 |