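Staff View: LIFT: A Novel Framework for Enhancing Long-Context Understanding of LLMs via Long Input Fine-Tuning :: Library Catalog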

Bibliographic Details
Main Authors:	Mao, Yansheng; Xu, Yufei; Li, Jiaqi; Meng, Fanxu; Yang, Haotong; Zheng, Zilong; Wang, Xiyuan; Zhang, Muhan
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.14644

_version_	1866911585411268608
author	Mao, Yansheng; Xu, Yufei; Li, Jiaqi; Meng, Fanxu; Yang, Haotong; Zheng, Zilong; Wang, Xiyuan; Zhang, Muhan
author_facet	Mao, Yansheng; Xu, Yufei; Li, Jiaqi; Meng, Fanxu; Yang, Haotong; Zheng, Zilong; Wang, Xiyuan; Zhang, Muhan
contents	Long-context understanding remains challenging for large language models due to their limited context windows. This paper introduces Long Input Fine-Tuning (LIFT), a novel framework for long-context modeling that can enhance the long-context performance of arbitrary short-context LLMs by dynamically adapting their parameters to the given long input. Importantly, rather than endlessly extending the context window size to accommodate increasingly long inputs in context, LIFT stores and absorbs the long input in parameters. By fine-tuning the long input into model parameters, LIFT allows short-context LLMs to answer questions even when the required information is not provided in the context during inference, avoiding the quadratic complexity with respect to input length of a standard long-context model. Furthermore, LIFT does not simply perform continued pretraining on new, long contexts, but leverages carefully designed LLM-generated synthetic tasks to enhance the comprehension of long contexts, moving beyond mere memorization. To accommodate the additional cost of fine-tuning, we design a highly optimized pipeline that reduces the Time to First Token (TTFT) to less than 10 seconds for 8k context. We further provide a comprehensive analysis of LIFT's strengths and limitations in long-context understanding, discuss its feasibility for large-scale real-world deployment, and highlight valuable directions for future research.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_14644
institution	arXiv
publishDate	2025
record_format	arxiv
title	LIFT: A Novel Framework for Enhancing Long-Context Understanding of LLMs via Long Input Fine-Tuning
topic	Computation and Language
url	https://arxiv.org/abs/2502.14644
