Bibliographic Details
Main Authors: Tie, Guiyao; Zhao, Zeli; Song, Dingjie; Wei, Fuyang; Zhou, Rong; Dai, Yurou; Yin, Wen; Yang, Zhejian; Yan, Jiangyue; Su, Yao; Dai, Zhenhan; Xie, Yifeng; Cao, Yihan; Sun, Lichao; Zhou, Pan; He, Lifang; Chen, Hechang; Zhang, Yu; Wen, Qingsong; Liu, Tianming; Gong, Neil Zhenqiang; Tang, Jiliang; Xiong, Caiming; Ji, Heng; Yu, Philip S.; Gao, Jianfeng
Format: Preprint
Published: 2025
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2503.06072
_version_ 1866913968997531648
author Tie, Guiyao
Zhao, Zeli
Song, Dingjie
Wei, Fuyang
Zhou, Rong
Dai, Yurou
Yin, Wen
Yang, Zhejian
Yan, Jiangyue
Su, Yao
Dai, Zhenhan
Xie, Yifeng
Cao, Yihan
Sun, Lichao
Zhou, Pan
He, Lifang
Chen, Hechang
Zhang, Yu
Wen, Qingsong
Liu, Tianming
Gong, Neil Zhenqiang
Tang, Jiliang
Xiong, Caiming
Ji, Heng
Yu, Philip S.
Gao, Jianfeng
contents The emergence of Large Language Models (LLMs) has fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration. However, their pre-trained architectures often reveal limitations in specialized contexts, including restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance. These challenges necessitate advanced post-training language models (PoLMs) to address these shortcomings, such as OpenAI-o1/o3 and DeepSeek-R1 (collectively known as Large Reasoning Models, or LRMs). This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms: Fine-tuning, which enhances task-specific accuracy; Alignment, which ensures ethical coherence and alignment with human preferences; Reasoning, which advances multi-step inference despite challenges in reward design; Efficiency, which optimizes resource utilization amidst increasing complexity; Integration and Adaptation, which extend capabilities across diverse modalities while addressing coherence issues. Charting progress from ChatGPT's alignment strategies to DeepSeek-R1's innovative reasoning advancements, we illustrate how PoLMs leverage datasets to mitigate biases, deepen reasoning capabilities, and enhance domain adaptability. Our contributions include a pioneering synthesis of PoLM evolution, a structured taxonomy categorizing techniques and datasets, and a strategic agenda emphasizing the role of LRMs in improving reasoning proficiency and domain flexibility. As the first survey of its scope, this work consolidates recent PoLM advancements and establishes a rigorous intellectual framework for future research, fostering the development of LLMs that excel in precision, ethical robustness, and versatility across scientific and societal applications.
format Preprint
id arxiv_https___arxiv_org_abs_2503_06072
institution arXiv
publishDate 2025
record_format arxiv
title A Survey on Post-training of Large Language Models
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2503.06072