Bibliographic Details
Main Authors: Tie, Guiyao, Zhao, Zeli, Song, Dingjie, Wei, Fuyang, Zhou, Rong, Dai, Yurou, Yin, Wen, Yang, Zhejian, Yan, Jiangyue, Su, Yao, Dai, Zhenhan, Xie, Yifeng, Cao, Yihan, Sun, Lichao, Zhou, Pan, He, Lifang, Chen, Hechang, Zhang, Yu, Wen, Qingsong, Liu, Tianming, Gong, Neil Zhenqiang, Tang, Jiliang, Xiong, Caiming, Ji, Heng, Yu, Philip S., Gao, Jianfeng
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2503.06072
Abstract:
  • The emergence of Large Language Models (LLMs) has fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration. However, their pre-trained architectures often reveal limitations in specialized contexts, including restricted reasoning capacities, ethical uncertainties, and suboptimal domain-specific performance. These shortcomings call for advanced post-training language models (PoLMs), such as OpenAI-o1/o3 and DeepSeek-R1 (collectively known as Large Reasoning Models, or LRMs). This paper presents the first comprehensive survey of PoLMs, systematically tracing their evolution across five core paradigms: Fine-tuning, which enhances task-specific accuracy; Alignment, which ensures ethical coherence and consistency with human preferences; Reasoning, which advances multi-step inference despite challenges in reward design; Efficiency, which optimizes resource utilization amidst increasing complexity; and Integration and Adaptation, which extend capabilities across diverse modalities while addressing coherence issues. Charting progress from ChatGPT's alignment strategies to DeepSeek-R1's innovative reasoning advancements, we illustrate how PoLMs leverage datasets to mitigate biases, deepen reasoning capabilities, and enhance domain adaptability. Our contributions include a pioneering synthesis of PoLM evolution, a structured taxonomy categorizing techniques and datasets, and a strategic agenda emphasizing the role of LRMs in improving reasoning proficiency and domain flexibility. As the first survey of its scope, this work consolidates recent PoLM advancements and establishes a rigorous intellectual framework for future research, fostering the development of LLMs that excel in precision, ethical robustness, and versatility across scientific and societal applications.