Bibliographic Details
Main Authors:	Cai, Aichen, Zhang, Anmeng, Li, Anyu, Zhang, Bo, Cai, Bohua, Li, Chang, Jiang, Changjian, Lu, Changkai, Xue, Chao, Liang, Chaocai, Zhang, Cheng, Liu, Dongkai, Wang, Fei, Huang, Guoqiang, Ke, Haijian, Lin, Han, Wang, Hao, Miao, Ji, Zhang, Jiacheng, Shi, Jialong, Zhu, Jifeng, Qian, Jingjing, Luo, Junhui, Xiong, Junwu, So, Lam, Huang, Liang, Ke, Ming, Li, Mingyang, Shi, Panfeng, Hao, Peng, Wang, Qi, Lai, Qian, Yuan, Qiaoqiao, Yin, Qingyu, Cao, Qiong, Wang, Qixiang, Bian, Rongcheng, Han, Rongduo, Zheng, Shaoqiang, Hu, Shi, Suo, Shi, Ren, Shijie, Zhang, Shijin, Fan, Shiying, Xie, Shuai, Zhang, Tianyi, Liu, Wei, Tan, Wentao, Meng, Xianghan, He, Xiaodong, Pan, Xing, Wang, Xiran, Peng, Xuyang, Zhang, Ya, Liu, Yang, Duan, Yangyang, Chen, Yanxu, Gong, Yicheng, Huang, Yidan, Liu, Yifei, Bai, Yinhao, Liu, Yongqiang, Zhang, Yuesong, Zhang, Yuqi, Xie, Zerui, Wang, Zhenfang, Shen, Zhennan, Liu, Zheyuan, Zeng, Zhuwei
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.03044

Abstract:

We introduce JoyAI-LLM Flash, an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency in the sub-50B parameter regime. JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline, including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale reinforcement learning (RL) across diverse environments. To improve token efficiency, JoyAI-LLM Flash strategically balances "thinking" and "non-thinking" cognitive modes and introduces FiberPO, a novel RL algorithm inspired by fibration theory that decomposes trust-region maintenance into global and local components, providing unified multi-scale stability control for LLM policy optimization. To enhance architectural sparsity, the model comprises 48B total parameters while activating only 2.7B parameters per forward pass, achieving a substantially higher sparsity ratio than contemporary industry-leading models of comparable scale. To further improve inference throughput, we adopt a joint training-inference co-design that incorporates dense Multi-Token Prediction (MTP) and Quantization-Aware Training (QAT). We release the checkpoints for both JoyAI-LLM-48B-A3B Base and its post-trained variants on Hugging Face to support the open-source community.
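
The sparsity figures quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and treating the "sparsity ratio" as total parameters divided by active parameters is an assumption, since the abstract does not define the term. Only the 48B and 2.7B values come from the record.

```python
# Illustrative arithmetic for the parameter counts quoted in the abstract.
# Assumption: "sparsity ratio" is read here as total / active parameters;
# the paper itself may define it differently.

TOTAL_PARAMS = 48e9    # total parameters, from the abstract
ACTIVE_PARAMS = 2.7e9  # parameters activated per forward pass, from the abstract


def active_fraction(total_params: float, active_params: float) -> float:
    """Return the share of parameters used in a single forward pass."""
    if total_params <= 0 or not (0 < active_params <= total_params):
        raise ValueError("expected 0 < active_params <= total_params")
    return active_params / total_params


def sparsity_ratio(total_params: float, active_params: float) -> float:
    """Return total / active parameters (hypothetical definition)."""
    return 1.0 / active_fraction(total_params, active_params)


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    ratio = sparsity_ratio(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active fraction: {frac:.4f}")
    print(f"total/active ratio: {ratio:.1f}")
```

Under this reading, roughly 5.6% of the parameters are active per forward pass, which corresponds to a total-to-active ratio of a little under 18.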

Similar Items