| Main Authors: | Liu, Fengze, Zhou, Weidong, Liu, Binbin, Guo, Ping, Wang, Zijun, Zhang, Bingni, Zhang, Yifan, Yu, Yifeng, Zhou, Xiaohuan, Wang, Taifeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.02364 |
Similar Items
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
by: Liu, Fengze, et al.
Published: (2025)
TiKMiX: Take Data Influence into Dynamic Mixture for Language Model Pre-training
by: Wang, Yifan, et al.
Published: (2025)
Target-Oriented Pretraining Data Selection via Neuron-Activated Graph
by: Wang, Zijun, et al.
Published: (2026)
Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining
by: Guo, Ping, et al.
Published: (2025)
MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining
by: Chen, Zhixun, et al.
Published: (2025)
MuBench: Assessment of Multilingual Capabilities of Large Language Models Across 61 Languages
by: Han, Wenhan, et al.
Published: (2025)
MathMixup: Boosting LLM Mathematical Reasoning with Difficulty-Controllable Data Synthesis and Curriculum Learning
by: Li, Xuchen, et al.
Published: (2026)
Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models
by: Tian, Changxin, et al.
Published: (2025)
Scaling Laws for Optimal Data Mixtures
by: Shukor, Mustafa, et al.
Published: (2025)
Predictable Scale: Part I, Step Law -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining
by: Li, Houyi, et al.
Published: (2025)
Diagonal Gaussian Mixture Models and Higher Order Tensor Decompositions
by: Guo, Bingni, et al.
Published: (2024)
Scaling Law for Quantization-Aware Training
by: Chen, Mengzhao, et al.
Published: (2025)
Model Merging Scaling Laws in Large Language Models
by: Wang, Yuanyi, et al.
Published: (2025)
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
by: Ye, Jiasheng, et al.
Published: (2024)
Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law
by: Chen, Yongming, et al.
Published: (2024)
Scaling Laws for Mixture Pretraining Under Data Constraints
by: Sedova, Anastasiia, et al.
Published: (2026)
Scaling Laws for Online Advertisement Retrieval
by: Wang, Yunli, et al.
Published: (2024)
MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation
by: Yuan, Shen, et al.
Published: (2025)
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
by: Que, Haoran, et al.
Published: (2024)
ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws
by: Li, Ruihang, et al.
Published: (2024)
How Should LLMs Consume High-Quality Data? Optimal Data Scheduling via Quality-Aware Functional Scaling Laws
by: Zhu, Zhitao, et al.
Published: (2026)
Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization
by: Wan, Weilin, et al.
Published: (2026)
Quasi-Large Hole Polarons in BiVO4-Implications for Photocatalysis and Solar Energy Conversion
by: Hao, Zhimeng, et al.
Published: (2025)
Predictable Scale: Part II, Farseer: A Refined Scaling Law in Large Language Models
by: Li, Houyi, et al.
Published: (2025)
Scaling Laws for Speculative Decoding
by: Yan, Siyuan, et al.
Published: (2025)
Scaling Laws for Fine-Grained Mixture of Experts
by: Krajewski, Jakub, et al.
Published: (2024)
Generalization and Scaling Laws for Mixture-of-Experts Transformers
by: Mayaki, Mansour Zoubeirou a
Published: (2026)
Modernizing Amdahl's Law: How AI Scaling Laws Shape Computer Architecture
by: Lu, Chien-Ping
Published: (2026)
Wukong: Towards a Scaling Law for Large-Scale Recommendation
by: Zhang, Buyun, et al.
Published: (2024)
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
by: Zeng, Liang, et al.
Published: (2024)
Climber: Toward Efficient Scaling Laws for Large Recommendation Models
by: Xu, Songpei, et al.
Published: (2025)
Scaling Laws for Predicting Downstream Performance in LLMs
by: Chen, Yangyi, et al.
Published: (2024)
Scaling Laws for Educational AI Agents
by: Wu, Mengsong, et al.
Published: (2026)
P$^2$ Law: Scaling Law for Post-Training After Model Pruning
by: Chen, Xiaodong, et al.
Published: (2024)
Scaling Laws in Scientific Discovery with AI and Robot Scientists
by: Zhang, Pengsong, et al.
Published: (2025)
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
by: Zeng, Liang, et al.
Published: (2025)
Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization
by: Li, Jingwei, et al.
Published: (2026)
Scaling Laws for Upcycling Mixture-of-Experts Language Models
by: Liew, Seng Pei, et al.
Published: (2025)
An Empirical Study of Scaling Law for OCR
by: Rang, Miao, et al.
Published: (2023)