| Main Authors: | Li, Ruihang, Wei, Yixuan, Zhang, Miaosen, Yu, Nenghai, Hu, Han, Peng, Houwen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.08310 |
Similar Items
Xwin-LM: Strong and Scalable Alignment Practice for LLMs
by: Ni, Bolin, et al.
Published: (2024)
Common 7B Language Models Already Possess Strong Math Capabilities
by: Li, Chen, et al.
Published: (2024)
InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition
by: Liu, Fengze, et al.
Published: (2026)
LaMPE: Length-aware Multi-grained Positional Encoding for Adaptive Long-context Scaling Without Training
by: Zhang, Sikui, et al.
Published: (2025)
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data
by: Gu, Shuhao, et al.
Published: (2024)
Scaling Laws of Synthetic Data for Language Models
by: Qin, Zeyu, et al.
Published: (2025)
Uncovering Scaling Laws for Large Language Models via Inverse Problems
by: Verma, Arun, et al.
Published: (2025)
Scaling Laws for Code: A More Data-Hungry Regime
by: Luo, Xianzhen, et al.
Published: (2025)
SmolKalam: Ensemble Quality-Filtered Translation at Scale for High Quality Arabic Post-Training Data
by: Alrashed, Sultan, et al.
Published: (2025)
Flora: Effortless Context Construction to Arbitrary Length and Scale
by: Chen, Tianxiang, et al.
Published: (2025)
The Scaling Laws of Skills in LLM Agent Systems
by: Chen, Charles, et al.
Published: (2026)
Temporal Scaling Law for Large Language Models
by: Xiong, Yizhe, et al.
Published: (2024)
Neural Neural Scaling Laws
by: Hu, Michael Y., et al.
Published: (2026)
What Scales in Cross-Entropy Scaling Law?
by: Yan, Junxi, et al.
Published: (2025)
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
by: Zeng, Liang, et al.
Published: (2025)
P$^2$ Law: Scaling Law for Post-Training After Model Pruning
by: Chen, Xiaodong, et al.
Published: (2024)
gzip Predicts Data-dependent Scaling Laws
by: Pandey, Rohan
Published: (2024)
Prescriptive Scaling Laws for Data Constrained Training
by: Lovelace, Justin, et al.
Published: (2026)
Scaling Laws for Code: Every Programming Language Matters
by: Yang, Jian, et al.
Published: (2025)
Exploring Efficiency Frontiers of Thinking Budget in Medical Reasoning: Scaling Laws between Computational Resources and Reasoning Quality
by: Bi, Ziqian, et al.
Published: (2025)
Parallel Scaling Law: Unveiling Reasoning Generalization through A Cross-Linguistic Perspective
by: Yang, Wen, et al.
Published: (2025)
Scaling Laws for Speculative Decoding
by: Yan, Siyuan, et al.
Published: (2025)
Scaling Laws for Multilingual Language Models
by: He, Yifei, et al.
Published: (2024)
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
by: Zeng, Liang, et al.
Published: (2024)
Scaling Laws for Floating Point Quantization Training
by: Sun, Xingwu, et al.
Published: (2025)
Relative Scaling Laws for LLMs
by: Held, William, et al.
Published: (2025)
Scaling Laws For Mixed Quantization
by: Cao, Zeyu, et al.
Published: (2024)
Scaling Laws for Precision
by: Kumar, Tanishq, et al.
Published: (2024)
Scaling Laws for Mixture Pretraining Under Data Constraints
by: Sedova, Anastasiia, et al.
Published: (2026)
Scaling Law for Quantization-Aware Training
by: Chen, Mengzhao, et al.
Published: (2025)
Scaling Laws for Linear Complexity Language Models
by: Shen, Xuyang, et al.
Published: (2024)
Scale-Free Graph-Language Models
by: Lu, Jianglin, et al.
Published: (2025)
Scaling Laws in Scientific Discovery with AI and Robot Scientists
by: Zhang, Pengsong, et al.
Published: (2025)
The Limits of Data Scaling: Sub-token Utilization and Acoustic Saturation in Multilingual ASR
by: Liang, Siyu, et al.
Published: (2025)
Scaling Parameter-Constrained Language Models with Quality Data
by: Chang, Ernie, et al.
Published: (2024)
Exploring Training and Inference Scaling Laws in Generative Retrieval
by: Cai, Hongru, et al.
Published: (2025)
Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection
by: Bethune, Louis, et al.
Published: (2025)
Scaling Laws For Dense Retrieval
by: Fang, Yan, et al.
Published: (2024)
Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check
by: Lourie, Nicholas, et al.
Published: (2025)
Distillation Scaling Laws
by: Busbridge, Dan, et al.
Published: (2025)