| Main Authors: | Zhao, Ji; Gu, Yufei; Shao, Shitong; Zhou, Xun; Xiang, Liang; Xie, Zeke |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.05393 |
Similar Items
Mano: Restriking Manifold Optimization for LLM Training
by: Gu, Yufei, et al.
Published: (2026)
FastLightGen: Fast and Light Video Generation with Fewer Steps and Parameters
by: Shao, Shitong, et al.
Published: (2026)
Dimension-Free Saddle-Point Escape in Muon
by: Long, Yanlin, et al.
Published: (2026)
DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation
by: Shen, Zhiqiang, et al.
Published: (2024)
Accelerating Diffusion Model Training under Minimal Budgets: A Condensation-Based Perspective
by: Huang, Rui, et al.
Published: (2025)
Golden Noise for Diffusion Models: A Learning Framework
by: Zhou, Zikai, et al.
Published: (2024)
Bag of Design Choices for Inference of High-Resolution Masked Generative Transformer
by: Shao, Shitong, et al.
Published: (2024)
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster
by: Shao, Shitong, et al.
Published: (2025)
CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think
by: Sun, Zening, et al.
Published: (2026)
Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)
Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training
by: Shi, Hengyu, et al.
Published: (2026)
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
by: Bai, Lichen, et al.
Published: (2024)
Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning
by: Dönder, Yusuf Denizay, et al.
Published: (2025)
Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
by: Gao, Chengqian, et al.
Published: (2025)
Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph
by: Liang, Xujian, et al.
Published: (2025)
Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models
by: Shi, Yubin, et al.
Published: (2024)
HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
by: Chen, Junying, et al.
Published: (2023)
Optimizing Few-Step Generation with Adaptive Matching Distillation
by: Bai, Lichen, et al.
Published: (2026)
Ground-Truth Subgraphs for Better Training and Evaluation of Knowledge Graph Augmented LLMs
by: Cattaneo, Alberto, et al.
Published: (2025)
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
by: Sung, Yi-Lin, et al.
Published: (2025)
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
by: Wang, Xiaolei, et al.
Published: (2024)
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
by: Wang, Yibo, et al.
Published: (2026)
Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
by: Chavan, Arnav, et al.
Published: (2024)
Sample Smart, Not Hard: Correctness-First Decoding for Better Reasoning in LLMs
by: Li, Xueyan, et al.
Published: (2025)
SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)
by: Meeus, Matthieu, et al.
Published: (2024)
Exploration Hacking: Can LLMs Learn to Resist RL Training?
by: Jang, Eyon, et al.
Published: (2026)
Stream of Search (SoS): Learning to Search in Language
by: Gandhi, Kanishk, et al.
Published: (2024)
SoK: Machine Learning for Misinformation Detection
by: Xiao, Madelyne, et al.
Published: (2023)
Learning to Rewrite Prompts for Bootstrapping LLMs on Downstream Tasks
by: Zhou, Qinhao, et al.
Published: (2025)
Does Biomedical Training Lead to Better Medical Performance?
by: Dada, Amin, et al.
Published: (2024)
TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs
by: Gu, Yuxuan, et al.
Published: (2025)
PEARL: Towards Permutation-Resilient LLMs
by: Chen, Liang, et al.
Published: (2025)
Faster MoE LLM Inference for Extremely Large Models
by: Yang, Haoqi, et al.
Published: (2025)
Each Graph is a New Language: Graph Learning with LLMs
by: Zhou, Huachi, et al.
Published: (2025)
xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning
by: Qian, Cheng, et al.
Published: (2025)
Synthetic Sandbox for Training Machine Learning Engineering Agents
by: Zhou, Yuhang, et al.
Published: (2026)
Better To Ask in English? Evaluating Factual Accuracy of Multilingual LLMs in English and Low-Resource Languages
by: Rohera, Pritika, et al.
Published: (2025)
Design Principle Transfer in Neural Architecture Search via Large Language Models
by: Zhou, Xun, et al.
Published: (2024)