Library Catalog

Bibliographic Details
Main Authors:	Parmar, Jupinder; Prabhumoye, Shrimai; Jennings, Joseph; Patwary, Mostofa; Subramanian, Sandeep; Su, Dan; Zhu, Chen; Narayanan, Deepak; Jhunjhunwala, Aastha; Dattagupta, Ayush; Jawa, Vibhu; Liu, Jiwei; Mahabaleshwarkar, Ameya; Nitski, Osvald; Brundyn, Annika; Maki, James; Martinez, Miguel; You, Jiaxuan; Kamalu, John; LeGresley, Patrick; Fridman, Denys; Casper, Jared; Aithal, Ashwath; Kuchaiev, Oleksii; Shoeybi, Mohammad; Cohen, Jonathan; Catanzaro, Bryan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2402.16819

Similar Items

Data, Data Everywhere: A Guide for Pretraining Dataset Construction
by: Parmar, Jupinder, et al.
Published: (2024)

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset
by: Mahabadi, Rabeeh Karimi, et al.
Published: (2025)

MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
by: Akter, Syeda Nahida, et al.
Published: (2024)

Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining
by: Feng, Steven, et al.
Published: (2024)

Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
by: Akter, Syeda Nahida, et al.
Published: (2025)

Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs
by: Taghibakhshi, Ali, et al.
Published: (2025)

Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
by: Parmar, Jupinder, et al.
Published: (2024)

Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data
by: Akter, Syeda Nahida, et al.
Published: (2025)

Nemotron-4 340B Technical Report
by: Nvidia, et al.
Published: (2024)

RLP: Reinforcement as a Pretraining Objective
by: Hatamizadeh, Ali, et al.
Published: (2025)

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
by: Su, Dan, et al.
Published: (2024)

Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning
by: Lu, Ximing, et al.
Published: (2025)

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
by: Jung, Jaehun, et al.
Published: (2025)

Minitron-SSM: Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
by: Taghibakhshi, Ali, et al.
Published: (2025)

LLM Pruning and Distillation in Practice: The Minitron Approach
by: Sreenivas, Sharath Turuvekere, et al.
Published: (2024)

Student gender modulates the intersection of calculus proficiency and calculus self-efficacy in an introductory electricity and magnetism course
by: Fischer, Christopher J., et al.
Published: (2024)

FusionFactory: Fusing LLM Capabilities with Multi-LLM Log Data
by: Feng, Tao, et al.
Published: (2025)

Compact Language Models via Pruning and Knowledge Distillation
by: Muralidharan, Saurav, et al.
Published: (2024)

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
by: Chen, Yang, et al.
Published: (2025)

Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
by: Tasnim, Nazia, et al.
Published: (2026)

AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy
by: Liu, Zihan, et al.
Published: (2025)

Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)

Llama-Nemotron: Efficient Reasoning Models
by: Bercovich, Akhiad, et al.
Published: (2025)

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models
by: Wang, Boxin, et al.
Published: (2025)

When2Call: When (not) to Call Tools
by: Ross, Hayley, et al.
Published: (2025)

nach0: Multimodal Natural and Chemical Languages Foundation Model
by: Livne, Micha, et al.
Published: (2023)

NVIDIA Nemotron Nano V2 VL
by: NVIDIA, et al.
Published: (2025)

Nemotron-CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
by: Diao, Shizhe, et al.
Published: (2025)

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models
by: NVIDIA, et al.
Published: (2025)

The AI Consumer Index (ACE)
by: Benchek, Julien, et al.
Published: (2025)

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
by: Yang, Zhuolin, et al.
Published: (2026)

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
by: NVIDIA, et al.
Published: (2025)

Corrupción entre particulares: lesividad de la conducta y consecuencias en sede de tipificación de acuerdo al análisis comparado
by: Osvald Artaza Varela
Published: (2019)

NVIDIA Nemotron 3: Efficient and Open Intelligence
by: NVIDIA, et al.
Published: (2025)

AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
by: Liu, Zihan, et al.
Published: (2024)

iGRPO: Self-Feedback-Driven LLM Reasoning
by: Hatamizadeh, Ali, et al.
Published: (2026)

Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
by: Zhang, Shaokun, et al.
Published: (2025)

NVIDIA Nemotron Parse 1.1
by: Chumachenko, Kateryna, et al.
Published: (2025)

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment
by: Shen, Gerald, et al.
Published: (2024)

On Data Engineering for Scaling LLM Terminal Capabilities
by: Pi, Renjie, et al.
Published: (2026)