Bibliographic Details
Main Authors: Nvidia, Adler, Bo, Agarwal, Niket, Aithal, Ashwath, Anh, Dong H., Bhattacharya, Pallab, Brundyn, Annika, Casper, Jared, Catanzaro, Bryan, Clay, Sharon, Cohen, Jonathan, Das, Sirshak, Dattagupta, Ayush, Delalleau, Olivier, Derczynski, Leon, Dong, Yi, Egert, Daniel, Evans, Ellie, Ficek, Aleksander, Fridman, Denys, Ghosh, Shaona, Ginsburg, Boris, Gitman, Igor, Grzegorzek, Tomasz, Hero, Robert, Huang, Jining, Jawa, Vibhu, Jennings, Joseph, Jhunjhunwala, Aastha, Kamalu, John, Khan, Sadaf, Kuchaiev, Oleksii, LeGresley, Patrick, Li, Hui, Liu, Jiwei, Liu, Zihan, Long, Eileen, Mahabaleshwarkar, Ameya Sunil, Majumdar, Somshubra, Maki, James, Martinez, Miguel, de Melo, Maer Rodrigues, Moshkov, Ivan, Narayanan, Deepak, Narenthiran, Sean, Navarro, Jesus, Nguyen, Phong, Nitski, Osvald, Noroozi, Vahid, Nutheti, Guruprasad, Parisien, Christopher, Parmar, Jupinder, Patwary, Mostofa, Pawelec, Krzysztof, Ping, Wei, Prabhumoye, Shrimai, Roy, Rajarshi, Saar, Trisha, Sabavat, Vasanth Rao Naik, Satheesh, Sanjeev, Scowcroft, Jane Polak, Sewall, Jason, Shamis, Pavel, Shen, Gerald, Shoeybi, Mohammad, Sizer, Dave, Smelyanskiy, Misha, Soares, Felipe, Sreedhar, Makesh Narsimhan, Su, Dan, Subramanian, Sandeep, Sun, Shengyang, Toshniwal, Shubham, Wang, Hao, Wang, Zhilin, You, Jiaxuan, Zeng, Jiaqi, Zhang, Jimmy, Zhang, Jing, Zhang, Vivienne, Zhang, Yian, Zhu, Chen
Format: Preprint
Published: 2024
Subjects: Computation and Language, Artificial Intelligence, Machine Learning
Online Access: https://arxiv.org/abs/2406.11704
author Nvidia
Adler, Bo
Agarwal, Niket
Aithal, Ashwath
Anh, Dong H.
Bhattacharya, Pallab
Brundyn, Annika
Casper, Jared
Catanzaro, Bryan
Clay, Sharon
Cohen, Jonathan
Das, Sirshak
Dattagupta, Ayush
Delalleau, Olivier
Derczynski, Leon
Dong, Yi
Egert, Daniel
Evans, Ellie
Ficek, Aleksander
Fridman, Denys
Ghosh, Shaona
Ginsburg, Boris
Gitman, Igor
Grzegorzek, Tomasz
Hero, Robert
Huang, Jining
Jawa, Vibhu
Jennings, Joseph
Jhunjhunwala, Aastha
Kamalu, John
Khan, Sadaf
Kuchaiev, Oleksii
LeGresley, Patrick
Li, Hui
Liu, Jiwei
Liu, Zihan
Long, Eileen
Mahabaleshwarkar, Ameya Sunil
Majumdar, Somshubra
Maki, James
Martinez, Miguel
de Melo, Maer Rodrigues
Moshkov, Ivan
Narayanan, Deepak
Narenthiran, Sean
Navarro, Jesus
Nguyen, Phong
Nitski, Osvald
Noroozi, Vahid
Nutheti, Guruprasad
Parisien, Christopher
Parmar, Jupinder
Patwary, Mostofa
Pawelec, Krzysztof
Ping, Wei
Prabhumoye, Shrimai
Roy, Rajarshi
Saar, Trisha
Sabavat, Vasanth Rao Naik
Satheesh, Sanjeev
Scowcroft, Jane Polak
Sewall, Jason
Shamis, Pavel
Shen, Gerald
Shoeybi, Mohammad
Sizer, Dave
Smelyanskiy, Misha
Soares, Felipe
Sreedhar, Makesh Narsimhan
Su, Dan
Subramanian, Sandeep
Sun, Shengyang
Toshniwal, Shubham
Wang, Hao
Wang, Zhilin
You, Jiaxuan
Zeng, Jiaqi
Zhang, Jimmy
Zhang, Jing
Zhang, Vivienne
Zhang, Yian
Zhu, Chen
contents We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and their outputs. These models perform competitively with open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision. We believe that the community can benefit from these models in various research studies and commercial applications, especially for generating synthetic data to train smaller language models. Notably, over 98% of the data used in our model alignment process is synthetically generated, showcasing the effectiveness of these models in generating synthetic data. To further support open research and facilitate model development, we are also open-sourcing the synthetic data generation pipeline used in our model alignment process.
format Preprint
id arxiv_https___arxiv_org_abs_2406_11704
institution arXiv
publishDate 2024
record_format arxiv
title Nemotron-4 340B Technical Report
topic Computation and Language
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2406.11704