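| Main Authors: | Gemma Team; Mesnard, Thomas; Hardin, Cassidy; Dadashi, Robert; Bhupatiraju, Surya; Pathak, Shreya; Sifre, Laurent; Rivière, Morgane; Kale, Mihir Sanjay; Love, Juliette; Tafti, Pouya; Hussenot, Léonard; Sessa, Pier Giuseppe; Chowdhery, Aakanksha; Roberts, Adam; Barua, Aditya; Botev, Alex; Castro-Ros, Alex; Slone, Ambrose; Héliou, Amélie; Tacchetti, Andrea; Bulanova, Anna; Paterson, Antonia; Tsai, Beth; Shahriari, Bobak; Lan, Charline Le; Choquette-Choo, Christopher A.; Crepy, Clément; Cer, Daniel; Ippolito, Daphne; Reid, David; Buchatskaya, Elena; Ni, Eric; Noland, Eric; Yan, Geng; Tucker, George; Muraru, George-Christian; Rozhdestvenskiy, Grigory; Michalewski, Henryk; Tenney, Ian; Grishchenko, Ivan; Austin, Jacob; Keeling, James; Labanowski, Jane; Lespiau, Jean-Baptiste; Stanway, Jeff; Brennan, Jenny; Chen, Jeremy; Ferret, Johan; Chiu, Justin; Mao-Jones, Justin; Lee, Katherine; Yu, Kathy; Millican, Katie; Sjoesund, Lars Lowe; Lee, Lisa; Dixon, Lucas; Reid, Machel; Mikuła, Maciej; Wirth, Mateo; Sharman, Michael; Chinaev, Nikolai; Thain, Nithum; Bachem, Olivier; Chang, Oscar; Wahltinez, Oscar; Bailey, Paige; Michel, Paul; Yotov, Petko; Chaabouni, Rahma; Comanescu, Ramona; Jana, Reena; Anil, Rohan; McIlroy, Ross; Liu, Ruibo; Mullins, Ryan; Smith, Samuel L; Borgeaud, Sebastian; Girgin, Sertan; Douglas, Sholto; Pandya, Shree; Shakeri, Siamak; De, Soham; Klimenko, Ted; Hennigan, Tom; Feinberg, Vlad; Stokowiec, Wojciech; Chen, Yu-hui; Ahmed, Zafarali; Gong, Zhitao; Warkentin, Tris; Peran, Ludovic; Giang, Minh; Farabet, Clément; Vinyals, Oriol; Dean, Jeff; Kavukcuoglu, Koray; Hassabis, Demis; Ghahramani, Zoubin; Eck, Douglas; Barral, Joelle; Pereira, Fernando; Collins, Eli; Joulin, Armand; Fiedel, Noah; Senter, Evan; Andreev, Alek; Kenealy, Kathleen |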
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computation and Language; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2403.08295 |
| _version_ | 1866914756254760960 |
| author | Gemma Team; Mesnard, Thomas; Hardin, Cassidy; Dadashi, Robert; Bhupatiraju, Surya; Pathak, Shreya; Sifre, Laurent; Rivière, Morgane; Kale, Mihir Sanjay; Love, Juliette; Tafti, Pouya; Hussenot, Léonard; Sessa, Pier Giuseppe; Chowdhery, Aakanksha; Roberts, Adam; Barua, Aditya; Botev, Alex; Castro-Ros, Alex; Slone, Ambrose; Héliou, Amélie; Tacchetti, Andrea; Bulanova, Anna; Paterson, Antonia; Tsai, Beth; Shahriari, Bobak; Lan, Charline Le; Choquette-Choo, Christopher A.; Crepy, Clément; Cer, Daniel; Ippolito, Daphne; Reid, David; Buchatskaya, Elena; Ni, Eric; Noland, Eric; Yan, Geng; Tucker, George; Muraru, George-Christian; Rozhdestvenskiy, Grigory; Michalewski, Henryk; Tenney, Ian; Grishchenko, Ivan; Austin, Jacob; Keeling, James; Labanowski, Jane; Lespiau, Jean-Baptiste; Stanway, Jeff; Brennan, Jenny; Chen, Jeremy; Ferret, Johan; Chiu, Justin; Mao-Jones, Justin; Lee, Katherine; Yu, Kathy; Millican, Katie; Sjoesund, Lars Lowe; Lee, Lisa; Dixon, Lucas; Reid, Machel; Mikuła, Maciej; Wirth, Mateo; Sharman, Michael; Chinaev, Nikolai; Thain, Nithum; Bachem, Olivier; Chang, Oscar; Wahltinez, Oscar; Bailey, Paige; Michel, Paul; Yotov, Petko; Chaabouni, Rahma; Comanescu, Ramona; Jana, Reena; Anil, Rohan; McIlroy, Ross; Liu, Ruibo; Mullins, Ryan; Smith, Samuel L; Borgeaud, Sebastian; Girgin, Sertan; Douglas, Sholto; Pandya, Shree; Shakeri, Siamak; De, Soham; Klimenko, Ted; Hennigan, Tom; Feinberg, Vlad; Stokowiec, Wojciech; Chen, Yu-hui; Ahmed, Zafarali; Gong, Zhitao; Warkentin, Tris; Peran, Ludovic; Giang, Minh; Farabet, Clément; Vinyals, Oriol; Dean, Jeff; Kavukcuoglu, Koray; Hassabis, Demis; Ghahramani, Zoubin; Eck, Douglas; Barral, Joelle; Pereira, Fernando; Collins, Eli; Joulin, Armand; Fiedel, Noah; Senter, Evan; Andreev, Alek; Kenealy, Kathleen |
| contents | This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2403_08295 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Gemma: Open Models Based on Gemini Research and Technology |
| topic | Computation and Language; Artificial Intelligence |
| url | https://arxiv.org/abs/2403.08295 |