Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Thawakar, Omkar; Vayani, Ashmal; Khan, Salman; Cholakal, Hisham; Anwer, Rao M.; Felsberg, Michael; Baldwin, Tim; Xing, Eric P.; Khan, Fahad Shahbaz
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2402.16840

_version_	1866910344332443648
author	Thawakar, Omkar; Vayani, Ashmal; Khan, Salman; Cholakal, Hisham; Anwer, Rao M.; Felsberg, Michael; Baldwin, Tim; Xing, Eric P.; Khan, Fahad Shahbaz
author_facet	Thawakar, Omkar; Vayani, Ashmal; Khan, Salman; Cholakal, Hisham; Anwer, Rao M.; Felsberg, Michael; Baldwin, Tim; Xing, Eric P.; Khan, Fahad Shahbaz
contents	"Bigger the better" has been the predominant trend in the recent development of Large Language Models (LLMs). However, LLMs are not well suited to scenarios that require on-device processing, energy efficiency, a low memory footprint, and response efficiency. These requirements are crucial for privacy, security, and sustainable deployment. This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource-constrained devices. Our primary contribution is the introduction of an accurate and fully transparent open-source 0.5 billion (0.5B) parameter SLM, named MobiLlama, catering to the specific needs of resource-constrained computing with an emphasis on enhanced performance with reduced resource demands. MobiLlama is an SLM design that starts from a larger model and applies a careful parameter-sharing scheme to reduce both the pre-training and the deployment cost. Our work strives not only to bridge the gap in open-source SLMs but also to ensure full transparency: the complete training data pipeline, training code, model weights, and over 300 checkpoints, along with evaluation code, are available at https://github.com/mbzuai-oryx/MobiLlama.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_16840
institution	arXiv
publishDate	2024
record_format	arxiv
title	MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
topic	Computation and Language
url	https://arxiv.org/abs/2402.16840
