Staff View: Performance of Recent Large Language Models for a Low-Resourced Language :: Library Catalog

Bibliographic Details
Main Authors:	Jayakody, Ravindu; Dias, Gihan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2407.21330

_version_	1866911973560549376
author	Jayakody, Ravindu; Dias, Gihan
author_facet	Jayakody, Ravindu; Dias, Gihan
contents	Large Language Models (LLMs) have shown significant advances in the past year. In addition to new versions of GPT and Llama, several other LLMs have been introduced recently. Some of these are open models available for download and modification. Although multilingual large language models have been available for some time, their performance on low-resourced languages such as Sinhala has been poor. We evaluated four recent LLMs on their performance directly in the Sinhala language, and by translation to and from English. We also evaluated their fine-tunability with a small amount of fine-tuning data. Claude and GPT 4o perform well out-of-the-box and do significantly better than previous versions. Llama and Mistral perform poorly but show some promise of improvement with fine tuning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_21330
institution	arXiv
publishDate	2024
record_format	arxiv
title	Performance of Recent Large Language Models for a Low-Resourced Language
topic	Computation and Language
url	https://arxiv.org/abs/2407.21330

Similar Items