Staff View: Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean

Bibliographic Details
Main Authors:	Choi, ChangSu; Jeong, Yongbin; Park, Seoyoon; Won, InHo; Lim, HyeonSeok; Kim, SangMin; Kang, Yejee; Yoon, Chanhyuk; Park, Jaewan; Lee, Yiseul; Lee, HyeJin; Hahm, Younggyun; Kim, Hansaem; Lim, KyungTae
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.10882

_version_	1866916169472016384
author	Choi, ChangSu; Jeong, Yongbin; Park, Seoyoon; Won, InHo; Lim, HyeonSeok; Kim, SangMin; Kang, Yejee; Yoon, Chanhyuk; Park, Jaewan; Lee, Yiseul; Lee, HyeJin; Hahm, Younggyun; Kim, Hansaem; Lim, KyungTae
author_facet	Choi, ChangSu; Jeong, Yongbin; Park, Seoyoon; Won, InHo; Lim, HyeonSeok; Kim, SangMin; Kang, Yejee; Yoon, Chanhyuk; Park, Jaewan; Lee, Yiseul; Lee, HyeJin; Hahm, Younggyun; Kim, Hansaem; Lim, KyungTae
contents	Large language models (LLMs) use pretraining to predict the subsequent word; however, their expansion requires significant computing resources. Numerous big tech companies and research institutes have developed multilingual LLMs (MLLMs) to meet current demands, overlooking less-resourced languages (LRLs). This study proposed three strategies to enhance the performance of LRLs based on the publicly available MLLMs. First, the MLLM vocabularies of LRLs were expanded to enhance expressiveness. Second, bilingual data were used for pretraining to align the high- and less-resourced languages. Third, a high-quality small-scale instruction dataset was constructed and instruction-tuning was performed to augment the LRL. The experiments employed the Llama2 model and Korean was used as the LRL, which was quantitatively evaluated against other developed LLMs across eight tasks. Furthermore, a qualitative assessment was performed based on human evaluation and GPT4. Experimental results showed that our proposed Bllossom model exhibited superior performance in qualitative analyses compared to previously proposed Korean monolingual models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_10882
institution	arXiv
publishDate	2024
record_format	arxiv
title	Optimizing Language Augmentation for Multilingual Large Language Models: A Case Study on Korean
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2403.10882