Bibliographic Details
Main Author: Vanroy, Bram
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2412.04092
_version_ 1866929616868868096
author Vanroy, Bram
author_facet Vanroy, Bram
contents Language models have rapidly evolved, predominantly focusing on English while often neglecting extensive pretraining in other languages. This approach has required initiatives to adapt powerful, English-centric models to other linguistic contexts through finetuning. For Dutch, one such recent endeavour is "GEITje", a model originally derived from the English-based Mistral 7B. Building on this fundamental work, the current research extends the capabilities of GEITje by supervised finetuning on newly created high-quality synthetic conversational datasets, along with an additional preference alignment procedure on a synthetic feedback dataset. Both the developed models and the created datasets are openly available.
format Preprint
id arxiv_https___arxiv_org_abs_2412_04092
institution arXiv
publishDate 2024
record_format arxiv
title GEITje 7B Ultra: A Conversational Model for Dutch
topic Computation and Language
url https://arxiv.org/abs/2412.04092