Bibliographic Details
Main Author: Gupta, Kartik
Format: Preprint
Published: 2025
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2502.16274
author Gupta, Kartik
author_facet Gupta, Kartik
contents The Qwen 2.5 3B base model was fine-tuned to generate contextually rich and engaging movie dialogue using the Cornell Movie-Dialog Corpus, a curated dataset of movie conversations. Because of limited GPU compute and VRAM, training began with the 0.5B model and progressively scaled up to the 1.5B and 3B versions as efficiency improvements were implemented. The Qwen 2.5 series, developed by Alibaba Group, stands at the forefront of small open-source pre-trained models, particularly excelling in creative tasks compared to alternatives like Meta's Llama 3.2 and Google's Gemma. Results demonstrate the ability of small models to produce high-quality, realistic dialogue, offering a promising approach for real-time, context-sensitive conversation generation.
format Preprint
id arxiv_https___arxiv_org_abs_2502_16274
institution arXiv
publishDate 2025
record_format arxiv
title Fine-Tuning Qwen 2.5 3B for Realistic Movie Dialogue Generation
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2502.16274