Bibliographic Details
Main Author: Song, Yubo
Format: Preprint
Published: 2024
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2405.08570
author Song, Yubo
author_facet Song, Yubo
contents This article explores the adaptive relationship between Encoder Layers and Decoder Layers using the SOTA model Helsinki-NLP/opus-mt-de-en, which translates German to English. The method introduces a bias-free fully connected layer between the Encoder and the Decoder, initializes the layer's weights in different ways, and compares the outcomes of fine-tuning with those of retraining. Four experiments were conducted in total. The results suggest that directly modifying the structure of the pre-trained model for fine-tuning yields suboptimal performance. The retraining experiments, however, indicate that this structural adjustment has significant potential.
format Preprint
id arxiv_https___arxiv_org_abs_2405_08570
institution arXiv
publishDate 2024
record_format arxiv
title Rethinking the adaptive relationship between Encoder Layers and Decoder Layers
topic Computation and Language
url https://arxiv.org/abs/2405.08570
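
The abstract in this record describes inserting a bias-free fully connected layer between the Encoder and the Decoder and trying different weight initializations. The following is a minimal, dependency-free sketch of that idea, not the author's implementation. The function names `make_adapter` and `apply_adapter`, the toy dimension of 3, and the two initializations shown ("identity" and "random") are all assumptions made for illustration; the record does not say which initializations the four experiments used.

```python
import random


def make_adapter(d_model, init="identity", seed=0):
    """Build the weight matrix of a bias-free d_model x d_model linear layer.

    Hypothetical helper: the two init schemes are illustrative assumptions.
    """
    if init == "identity":
        # Identity weights: the layer initially passes encoder outputs through unchanged.
        return [[1.0 if i == j else 0.0 for j in range(d_model)]
                for i in range(d_model)]
    if init == "random":
        # Small Gaussian weights: the layer initially scrambles the encoder outputs.
        rng = random.Random(seed)
        scale = d_model ** -0.5
        return [[rng.gauss(0.0, scale) for _ in range(d_model)]
                for _ in range(d_model)]
    raise ValueError(f"unknown init: {init}")


def apply_adapter(hidden_states, weight):
    """Apply y = W x to every token vector; there is no bias term."""
    return [[sum(w * x for w, x in zip(row, token)) for row in weight]
            for token in hidden_states]


# Toy "encoder output": two tokens, each a 3-dimensional hidden state.
encoder_out = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]

# What the decoder would receive under each initialization.
identity_out = apply_adapter(encoder_out, make_adapter(3, "identity"))
random_out = apply_adapter(encoder_out, make_adapter(3, "random"))
```

With identity weights the decoder sees exactly what the pre-trained encoder produced, so the modified model starts out equivalent to the original. With random weights the decoder's input is perturbed from the first step, which is one plausible reason the choice of initialization could matter when fine-tuning and when retraining.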