Staff View: A Semi-Supervised Text Generation Framework Combining a Deep Transformer and a GAN

Bibliographic Details
Main Author:	Wang, Shengquan
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.05937

_version_	1866912225923432448
author	Wang, Shengquan
author_facet	Wang, Shengquan
contents	This paper introduces a framework that couples a deep generative pre-trained Transformer language model with a generative adversarial network (GAN) for semi-supervised text generation. The proposed 24-layer Transformer is first pre-trained without supervision on a large and diverse text corpus. A simple GAN architecture for synthetic text generation is then introduced, with Gumbel-Softmax applied to handle the discreteness of tokens. The paper also presents a semi-supervised approach in which real data is augmented with GAN samples and the merged dataset is used to fine-tune the Transformer model. Detailed theoretical derivations are included, covering the proof of the min-max objective function and an extensive discussion of the Gumbel-Softmax reparameterization trick.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_05937
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Semi-Supervised Text Generation Framework Combining a Deep Transformer and a GAN
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2502.05937
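
The abstract says Gumbel-Softmax is applied to handle the discreteness of tokens. As a rough illustration of that general technique, and not the paper's own implementation, the sketch below draws a relaxed one-hot sample over a small vocabulary. The function name `gumbel_softmax`, the temperature parameter `tau`, and the example logits are assumptions made for this illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed (continuous) one-hot sample from unnormalized logits.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    eps = 1e-12
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), eps), 1.0 - eps)
        # Gumbel(0, 1) noise: g = -log(-log(u)).
        g = -math.log(-math.log(u))
        # Perturb the logit and scale by the temperature.
        noisy.append((logit + g) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    random.seed(0)
    # Assumed toy logits over a three-token vocabulary.
    sample = gumbel_softmax([1.0, 2.0, 3.0], tau=0.5)
    print(sample)
```

Lower values of `tau` push the sample toward a hard one-hot vector, while higher values smooth it toward uniform. Because the output is a differentiable function of the logits, a generator can in principle receive gradients from a discriminator through such samples.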