| Main Author: | Wang, Shengquan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computation and Language; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2502.05937 |
| _version_ | 1866912225923432448 |
|---|---|
| author | Wang, Shengquan |
| author_facet | Wang, Shengquan |
| contents | This paper introduces a framework that connects a deep generative pre-trained Transformer language model with a generative adversarial network (GAN) for semi-supervised text generation. The proposed 24-layer model is first pre-trained without supervision on a large and diverse text corpus. A simple GAN architecture for synthetic text generation is then introduced, with Gumbel-Softmax applied to handle the discreteness of tokens. The paper also presents a semi-supervised approach in which real data is augmented with GAN samples and the merged dataset is used to fine-tune the Transformer model. Detailed theoretical derivations are included, outlining the proof of the min-max objective function and an extensive discussion of the Gumbel-Softmax reparameterization trick. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2502_05937 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | A Semi-Supervised Text Generation Framework Combining a Deep Transformer and a GAN |
| topic | Computation and Language; Artificial Intelligence |
| url | https://arxiv.org/abs/2502.05937 |
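
The abstract in this record says Gumbel-Softmax is applied to handle the discreteness of tokens. As a rough illustration of that general technique only, and not the paper's implementation, the sketch below draws one relaxed sample over a toy vocabulary using the Python standard library. The function name `gumbel_softmax`, its parameters, and the example logits are all assumptions made for this illustration.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one relaxed (continuous) sample over a vocabulary.

    logits: unnormalized log-probabilities, one per token (hypothetical input).
    temperature: must be > 0; lower values push the output toward one-hot.
    rng: any object with a random() method returning floats in [0, 1).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Perturb each logit with Gumbel(0, 1) noise: g = -log(-log(u)).
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    sample = gumbel_softmax([2.0, 0.5, -1.0], temperature=0.5)
    print(sample)  # a probability vector; values differ on every run
```

The returned vector is a smooth stand-in for a one-hot token choice, which is what lets a generator be trained with gradients when the same computation is written in an automatic-differentiation framework. This pure-Python version computes no gradients and is meant only to show the sampling step.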