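Staff View: On SkipGram Word Embedding Models with Negative Sampling: Unified Framework and Impact of Noise Distributions :: Library Catalog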

Bibliographic Details
Main Authors:	Liu, Dezhi; Zhang, Richong; Wang, Ziqiao
Format:	Preprint
Published:	2020
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2009.04413

_version_	1866912741113987072
author	Liu, Dezhi; Zhang, Richong; Wang, Ziqiao
author_facet	Liu, Dezhi; Zhang, Richong; Wang, Ziqiao
contents	SkipGram word embedding models with negative sampling, or SGN for short, form an elegant family of word embedding models. In this paper, we formulate a framework for word embedding, referred to as Word-Context Classification (WCC), that generalizes SGN to a wide family of models. The framework, which uses "noise examples", is justified through theoretical analysis. The impact of the noise distribution on the learning of WCC embedding models is studied experimentally; the results suggest that the best noise distribution is, in fact, the data distribution, in terms of both embedding performance and speed of convergence during training. Along the way, we discover several novel embedding models that outperform existing WCC models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2009_04413
institution	arXiv
publishDate	2020
record_format	arxiv
title	On SkipGram Word Embedding Models with Negative Sampling: Unified Framework and Impact of Noise Distributions
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2009.04413

Similar Items