Staff View: Hierarchies over Vector Space: Orienting Word and Graph Embeddings

Bibliographic Details
Main Authors:	Guo, Xingzhi; Skiena, Steven
Format:	Preprint
Published:	2022
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2211.01430

_version_	1866915010571141120
author	Guo, Xingzhi; Skiena, Steven
author_facet	Guo, Xingzhi; Skiena, Steven
contents	Word and graph embeddings are widely used in deep learning applications. We present a data structure that captures inherent hierarchical properties from an unordered flat embedding space, particularly a sense of direction between pairs of entities. Inspired by the notion of distributional generality, our algorithm constructs an arborescence (a directed rooted tree) by inserting nodes in descending order of entity power (e.g., word frequency), pointing each entity to the closest more powerful node as its parent. We evaluate the performance of the resulting tree structures on three tasks: hypernym relation discovery, least-common-ancestor (LCA) discovery among words, and Wikipedia page link recovery. We achieve averages of 8.98% and 2.70% for hypernym and LCA discovery across five languages, and 62.76% accuracy on directed Wiki-page link recovery, with both substantially above baselines. Finally, we investigate the effect of insertion order, the power/similarity trade-off, and various power sources to optimize parent selection.
format	Preprint
id	arxiv_https___arxiv_org_abs_2211_01430
institution	arXiv
publishDate	2022
record_format	arxiv
title	Hierarchies over Vector Space: Orienting Word and Graph Embeddings
topic	Computation and Language
url	https://arxiv.org/abs/2211.01430
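
The construction summarized in the contents field (insert entities in descending order of power, then point each one to the closest more powerful entity) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name build_arborescence, the use of cosine similarity as the notion of "closest", the stable tie-breaking, and the toy vectors and frequencies are all assumptions made for this example.

```python
import numpy as np


def build_arborescence(vectors, power):
    """Return a parent index per entity; the root has parent -1.

    Assumptions (not taken from the record): closeness is cosine
    similarity, and ties in power keep their input order.
    """
    vectors = np.asarray(vectors, dtype=float)
    power = np.asarray(power, dtype=float)

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)

    # Insertion order: most powerful entity first (it becomes the root).
    order = np.argsort(-power, kind="stable")
    parent = np.full(len(order), -1, dtype=int)

    for rank in range(1, len(order)):
        node = order[rank]
        inserted = order[:rank]  # every entity inserted before this one
        sims = unit[inserted] @ unit[node]
        parent[node] = inserted[np.argmax(sims)]
    return parent


# Toy data (hypothetical): 2-D vectors and word frequencies as power.
words = ["animal", "dog", "cat", "puppy"]
vectors = [[1.0, 1.0], [1.0, 0.2], [0.2, 1.0], [1.0, 0.1]]
power = [100, 50, 40, 10]

parent = build_arborescence(vectors, power)
for word, p in zip(words, parent):
    print(word, "->", words[p] if p >= 0 else "(root)")
```

In this toy input, "puppy" attaches to "dog" rather than to the root because "dog" is its most similar entity among those inserted earlier. The loop compares each entity against all previously inserted ones, so the sketch is quadratic in the number of entities; a nearest-neighbor index would be needed for a realistic vocabulary.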
