Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Wu, Shihao; Yang, Junyi; Xu, Gongjun; Zhu, Ji
Format:	Preprint
Published:	2025
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2501.01541

_version_	1866917487347499008
author	Wu, Shihao; Yang, Junyi; Xu, Gongjun; Zhu, Ji
author_facet	Wu, Shihao; Yang, Junyi; Xu, Gongjun; Zhu, Ji
contents	Hypergraph data, which capture multi-way interactions among entities, are increasingly prevalent in the big data era. Generating new hyperlinks from an observed, usually high-dimensional hypergraph is an important yet challenging task with diverse applications in areas such as electronic health record analysis and biological research. This task is fraught with several challenges. The discrete nature of hyperlinks renders many existing generative models inapplicable. Additionally, powerful machine learning-based generative models often operate as black boxes, providing limited interpretability. Key structural characteristics of hypergraphs, including node degree heterogeneity and hyperlink sparsity, further complicate the modeling process and must be carefully addressed. To tackle these challenges, we propose Denoising Diffused Embeddings (DDE), a general and efficient generative modeling architecture for hypergraphs. DDE exploits low-rank structure in high-dimensional hypergraphs via a conditional hyperlink likelihood model that links discrete hyperlinks to a continuous latent embedding space and leverages a score-based diffusion model to reconstruct that space. Theoretically, we show that when true latent embeddings are accessible, DDE exactly reduces the task of generating new high-dimensional hyperlinks to generating new low-dimensional embeddings. Moreover, we analyze the implications of using estimated embeddings in DDE, revealing how hypergraph characteristics such as dimensionality, node degree heterogeneity, and hyperlink sparsity impact its generative performance. Simulation studies demonstrate the superiority of DDE over existing methods, in terms of both computational efficiency and generative performance. Furthermore, an application to a symptom co-occurrence hypergraph derived from electronic medical records uncovers interesting findings and highlights the advantages of DDE.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_01541
institution	arXiv
publishDate	2025
record_format	arxiv
title	Denoising Diffused Embeddings: a Generative Approach for Hypergraphs
topic	Methodology
url	https://arxiv.org/abs/2501.01541

Similar Items