Bibliographic Details
Main Authors: Cai, Tianxi, Xia, Dong, Zhang, Luwan, Zhou, Doudou
Format: Preprint
Published: 2022
Online Access: https://arxiv.org/abs/2209.13762
author Cai, Tianxi
Xia, Dong
Zhang, Luwan
Zhou, Doudou
contents Network analysis has been a powerful tool to unveil relationships and interactions among a large number of objects. Yet its effectiveness in accurately identifying important node-node interactions is challenged by rapidly growing network size, with data being collected at an unprecedented granularity and scale. The common approach to overcoming such high dimensionality is to collapse nodes into smaller groups and conduct connectivity analysis at the group level. Dividing the effort into two phases inevitably opens a gap in consistency and drives down efficiency. Consensus learning has emerged as a new norm for common knowledge discovery when multiple data sources are available. In this paper, we propose a unified multi-view sparse low-rank block model (msLBM) framework, which enables simultaneous grouping and connectivity analysis by combining multiple data sources. The msLBM framework efficiently represents overlapping information across large-scale concepts and accommodates different types of heterogeneity across sources. Both features are desirable when analyzing high-dimensional electronic health record (EHR) datasets from multiple health systems. An estimation procedure based on the alternating minimization algorithm is proposed. Our theoretical results demonstrate that a consensus knowledge graph can be learned more accurately by leveraging multi-source datasets, and that statistically optimal rates can be achieved under mild conditions. Applications to real-world EHR data suggest that our proposed msLBM algorithm can more reliably reveal network structure among clinical concepts by effectively combining summary-level EHR data from multiple health systems.
format Preprint
id arxiv_https___arxiv_org_abs_2209_13762
institution arXiv
publishDate 2022
record_format arxiv
title Consensus Knowledge Graph Learning via Multi-view Sparse Low Rank Block Model
topic Machine Learning
url https://arxiv.org/abs/2209.13762