Bibliographic Details
Main Authors: Li, Wenlin; Xu, Yucheng; Zheng, Xiaoqing; Han, Suoya; Wang, Jun; Sun, Xiaobo
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2409.01781
author Li, Wenlin
Xu, Yucheng
Zheng, Xiaoqing
Han, Suoya
Wang, Jun
Sun, Xiaobo
contents Sparse and noisy images (SNIs), like those in spatial gene expression data, pose significant challenges for effective representation learning and clustering, which are essential for thorough data analysis and interpretation. In response to these challenges, we propose Dual Advancement of Representation Learning and Clustering (DARLC), an innovative framework that leverages contrastive learning to enhance the representations derived from masked image modeling. Simultaneously, DARLC integrates cluster assignments in a cohesive, end-to-end approach. This integrated clustering strategy addresses the "class collision problem" inherent in contrastive learning, thus improving the quality of the resulting representations. To generate more plausible positive views for contrastive learning, we employ a graph attention network-based technique that produces denoised images as augmented data. As such, our framework offers a comprehensive approach that improves the learning of representations by enhancing their local perceptibility, distinctiveness, and the understanding of relational semantics. Furthermore, we utilize a Student's t mixture model to achieve more robust and adaptable clustering of SNIs. Extensive experiments, conducted across 12 different types of datasets consisting of SNIs, demonstrate that DARLC surpasses the state-of-the-art methods in both image clustering and generating image representations that accurately capture gene interactions. Code is available at https://github.com/zipging/DARLC.
format Preprint
id arxiv_https___arxiv_org_abs_2409_01781
institution arXiv
publishDate 2024
record_format arxiv
title Dual Advancement of Representation Learning and Clustering for Sparse and Noisy Images
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2409.01781