Bibliographic Details
Main Authors: Fang, Guanwen; Dai, Yao; Lin, Zesen; Zhou, Chichun; Song, Jie; Gu, Yizhou; Guo, Xiaotong; Mao, Anqi; Kong, Xu
Format: Preprint
Published: 2024
Subjects: Astrophysics of Galaxies
Online Access: https://arxiv.org/abs/2501.00380
_version_ 1866915087111946240
author Fang, Guanwen
Dai, Yao
Lin, Zesen
Zhou, Chichun
Song, Jie
Gu, Yizhou
Guo, Xiaotong
Mao, Anqi
Kong, Xu
contents In this work, we update the unsupervised machine learning (UML) step by proposing an algorithm based on ConvNeXt large model coding to improve the efficiency of unlabeled galaxy morphology classification. The method has three key aspects: (1) a convolutional autoencoder is used for image denoising and reconstruction, and the rotational invariance of the model is improved by polar coordinate extension; (2) a pre-trained convolutional neural network (CNN) named ConvNeXt is used to encode the image data, and the features are further compressed via principal component analysis (PCA) dimensionality reduction; (3) a bagging-based multi-model voting classification algorithm is adopted to enhance robustness. We applied this model to I-band images of a galaxy sample with $I_{\rm mag}< 25$ in the COSMOS field. Compared to the original unsupervised method, the number of clustering groups required by the new method is reduced from 100 to 20. In total, we classified about 53\% of the galaxies, significantly improving the classification efficiency. To verify the validity of the morphological classification, we selected massive galaxies with $M_{*}>10^{10}\,M_{\odot}$ for morphological parameter tests. The correspondence between the classification results and the physical properties of galaxies on multiple parameter surfaces is consistent with the existing evolution model. Our method demonstrates the feasibility of using large model encoding to classify galaxy morphology, which not only improves the efficiency of galaxy morphology classification but also saves time and manpower. Furthermore, in comparison to the original UML model, the enhanced classification performance is more evident in qualitative analysis and passes a greater number of parameter tests.
format Preprint
id arxiv_https___arxiv_org_abs_2501_00380
institution arXiv
publishDate 2024
record_format arxiv
title An efficient unsupervised classification model for galaxy morphology: Voting clustering based on coding from ConvNeXt large model
topic Astrophysics of Galaxies
url https://arxiv.org/abs/2501.00380
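
The voting step named in the abstract (aspect 3) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes k-means as the base clusterer, bootstrap resampling as the bagging step, farthest-first initialisation, and a rule that keeps a sample only when enough models agree on its cluster. All function names (fit_kmeans, align, vote_clusters) and the toy 2-D points, which stand in for PCA-compressed ConvNeXt features, are hypothetical.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two coordinate sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))


def fit_kmeans(sample, k, rng, iters=50):
    """Lloyd's k-means with farthest-first initialisation; returns the centers."""
    centers = [list(rng.choice(sample))]
    while len(centers) < k:
        # Next center: the sample point farthest from all centers chosen so far.
        far = max(sample, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in sample]
        for c in range(k):
            members = [p for p, lab in zip(sample, labels) if lab == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def align(reference, labels):
    """Rename each cluster in `labels` to the reference label it overlaps most."""
    mapping = {}
    for c in set(labels):
        overlap = Counter(r for r, lab in zip(reference, labels) if lab == c)
        mapping[c] = overlap.most_common(1)[0][0]
    return [mapping[lab] for lab in labels]


def vote_clusters(points, k, n_models, min_agree, seed=0):
    """Fit n_models k-means models on bootstrap resamples and vote per point.

    Returns one label per point, or None where fewer than min_agree models agree.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]  # bootstrap resample (bagging)
        centers = fit_kmeans(sample, k, rng)
        runs.append([nearest(p, centers) for p in points])
    aligned = [runs[0]] + [align(runs[0], r) for r in runs[1:]]
    voted = []
    for votes in zip(*aligned):
        label, count = Counter(votes).most_common(1)[0]
        voted.append(label if count >= min_agree else None)
    return voted


# Toy data: two well-separated blobs standing in for compressed image features.
rng = random.Random(0)
demo_points = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(20)]
demo_points += [(10 + rng.uniform(-0.5, 0.5), 10 + rng.uniform(-0.5, 0.5)) for _ in range(20)]
demo_labels = vote_clusters(demo_points, k=2, n_models=3, min_agree=3)
kept_fraction = sum(lab is not None for lab in demo_labels) / len(demo_labels)
```

Points labelled None are left unclassified. Under this assumed voting rule, only part of a sample ends up classified, which is the kind of outcome the classified fraction reported in the abstract (about 53\%) describes.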