Bibliographic Details
Main Authors: Han, Lu, Li, Mengyan, Qiang, Jiping, Su, Zhi
Format: Preprint
Published: 2025
Subjects: Machine Learning; Computation and Language
Online Access: https://arxiv.org/abs/2509.00546
author Han, Lu
Li, Mengyan
Qiang, Jiping
Su, Zhi
contents Heterogeneous data, which encompass both numerical financial variables and textual records, present substantial challenges for credit monitoring. To address this issue, we propose Advanced Spectral Clustering (ASC), a method that integrates financial and textual similarities through an optimized weight parameter and selects eigenvectors using a novel eigenvalue-silhouette optimization approach. Evaluated on a dataset comprising 1,428 small and medium-sized enterprises (SMEs), ASC achieves a Silhouette score that is 18% higher than that of a single-type data baseline method. Furthermore, the resulting clusters offer actionable insights; for instance, 51% of low-risk firms are found to include the term 'social recruitment' in their textual records. The robustness of ASC is confirmed across multiple clustering algorithms, including k-means, k-medians, and k-medoids, with ΔIntra/Inter < 0.13 and ΔSilhouette Coefficient < 0.02. By bridging spectral clustering theory with heterogeneous data applications, ASC enables the identification of meaningful clusters, such as recruitment-focused SMEs exhibiting a 30% lower default risk, thereby supporting more targeted and effective credit interventions.
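The abstract describes fusing financial and textual similarities through a weight parameter and judging cluster quality by the Silhouette score. The following is a minimal pure-Python sketch of that one idea only, not the authors' ASC implementation: the function names (`fuse`, `silhouette`, `best_alpha`), the convex-combination form of the fusion, the distance definition `1 - similarity`, and the toy matrices are all assumptions made for illustration. It holds the cluster labels fixed while scanning the weight, whereas a real pipeline would presumably recluster for each weight; the eigenvector selection step is not shown.

```python
def fuse(sim_fin, sim_text, alpha):
    """Convex combination of two n x n similarity matrices (assumed fusion form)."""
    n = len(sim_fin)
    return [[alpha * sim_fin[i][j] + (1 - alpha) * sim_text[i][j]
             for j in range(n)] for i in range(n)]


def silhouette(sim, labels):
    """Mean silhouette coefficient, using distance = 1 - similarity.

    Requires at least two distinct labels.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(1 - sim[i][j] for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to any other cluster
        b = min(sum(1 - sim[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def best_alpha(sim_fin, sim_text, labels, grid):
    """Grid-search the weight that maximizes the silhouette for fixed labels."""
    return max(((alpha, silhouette(fuse(sim_fin, sim_text, alpha), labels))
                for alpha in grid), key=lambda pair: pair[1])


# Toy data (hypothetical): financial similarity separates two groups cleanly,
# textual similarity is uninformative.
SIM_FIN = [[1.0, 0.9, 0.1, 0.1],
           [0.9, 1.0, 0.1, 0.1],
           [0.1, 0.1, 1.0, 0.9],
           [0.1, 0.1, 0.9, 1.0]]
SIM_TEXT = [[1.0, 0.5, 0.5, 0.5],
            [0.5, 1.0, 0.5, 0.5],
            [0.5, 0.5, 1.0, 0.5],
            [0.5, 0.5, 0.5, 1.0]]
LABELS = [0, 0, 1, 1]

if __name__ == "__main__":
    alpha, score = best_alpha(SIM_FIN, SIM_TEXT, LABELS, [0.0, 0.25, 0.5, 0.75, 1.0])
    print(f"best alpha = {alpha}, silhouette = {score:.3f}")
```

On this toy input the search favors the financial similarity entirely, because the textual matrix carries no group structure. With real data the two sources would each contribute signal and the preferred weight would typically fall strictly between 0 and 1.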
format Preprint
id arxiv_https___arxiv_org_abs_2509_00546
institution arXiv
publishDate 2025
record_format arxiv
title Advanced spectral clustering for heterogeneous data in credit risk monitoring systems
topic Machine Learning
Computation and Language
url https://arxiv.org/abs/2509.00546