Bibliographic Details
Main Authors: Bakker, Samirah; Ma, Yao; Ziabari, Seyed Sahand Mohammadi
Format: Preprint
Published: 2025
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2507.01924
author Bakker, Samirah
Ma, Yao
Ziabari, Seyed Sahand Mohammadi
contents The complexity of mental healthcare billing enables anomalies, including fraud. While machine learning methods have been applied to anomaly detection, they often struggle with class imbalance, label scarcity, and complex sequential patterns. This study explores a hybrid deep learning approach combining Long Short-Term Memory (LSTM) networks and Transformers, with pseudo-labeling via Isolation Forests (iForest) and Autoencoders (AE). Prior work has not evaluated such hybrid models trained on pseudo-labeled data in the context of healthcare billing. The approach is evaluated on two real-world billing datasets related to mental healthcare. The iForest LSTM baseline achieves the highest recall (0.963) on declaration-level data. On the operation-level data, the hybrid iForest-based model achieves the highest recall (0.744), though at the cost of lower precision. These findings highlight the potential of combining pseudo-labeling with hybrid deep learning in complex, imbalanced anomaly detection settings.
format Preprint
id arxiv_https___arxiv_org_abs_2507_01924
institution arXiv
publishDate 2025
record_format arxiv
title Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2507.01924
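
The abstract in this record describes pseudo-labeling with Isolation Forests (iForest) before training a supervised sequence model. The sketch below illustrates only that pseudo-labeling step, using a small standard-library isolation forest. It is not the authors' implementation: the feature names (billed amount, session count), the synthetic data, the contamination rate, and all function names are assumptions made for illustration. The defaults of 100 trees and a subsample size of 256 follow the values commonly used for iForest.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:  # no split possible on this feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(row, node):
    """Depth at which row lands in a leaf, adjusted for the leaf's unsplit points."""
    depth = 0
    while "size" not in node:
        node = node["left"] if row[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score per row in (0, 1); higher means easier to isolate."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(n_trees)]
    norm = c_factor(psi)
    return [
        2.0 ** (-(sum(path_length(r, t) for t in trees) / n_trees) / norm)
        for r in rows
    ]


def pseudo_labels(rows, contamination=0.025, **kwargs):
    """Label the top `contamination` fraction of rows by score as 1 (anomalous)."""
    scores = anomaly_scores(rows, **kwargs)
    k = max(1, round(contamination * len(rows)))
    ranked = sorted(range(len(rows)), key=scores.__getitem__, reverse=True)
    flagged = set(ranked[:k])
    return [1 if i in flagged else 0 for i in range(len(rows))]


# Hypothetical data: (billed amount, session count) for 120 ordinary records
# plus three deliberately extreme records appended at the end.
data_rng = random.Random(42)
rows = [(data_rng.gauss(100, 10), data_rng.gauss(10, 2)) for _ in range(120)]
rows += [(900.0, 60.0), (5.0, 80.0), (1200.0, 1.0)]
labels = pseudo_labels(rows, contamination=0.025)
```

In a pipeline like the one the abstract outlines, labels of this kind would stand in for missing ground truth when training a supervised model such as an LSTM or a hybrid LSTM-Transformer. In practice a library implementation such as scikit-learn's `IsolationForest` would normally replace the hand-written forest shown here.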