Bibliographic Details
Main Authors: Xiao, Fei, Cai, Shaofeng, Chen, Gang, Jagadish, H. V., Ooi, Beng Chin, Zhang, Meihui
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2408.00513
author Xiao, Fei
Cai, Shaofeng
Chen, Gang
Jagadish, H. V.
Ooi, Beng Chin
Zhang, Meihui
contents Fraud detection presents a challenging task characterized by ever-evolving fraud patterns and scarce labeled data. Existing methods predominantly rely on graph-based or sequence-based approaches. While graph-based approaches connect users through shared entities to capture structural information, they remain vulnerable to fraudsters who can disrupt or manipulate these connections. In contrast, sequence-based approaches analyze users' behavioral patterns, offering robustness against tampering but overlooking the interactions between similar users. Inspired by cohort analysis in retention and healthcare, this paper introduces VecAug, a novel cohort-augmented learning framework that addresses these challenges by enhancing the representation learning of target users with personalized cohort information. To this end, we first propose a vector burn-in technique for automatic cohort identification, which retrieves a task-specific cohort for each target user. Then, to fully exploit the cohort information, we introduce an attentive cohort aggregation technique for augmenting target user representations. To improve the robustness of such cohort augmentation, we also propose a novel label-aware cohort neighbor separation mechanism to distance negative cohort neighbors and calibrate the aggregated cohort information. By integrating this cohort information with target user representations, VecAug enhances the modeling capacity and generalization capabilities of the model to be augmented. Our framework is flexible and can be seamlessly integrated with existing fraud detection models. We deploy our framework on e-commerce platforms and evaluate it on three fraud detection datasets, and results show that VecAug improves the detection performance of base models by up to 2.48% in AUC and 22.5% in R@P$_{0.9}$, outperforming state-of-the-art methods significantly.
format Preprint
id arxiv_https___arxiv_org_abs_2408_00513
institution arXiv
publishDate 2024
record_format arxiv
title VecAug: Unveiling Camouflaged Frauds with Cohort Augmentation for Enhanced Detection
topic Machine Learning
url https://arxiv.org/abs/2408.00513
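
Illustrative note: the abstract describes retrieving a cohort of similar users for each target user and attentively aggregating the cohort into the target representation. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes user embeddings already exist (for example, produced by some base model), substitutes plain cosine-similarity top-k search for the paper's vector burn-in retrieval, and uses scaled dot-product softmax weights as a stand-in for attentive cohort aggregation. The label-aware neighbor separation step is a training-time mechanism and is not shown. All function names and parameters (retrieve_cohort, attentive_aggregate, augment, k) are hypothetical.

```python
import numpy as np


def retrieve_cohort(target, bank, k):
    """Return indices and cosine similarities of the k bank vectors closest to target.

    Assumption: cosine top-k search stands in for the paper's retrieval step.
    """
    bank_n = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    target_n = target / np.linalg.norm(target)
    sims = bank_n @ target_n
    top = np.argsort(-sims)[:k]  # highest similarity first
    return top, sims[top]


def attentive_aggregate(target, neighbors):
    """Softmax-weighted sum of neighbor vectors, scored against the target.

    Assumption: scaled dot-product scores stand in for the paper's attention.
    """
    scores = neighbors @ target / np.sqrt(target.shape[0])
    scores = scores - scores.max()  # numerical stability before exp
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ neighbors, weights


def augment(target, bank, k=3):
    """Concatenate the target embedding with its aggregated cohort embedding."""
    idx, _ = retrieve_cohort(target, bank, k)
    cohort_vec, weights = attentive_aggregate(target, bank[idx])
    return np.concatenate([target, cohort_vec]), idx, weights
```

In this reading, the augmented vector would be passed to the downstream fraud classifier in place of the original embedding. Concatenation is only one way to combine the two representations; the abstract does not specify how the integration is done.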