Bibliographic Details
Main Authors: Ding, Zhanyi, Wang, Zhongyan, Zhang, Yeyubei, Cao, Yuchen, Liu, Yunchong, Shen, Xiaorui, Tian, Yexin, Dai, Jianglai
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2503.01082
author Ding, Zhanyi
Wang, Zhongyan
Zhang, Yeyubei
Cao, Yuchen
Liu, Yunchong
Shen, Xiaorui
Tian, Yexin
Dai, Jianglai
contents Social media platforms provide valuable insights into mental health trends by capturing user-generated discussions on conditions such as depression, anxiety, and suicidal ideation. Machine learning (ML) and deep learning (DL) models have been increasingly applied to classify mental health conditions from textual data, but selecting the most effective model involves trade-offs in accuracy, interpretability, and computational efficiency. This study evaluates multiple ML models, including logistic regression, random forest, and LightGBM, alongside deep learning architectures such as ALBERT and Gated Recurrent Units (GRUs), for both binary and multi-class classification of mental health conditions. Our findings indicate that ML and DL models achieve comparable classification performance on medium-sized datasets, with ML models offering greater interpretability through variable importance scores, while DL models are more robust to complex linguistic patterns. Additionally, ML models require explicit feature engineering, whereas DL models learn hierarchical representations directly from text. Logistic regression provides the advantage of capturing both positive and negative associations between features and mental health conditions, whereas tree-based models prioritize decision-making power through split-based feature selection. This study offers empirical insights into the advantages and limitations of different modeling approaches and provides recommendations for selecting appropriate methods based on dataset size, interpretability needs, and computational constraints.
format Preprint
id arxiv_https___arxiv_org_abs_2503_01082
institution arXiv
publishDate 2025
record_format arxiv
title Efficient or Powerful? Trade-offs Between Machine Learning and Deep Learning for Mental Illness Detection on Social Media
topic Computation and Language
url https://arxiv.org/abs/2503.01082