Salvato in:
Dettagli Bibliografici
Autori principali: Ma, Tian, Feng, Kaiyu, Rong, Yu, Zhao, Kangfei
Natura: Preprint
Pubblicazione: 2025
Soggetti:
Accesso online:https://arxiv.org/abs/2509.04461
Tags: Aggiungi Tag
Nessun Tag, puoi essere il primo ad aggiungerne!!
_version_ 1866918135990321152
author Ma, Tian
Feng, Kaiyu
Rong, Yu
Zhao, Kangfei
author_facet Ma, Tian
Feng, Kaiyu
Rong, Yu
Zhao, Kangfei
contents Personality prediction from social media posts is a critical task that implies diverse applications in psychology and sociology. The Myers Briggs Type Indicator (MBTI), a popular personality inventory, has been traditionally predicted by machine learning (ML) and deep learning (DL) techniques. Recently, the success of Large Language Models (LLMs) has revealed their huge potential in understanding and inferring personality traits from social media content. However, directly exploiting LLMs for MBTI prediction faces two key challenges: the hallucination problem inherent in LLMs and the naturally imbalanced distribution of MBTI types in the population. In this paper, we propose PostToPersonality (PtoP), a novel LLM based framework for MBTI prediction from social media posts of individuals. Specifically, PtoP leverages Retrieval Augmented Generation with in context learning to mitigate hallucination in LLMs. Furthermore, we fine tune a pretrained LLM to improve model specification in MBTI understanding with synthetic minority oversampling, which balances the class imbalance by generating synthetic samples. Experiments conducted on a real world social media dataset demonstrate that PtoP achieves state of the art performance compared with 10 ML and DL baselines.
format Preprint
id arxiv_https___arxiv_org_abs_2509_04461
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle From Post To Personality: Harnessing LLMs for MBTI Prediction in Social Media
Ma, Tian
Feng, Kaiyu
Rong, Yu
Zhao, Kangfei
Computation and Language
Social and Information Networks
Personality prediction from social media posts is a critical task that implies diverse applications in psychology and sociology. The Myers Briggs Type Indicator (MBTI), a popular personality inventory, has been traditionally predicted by machine learning (ML) and deep learning (DL) techniques. Recently, the success of Large Language Models (LLMs) has revealed their huge potential in understanding and inferring personality traits from social media content. However, directly exploiting LLMs for MBTI prediction faces two key challenges: the hallucination problem inherent in LLMs and the naturally imbalanced distribution of MBTI types in the population. In this paper, we propose PostToPersonality (PtoP), a novel LLM based framework for MBTI prediction from social media posts of individuals. Specifically, PtoP leverages Retrieval Augmented Generation with in context learning to mitigate hallucination in LLMs. Furthermore, we fine tune a pretrained LLM to improve model specification in MBTI understanding with synthetic minority oversampling, which balances the class imbalance by generating synthetic samples. Experiments conducted on a real world social media dataset demonstrate that PtoP achieves state of the art performance compared with 10 ML and DL baselines.
title From Post To Personality: Harnessing LLMs for MBTI Prediction in Social Media
topic Computation and Language
Social and Information Networks
url https://arxiv.org/abs/2509.04461