Bibliographic Details
Main Authors: Wang, Rui; Zheng, Yi; Wang, Dongxin; Huang, Haiping; Yao, Yuanzhi; Zhou, Yuxiang; Yu, Jialin; Torr, Philip
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2604.12663
Table of Contents:
  • Existing topic modeling methods, from LDA to recent neural and LLM-based approaches, focus mainly on statistical coherence and often produce redundant or off-target topics that miss the user's underlying intent. We introduce Human-centric Topic Modeling (Human-TM), a novel task formulation that integrates a human-provided goal directly into the topic modeling process to produce interpretable, diverse, and goal-oriented topics. To tackle this task, we propose the Goal-prompted Contrastive Topic Model with Optimal Transport (GCTM-OT), which first uses LLM-based prompting to extract goal candidates from documents and then incorporates these candidates into semantic-aware contrastive learning via optimal transport for topic discovery. Experimental results on three public subreddit datasets show that GCTM-OT outperforms state-of-the-art baselines in topic coherence and diversity while significantly improving alignment with human-provided goals, paving the way for more human-centric topic discovery systems.
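
The abstract names optimal transport as the mechanism that links documents to goal candidates but gives no formulation. As a rough illustration only, the sketch below computes an entropy-regularized transport plan with Sinkhorn iterations. It is not the authors' GCTM-OT implementation: the function name `sinkhorn`, the toy cost matrix (imagined as 1 minus cosine similarity between three documents and two goal candidates), the uniform marginals, and the values of `eps` and `n_iters` are all assumptions made for this example.

```python
import math

def sinkhorn(cost, a, b, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    cost[i][j] is the (hypothetical) cost of matching document i to goal j.
    """
    n, m = len(a), len(b)
    # Gibbs kernel: low cost gives a large kernel value.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy data (assumed, not from the paper): 3 documents, 2 goal candidates.
cost = [[0.1, 0.9],
        [0.8, 0.2],
        [0.5, 0.5]]
a = [1.0 / 3] * 3   # uniform mass over documents
b = [0.5, 0.5]      # uniform mass over goal candidates

plan = sinkhorn(cost, a, b)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(2)]
```

In this toy setting the plan sends most of the first document's mass to the first goal candidate and most of the second document's mass to the second, while the third document, equally distant from both, is split between them. How such a plan would feed into a contrastive objective is specific to the paper and is not sketched here.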