Bibliographic Details
Main Authors: Li, Quan; Jing, Shixiong; Chen, Lingwei
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2401.06802
author Li, Quan
Jing, Shixiong
Chen, Lingwei
author_facet Li, Quan
Jing, Shixiong
Chen, Lingwei
contents The popularization of social media increases user engagement and generates a large amount of user-oriented data. Among these data, text (e.g., tweets, blogs) attracts researchers and speculators who seek to infer user attributes (e.g., age, gender, location) for their own purposes. This line of work generally casts attribute inference as a text classification problem and has begun to leverage graph neural networks (GNNs) to exploit higher-level representations of source texts. However, these text graphs are constructed over words, so they suffer from high memory consumption and are ineffective when few labeled texts are available. To address this challenge, we design a text-graph-based few-shot learning model for attribute inference on social media text data. Our model first constructs and refines a text graph using manifold learning and message passing, which offers a better trade-off between expressiveness and complexity. Then, to further use cross-domain texts and unlabeled texts to improve few-shot performance, a hierarchical knowledge distillation is devised over the text graph, which derives better text representations and improves the model's generalization ability. Experiments on social media datasets demonstrate the state-of-the-art performance of our model on attribute inference with considerably fewer labeled texts.
format Preprint
id arxiv_https___arxiv_org_abs_2401_06802
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Hierarchical Knowledge Distillation on Text Graph for Data-limited Attribute Inference
Li, Quan
Jing, Shixiong
Chen, Lingwei
Computation and Language
Machine Learning
Social and Information Networks
title Hierarchical Knowledge Distillation on Text Graph for Data-limited Attribute Inference
topic Computation and Language
Machine Learning
Social and Information Networks
url https://arxiv.org/abs/2401.06802