Bibliographic Details
Main Authors: Rony, Sidharth, Patman, Jack
Format: Preprint
Published: 2025
Subjects: Computation and Language, Machine Learning, General Economics, Economics
Online Access: https://arxiv.org/abs/2511.23057
author Rony, Sidharth
Patman, Jack
author_facet Rony, Sidharth
Patman, Jack
contents Standard Occupational Classifiers (SOC) are systems that categorize jobs and occupations by their similarities in job duties, skills, and qualifications. Integrating these facets with Big Data from job advertisements offers the prospect of investigating occupation-specific labour demand. This project investigates the use of recent developments in natural language processing to construct a classifier capable of assigning an occupation code to a given job advertisement. We develop various classifiers for both UK ONS SOC and US O*NET SOC, using different language models. We find that an ensemble model, which combines Google BERT and a neural network classifier while considering job title, description, and skills, achieved the highest prediction accuracy. Specifically, the ensemble model exhibited a classification accuracy of up to 61% for the lower (or fourth) tier of SOC, and 72% for the third tier of SOC. This model could provide up-to-date, accurate information on the evolution of the labour market using job advertisements.
format Preprint
id arxiv_https___arxiv_org_abs_2511_23057
institution arXiv
publishDate 2025
record_format arxiv
title Standard Occupation Classifier -- A Natural Language Processing Approach
topic Computation and Language
Machine Learning
General Economics
Economics
url https://arxiv.org/abs/2511.23057
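
The abstract reports accuracy separately for the fourth and third tiers of SOC. A minimal sketch of how such tier-level accuracy could be computed is given here, assuming four-digit codes in which each tier is a prefix of the next (as in UK ONS SOC). The function names and the sample codes are hypothetical illustrations, not taken from the preprint, and O*NET-SOC codes use a different format that would need its own truncation rule.

```python
from typing import Sequence


def truncate_code(code: str, tier: int) -> str:
    """Keep the first `tier` digits of a four-digit SOC-style code.

    Assumption: tier k corresponds to the first k digits of the code.
    """
    return code[:tier]


def tier_accuracy(
    true_codes: Sequence[str],
    predicted_codes: Sequence[str],
    tier: int,
) -> float:
    """Share of predictions that match the true code at the given tier."""
    if len(true_codes) != len(predicted_codes):
        raise ValueError("true_codes and predicted_codes must have equal length")
    if not true_codes:
        raise ValueError("at least one code is required")
    matches = sum(
        truncate_code(t, tier) == truncate_code(p, tier)
        for t, p in zip(true_codes, predicted_codes)
    )
    return matches / len(true_codes)


# Made-up four-digit codes, used only to exercise the functions.
true_codes = ["2136", "2136", "3545", "4159"]
predicted_codes = ["2136", "2139", "3545", "4122"]

fourth_tier = tier_accuracy(true_codes, predicted_codes, tier=4)
third_tier = tier_accuracy(true_codes, predicted_codes, tier=3)
print(f"fourth tier: {fourth_tier:.2f}, third tier: {third_tier:.2f}")
```

Because a prediction that is wrong in the last digit can still be right at the coarser level, accuracy at the third tier is never lower than at the fourth, which is consistent with the ordering of the figures quoted in the abstract.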