Bibliographic Details
Main Authors: Siddiqui, Saad Mashkoor; Sheikh, Mohammad Ali; Aleem, Muhammad; Singh, Kajol R
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2501.08271
Abstract:
  • In this work, we investigate the efficacy of various adapter architectures on supervised binary classification tasks from the SuperGLUE benchmark as well as a supervised multi-class news category classification task from Kaggle. Specifically, we compare classification performance and time complexity of three transformer models, namely DistilBERT, ELECTRA, and BART, using conventional fine-tuning as well as nine state-of-the-art (SoTA) adapter architectures. Our analysis reveals performance differences across adapter architectures, highlighting their ability to achieve comparable or better performance relative to fine-tuning at a fraction of the training time. Similar results are observed on the new classification task, further supporting our findings and demonstrating adapters as efficient and flexible alternatives to fine-tuning. This study provides valuable insights and guidelines for selecting and implementing adapters in diverse natural language processing (NLP) applications.