Bibliographic Details
Main Author: Kamfonas, Michael
Format: Preprint
Published: 2025
Subjects: Computation and Language; Machine Learning
Online Access: https://arxiv.org/abs/2505.09792
_version_ 1866913839826599936
author Kamfonas, Michael
author_facet Kamfonas, Michael
contents This case study applies a phased hyperparameter optimization process to compare multitask natural language model variants that use multiphase learning rate scheduling and optimizer parameter grouping. We employ short Bayesian optimization sessions that combine multi-fidelity evaluation, hyperparameter space pruning, progressive halving, and a degree of human guidance. We use the Optuna TPE sampler and Hyperband pruner, as well as the Scikit-Learn Gaussian process minimization. Initially, efficient low-fidelity sprints prune the hyperparameter space. Subsequent sprints progressively increase model fidelity and employ Hyperband pruning for efficiency. A second aspect of our approach is a meta-learner that tunes the threshold values used to resolve classification probabilities during inference. We demonstrate our method on a collection of variants of the 2021 Joint Entity and Relation Extraction model proposed by Eberts and Ulges.
format Preprint
id arxiv_https___arxiv_org_abs_2505_09792
institution arXiv
publishDate 2025
record_format arxiv
title Interim Report on Human-Guided Adaptive Hyperparameter Optimization with Multi-Fidelity Sprints
topic Computation and Language
Machine Learning
url https://arxiv.org/abs/2505.09792
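
The abstract mentions progressive halving across sprints of increasing fidelity. As a rough illustration only, the sketch below assumes this resembles generic successive halving: score every candidate at a small budget, keep the best fraction, and raise the budget for the survivors. It is not the author's implementation and does not use Optuna. The names successive_halving, toy_loss, min_budget, and eta are hypothetical, and toy_loss is a made-up stand-in for a real training run.

```python
import random


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the config that survives repeated keep-the-best-1/eta rounds.

    `evaluate(config, budget)` must return a loss (lower is better).
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the remaining configs at the current (cheap) fidelity.
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        # Keep the top 1/eta, but never fewer than one config.
        survivors = ranked[: max(1, len(ranked) // eta)]
        # Survivors earn a larger budget in the next round.
        budget *= eta
    return survivors[0]


def toy_loss(cfg, budget):
    # Hypothetical objective: distance of the learning rate from 1e-3,
    # plus a term that shrinks as the budget (fidelity) grows.
    return abs(cfg["lr"] - 1e-3) + 1.0 / budget


rng = random.Random(0)
candidates = [{"lr": 10 ** rng.uniform(-5, -1)} for _ in range(8)]
best = successive_halving(candidates, toy_loss)
print(best)
```

With eight candidates and eta=2, the loop runs three rounds (8 to 4 to 2 to 1) at budgets 1, 2, and 4. In a real study the evaluate function would train a model for the given budget, and the low-fidelity ranking would be noisy rather than exact as it is in this toy objective.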