Bibliographic Details
Main Author: Svensson, Valentine
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2508.04111
author Svensson, Valentine
author_facet Svensson, Valentine
contents Negative binomial regression is essential for analyzing over-dispersed count data in comparative studies, but parameter estimation becomes computationally challenging in large screens requiring millions of comparisons. We investigate using a pre-trained transformer to produce estimates of negative binomial regression parameters from observed count data, trained through synthetic data generation to learn to invert the process of generating counts from parameters. The transformer method achieved better parameter accuracy than maximum likelihood optimization while being 20 times faster. However, comparisons unexpectedly revealed that method-of-moments estimates performed as well as maximum likelihood optimization in accuracy, while being 1,000 times faster and producing better-calibrated and more powerful tests, making it the most efficient solution for this application.
format Preprint
id arxiv_https___arxiv_org_abs_2508_04111
institution arXiv
publishDate 2025
record_format arxiv
title Negative binomial regression and inference using a pre-trained transformer
topic Machine Learning
url https://arxiv.org/abs/2508.04111
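
The abstract reports that method-of-moments estimates matched maximum likelihood in accuracy at a fraction of the cost. The record does not describe the preprint's exact estimator, so the following is only a minimal sketch of the textbook approach, assuming the NB2 parameterization Var = mu + alpha * mu^2 and a simple two-group comparison. The function names `nb_method_of_moments` and `log_fold_change` are hypothetical and are not taken from the preprint.

```python
import math
import statistics


def nb_method_of_moments(counts):
    """Return (mean, dispersion) under the assumed NB2 model Var = mu + alpha * mu**2.

    Assumption: this is the textbook moment estimator, not necessarily
    the one used in the preprint.
    """
    mu = statistics.fmean(counts)
    var = statistics.variance(counts)  # unbiased sample variance, needs >= 2 values
    if mu <= 0:
        return mu, 0.0
    # Solve var = mu + alpha * mu**2 for alpha; clamp at 0 when the
    # sample is not over-dispersed relative to a Poisson distribution.
    alpha = max((var - mu) / mu**2, 0.0)
    return mu, alpha


def log_fold_change(control, treated):
    """Natural-log ratio of group means, the effect size in a two-group comparison."""
    mu_control, _ = nb_method_of_moments(control)
    mu_treated, _ = nb_method_of_moments(treated)
    return math.log(mu_treated / mu_control)


if __name__ == "__main__":
    mean, dispersion = nb_method_of_moments([2, 4, 6, 8, 30])
    print(mean, dispersion)
    print(log_fold_change([10, 10], [20, 20]))
```

Because both estimates are closed-form functions of the sample mean and variance, no iterative optimization is needed, which is consistent with the large speed advantage the abstract describes.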