Bibliographic details
Main authors: Nwaigwe, Dwight; Rychlik, Marek
Format: Preprint
Published: 2020
Subjects:
Online access: https://arxiv.org/abs/2012.04576
_version_ 1866916238247067648
author Nwaigwe, Dwight
Rychlik, Marek
author_facet Nwaigwe, Dwight
Rychlik, Marek
contents We revisit the problem of the existence of the maximum likelihood estimate for multi-class logistic regression. We show that one method of ensuring its existence is by assigning positive probability to every class in the sample dataset. The notion of data separability is not needed, which is in contrast to the classical setup of multi-class logistic regression in which each data sample belongs to one class. We also provide a general and constructive estimate of the convergence rate to the maximum likelihood estimate when gradient descent is used as the optimizer. Our estimate involves bounding the condition number of the Hessian of the maximum likelihood function. The approaches used in this article rely on a simple operator-theoretic framework.
format Preprint
id arxiv_https___arxiv_org_abs_2012_04576
institution arXiv
publishDate 2020
record_format arxiv
title On the existence of the maximum likelihood estimate and convergence rate under gradient descent for multi-class logistic regression
topic Machine Learning
Statistics Theory
62J12, 65K10, 47N10
url https://arxiv.org/abs/2012.04576
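
The abstract's central claim, that giving every class positive probability guarantees a finite maximum likelihood estimate even on separable data, can be illustrated with a small experiment. The following is a minimal sketch and not the authors' construction: it assumes label smoothing as the way of assigning positive probability, and the names `soft_targets`, `eps`, `step` and `iters`, as well as the toy dataset, are hypothetical choices made for illustration. It runs plain gradient descent on the average negative log-likelihood and prints the weight norm and gradient norm for one-hot targets (`eps = 0.0`) and smoothed targets (`eps = 0.1`).

```python
import math


def softmax(scores):
    """Numerically stable softmax of a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_targets(label, num_classes, eps):
    """Target distribution for one sample; eps > 0 makes every class positive.

    eps = 0.0 reproduces the classical one-hot target.
    """
    base = eps / num_classes
    return [base + ((1.0 - eps) if k == label else 0.0)
            for k in range(num_classes)]


def predict(W, x):
    """Class probabilities for features x; W[k] is the weight row of class k."""
    return softmax([sum(wj * xj for wj, xj in zip(w, x)) for w in W])


def gradient(W, X, Y):
    """Gradient of the average negative log-likelihood with respect to W."""
    n = len(X)
    grad = [[0.0] * len(X[0]) for _ in W]
    for x, y in zip(X, Y):
        p = predict(W, x)
        for k in range(len(W)):
            for j, xj in enumerate(x):
                grad[k][j] += (p[k] - y[k]) * xj / n
    return grad


def frobenius(M):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in M for v in row))


def gradient_descent(X, Y, step, iters):
    """Fixed-step gradient descent started from the zero matrix."""
    W = [[0.0] * len(X[0]) for _ in Y[0]]
    for _ in range(iters):
        G = gradient(W, X, Y)
        W = [[w - step * g for w, g in zip(w_row, g_row)]
             for w_row, g_row in zip(W, G)]
    return W


# Toy data (assumed): a bias feature plus one coordinate, linearly separable.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]

for eps in (0.0, 0.1):
    Y = [soft_targets(c, 2, eps) for c in labels]
    for iters in (200, 2000):
        W = gradient_descent(X, Y, step=0.5, iters=iters)
        print("eps", eps, "iters", iters,
              "weight norm", frobenius(W),
              "gradient norm", frobenius(gradient(W, X, Y)))
```

With one-hot targets the weight norm keeps growing as the iteration count increases, which reflects the absence of a finite maximizer on separable data. With smoothed targets the weight norm stabilizes and the gradient norm shrinks toward zero. The sketch does not estimate the Hessian condition number that the paper uses to bound the convergence rate.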