Bibliographic Details
Main Authors: Gómez, Román Salmerón; García, Catalina García
Format: Preprint
Published: 2025
Subjects: Methodology
Online Access: https://arxiv.org/abs/2503.04330
author Gómez, Román Salmerón
García, Catalina García
contents This paper shows that the degree of approximate multicollinearity in a linear regression model increases simply by including more independent variables, even if these are not highly linearly related. Because linear models with a large number of independent variables are now relatively easy to find, this effect can lead to the erroneous conclusion that a model suffers from a worrying degree of approximate multicollinearity. To avoid this, an adjusted variance inflation factor is proposed that compensates for the presence of a large number of independent variables in the multiple linear regression model. The proposal has a direct impact on variable selection models based on influence relationships: it yields a new decision criterion for the individual significance test, to be considered in stepwise regression models or even directly in a multiple linear regression model.
format Preprint
id arxiv_https___arxiv_org_abs_2503_04330
institution arXiv
publishDate 2025
record_format arxiv
title Stepwise regression revisited
topic Methodology
url https://arxiv.org/abs/2503.04330
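
The abstract's first claim, that the variance inflation factor (VIF) grows as independent variables are added even when they are not strongly related, can be illustrated with a small simulation. The sketch below is only an assumption-laden illustration, not the authors' method: it uses the standard VIF (the diagonal of the inverse correlation matrix), draws independent Gaussian predictors, and reports the largest VIF for nested sets of columns. The function names `vif` and `max_vif_by_size`, the sample size, and the seed are hypothetical choices. The adjusted variance inflation factor proposed in the paper is not reproduced here, because this record does not state its formula.

```python
import numpy as np


def vif(X):
    """Standard VIFs: the diagonal of the inverse of the correlation matrix.

    Equivalent to 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (with an intercept).
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def max_vif_by_size(n=50, sizes=(2, 5, 10, 20, 30, 40), seed=0):
    """Largest VIF among the first p columns, for each p in `sizes`.

    The columns are independent standard normal draws (an assumption made
    for illustration), so any inflation comes from the number of
    predictors relative to n, not from a designed linear relation.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, max(sizes)))
    # Nested column sets: each larger model contains the smaller ones.
    return {p: float(vif(X[:, :p]).max()) for p in sizes}


if __name__ == "__main__":
    for p, value in max_vif_by_size().items():
        print(f"p = {p:2d}   max VIF = {value:.2f}")
```

Because each R_j^2 cannot decrease when regressors are added, the largest VIF over nested column sets is non-decreasing in p, which is the effect the abstract describes.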