Bibliographic Details
Main Authors: Ungar, Kevin; Oprean-Stan, Camelia
Format: Preprint
Published: 2025
Subjects: Econometrics
Online Access: https://arxiv.org/abs/2501.06587
author Ungar, Kevin
Oprean-Stan, Camelia
author_facet Ungar, Kevin
Oprean-Stan, Camelia
contents This article presents a methodology for processing financial data on Apple Inc., covering quarterly income and daily stock prices from March 31, 2009, to December 31, 2023. Using 60 quarterly income observations sourced from Macrotrends and 3774 daily stock price observations sourced from Yahoo Finance, the study constructs five distinct datasets through different preprocessing techniques. It explains the aggregation, interpolation (linear, polynomial, and cubic spline), and lagged-variable methods used to transform the raw data into datasets ready for analysis. To determine which of the five data processing methods best suits capital market analysis, the article then fits both linear and polynomial regression models to each preprocessed dataset and evaluates their performance using a range of metrics, including cross-validation score, MSE, MAE, RMSE, R-squared, and Adjusted R-squared. The findings show that linear interpolation combined with polynomial regression performs best, with the lowest validation MSE and MAE values and the highest R-squared and Adjusted R-squared values.
format Preprint
id arxiv_https___arxiv_org_abs_2501_06587
institution arXiv
publishDate 2025
record_format arxiv
title Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices
topic Econometrics
url https://arxiv.org/abs/2501.06587
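The abstract names linear interpolation, lagged variables, and a set of regression metrics without showing how they are computed. The following is a minimal sketch of those three ideas in plain Python, not the authors' implementation: the function names (linear_interpolate_daily, add_lag, regression_metrics) are hypothetical, the interpolation fills every calendar day rather than trading days only, and any numbers passed in are synthetic placeholders rather than Apple Inc. figures.

```python
from datetime import date, timedelta


def linear_interpolate_daily(quarterly):
    """Expand (date, value) points to one value per calendar day.

    Values between two consecutive points lie on the straight line
    joining them. Assumption: calendar days, not trading days.
    """
    points = sorted(quarterly)
    daily = {}
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            daily[d0 + timedelta(days=offset)] = v0 + (v1 - v0) * offset / span
    # The loop stops one day short of the final point, so add it here.
    last_day, last_value = points[-1]
    daily[last_day] = last_value
    return daily


def add_lag(series, lag):
    """Pair each value with the value observed `lag` steps earlier."""
    return [(series[i], series[i - lag]) for i in range(lag, len(series))]


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, MAE, RMSE, R-squared and Adjusted R-squared."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot
    # Adjusted R-squared penalises R-squared for the number of predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {"mse": mse, "mae": mae, "rmse": mse ** 0.5, "r2": r2, "adj_r2": adj_r2}
```

Under these assumptions, a quarterly series would be passed to linear_interpolate_daily as a list of (date, value) pairs, aligned with daily prices by date, and the predictions of any fitted model scored with regression_metrics. Polynomial or cubic spline interpolation would replace only the straight-line formula inside the loop.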