Bibliographic Details
Main Author: Salako, Joshua
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2601.03466
_version_ 1866918285445955584
author Salako, Joshua
author_facet Salako, Joshua
contents Scalability and data sparsity remain critical bottlenecks for collaborative filtering on massive interaction datasets. This work investigates the latent geometry of user preferences using the MovieLens 32M dataset, implementing a high-performance, parallelized Alternating Least Squares (ALS) framework. Through extensive hyperparameter optimization, we demonstrate that constrained low-rank models significantly outperform higher-dimensional counterparts in generalization, achieving an optimal balance between Root Mean Square Error (RMSE) and ranking precision. We visualize the learned embedding space to reveal the unsupervised emergence of semantic genre clusters, confirming that the model captures deep structural relationships solely from interaction data. Finally, we validate the system's practical utility in a cold-start scenario, introducing a tunable scoring parameter to manage the trade-off between popularity bias and personalized affinity effectively. The codebase for this research can be found here: https://github.com/joshsalako/recommender.git
format Preprint
id arxiv_https___arxiv_org_abs_2601_03466
institution arXiv
publishDate 2026
record_format arxiv
title Latent Geometry of Taste: Scalable Low-Rank Matrix Factorization for Recommender Systems
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2601.03466