Bibliographic Details
Main Authors: Sojo, Rafael; Díaz-Rozo, Javier; Bielza, Concha; Larrañaga, Pedro
Format: Preprint
Published: 2025
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2506.21997
_version_ 1866918422176071680
author Sojo, Rafael
Díaz-Rozo, Javier
Bielza, Concha
Larrañaga, Pedro
contents This paper introduces a new type of probabilistic semiparametric model that takes advantage of data binning to reduce the computational cost of kernel density estimation in nonparametric distributions. Two new conditional probability distributions are developed for the new binned semiparametric Bayesian networks, the sparse binned kernel density estimation and the Fourier kernel density estimation. These two probability distributions address the curse of dimensionality, which typically impacts binned models, by using sparse tensors and restricting the number of parent nodes in conditional probability calculations. To evaluate the proposal, we perform a complexity analysis and conduct several comparative experiments using synthetic data and datasets from the UCI Machine Learning repository. The experiments include different binning rules, parent restrictions, grid sizes, and number of instances to get a holistic view of the model's behavior. As a result, our binned semiparametric Bayesian networks achieve structural learning and log-likelihood estimations with no statistically significant differences compared to the semiparametric Bayesian networks, but at a much higher speed. Thus, the new binned semiparametric Bayesian networks prove to be a reliable and more efficient alternative to their non-binned counterparts.
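The abstract's central idea, replacing per-sample kernel evaluations with a convolution over bin counts on a regular grid, can be illustrated with a minimal one-dimensional sketch. This is a generic binned Gaussian KDE computed with an FFT, not the authors' sparse binned or Fourier conditional distributions; the function names (`exact_kde`, `binned_kde`), the nearest-bin rule, the grid size, the bandwidth, and the padding are all assumptions chosen for illustration.

```python
import numpy as np


def exact_kde(x_eval, data, h):
    """Direct Gaussian KDE: one kernel evaluation per (evaluation point, sample) pair."""
    z = (x_eval[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))


def binned_kde(data, h, grid_size=512, pad=4.0):
    """Binned Gaussian KDE on a regular grid (illustrative, nearest-bin rule).

    Each sample is replaced by the center of its bin, so the density at the
    grid points becomes a discrete convolution of bin counts with the kernel.
    """
    lo, hi = data.min() - pad * h, data.max() + pad * h
    edges = np.linspace(lo, hi, grid_size + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    delta = edges[1] - edges[0]

    # Bin counts: this is the only step that touches every sample.
    counts, _ = np.histogram(data, bins=edges)

    # Kernel evaluated at every possible offset between two grid points.
    offsets = np.arange(-(grid_size - 1), grid_size) * delta
    kernel = np.exp(-0.5 * (offsets / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

    # Linear convolution via zero-padded FFT.
    n = len(counts) + len(kernel) - 1
    conv = np.fft.irfft(np.fft.rfft(counts, n) * np.fft.rfft(kernel, n), n)

    # Entry m + grid_size - 1 of the convolution is the sum at grid point m.
    density = conv[grid_size - 1 : 2 * grid_size - 1] / len(data)
    return centers, density


# Compare the binned estimate with the direct one on synthetic data.
rng = np.random.default_rng(0)
data = rng.normal(size=2000)
centers, approx = binned_kde(data, h=0.3)
reference = exact_kde(centers, data, h=0.3)
max_abs_error = float(np.max(np.abs(approx - reference)))
mass = float(approx.sum() * (centers[1] - centers[0]))
```

In this sketch the cost of the binned estimate depends on the grid size rather than on the product of evaluation points and samples, which is the trade-off the abstract refers to. In several dimensions the grid grows exponentially with the number of variables, which is the curse of dimensionality the paper addresses with sparse tensors and parent restrictions.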
format Preprint
id arxiv_https___arxiv_org_abs_2506_21997
institution arXiv
publishDate 2025
record_format arxiv
title Binned semiparametric Bayesian networks for efficient kernel density estimation
topic Machine Learning
Artificial Intelligence
I.2.6; I.5.1; G.3
url https://arxiv.org/abs/2506.21997