Bibliographic Details
Main Authors: Alet, Ferran, Gehring, Clement, Lozano-Pérez, Tomás, Kawaguchi, Kenji, Tenenbaum, Joshua B., Kaelbling, Leslie Pack
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2412.21149
author Alet, Ferran
Gehring, Clement
Lozano-Pérez, Tomás
Kawaguchi, Kenji
Tenenbaum, Joshua B.
Kaelbling, Leslie Pack
contents The field of Machine Learning has changed significantly since the 1970s. However, its most basic principle, Empirical Risk Minimization (ERM), remains unchanged. We propose Functional Risk Minimization (FRM), a general framework where losses compare functions rather than outputs. This results in better performance in supervised, unsupervised, and RL experiments. In the FRM paradigm, for each data point $(x_i,y_i)$ there is a function $f_{θ_i}$ that fits it: $y_i = f_{θ_i}(x_i)$. This allows FRM to subsume ERM for many common loss functions and to capture more realistic noise processes. We also show that FRM provides an avenue towards understanding generalization in the modern over-parameterized regime, as its objective can be framed as finding the simplest model that fits the training data.
format Preprint
id arxiv_https___arxiv_org_abs_2412_21149
institution arXiv
publishDate 2024
record_format arxiv
title Functional Risk Minimization
topic Machine Learning
url https://arxiv.org/abs/2412.21149
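The per-point fit described in the abstract, $y_i = f_{θ_i}(x_i)$, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a linear model $f_θ(x) = θ \cdot x$ and, as a purely illustrative choice, picks for each data point the parameters closest to a shared $θ$ in Euclidean distance that fit that point exactly. The names `per_point_params` and `predict` are hypothetical.

```python
def predict(theta, x):
    """Linear model f_theta(x) = theta . x (an assumption made for this sketch)."""
    return sum(t * xi for t, xi in zip(theta, x))


def per_point_params(theta, x, y):
    """Return the parameters nearest to theta (Euclidean norm) with f(x) == y.

    For a linear model this is the standard minimum-norm correction:
    theta_i = theta + (y - theta . x) * x / ||x||^2.
    """
    sq_norm = sum(xi * xi for xi in x)
    if sq_norm == 0.0:
        raise ValueError("x must be non-zero for an exact fit to exist")
    scale = (y - predict(theta, x)) / sq_norm
    return [t + scale * xi for t, xi in zip(theta, x)]


# Shared parameters and a toy data set (illustrative values only).
theta = [1.0, 0.0]
data = [([1.0, 2.0], 3.0), ([2.0, 0.0], 2.0)]

# One parameter vector per data point, each fitting its own point exactly.
thetas = [per_point_params(theta, x, y) for x, y in data]
```

In this sketch the second point is already fit by the shared `theta`, so its per-point parameters are unchanged, while the first point needs a small shift. How FRM actually compares $f_{θ_i}$ with $f_θ$ is defined in the paper, not here.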