Library Catalog

Bibliographic Details
Main Authors:	Cuonzo, Simone; Deliu, Nina
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2512.09850

Similar Items

Artificial Intelligence-based Decision Support Systems for Precision and Digital Health
by: Deliu, Nina, et al.
Published: (2024)

Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions
by: Deliu, Nina, et al.
Published: (2022)

Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study
by: Liu, Xueqing, et al.
Published: (2023)

The Interplay between Bayesian Inference and Conformal Prediction
by: Deliu, Nina, et al.
Published: (2025)

Conformal-Style Quantile Analyses for Stochastic Bandits
by: Du, Chengyu, et al.
Published: (2026)

Stochastic Online Conformal Prediction with Semi-Bandit Feedback
by: Ge, Haosen, et al.
Published: (2024)

Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
by: He, Jianliang, et al.
Published: (2024)

Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health
by: Kumar, Harsh, et al.
Published: (2023)

Online Conformal Abstention for Factuality Control Under Adversarial Bandit Feedback
by: Lee, Minjae, et al.
Published: (2025)

Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms
by: Tatlı, Meltem, et al.
Published: (2025)

Unified theory of upper confidence bound policies for bandit problems targeting total reward, maximal reward, and more
by: Kikkawa, Nobuaki, et al.
Published: (2024)

Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms
by: Tatlı, Meltem, et al.
Published: (2025)

Improving the statistical efficiency of cross-conformal prediction
by: Gasparin, Matteo, et al.
Published: (2025)

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
by: Liu, Shih-Yang, et al.
Published: (2026)

Self-rewarding correction for mathematical reasoning
by: Xiong, Wei, et al.
Published: (2025)

Noise-based reward-modulated learning
by: Fernández, Jesús García, et al.
Published: (2025)

Active teacher selection for reward learning
by: Freedman, Rachel, et al.
Published: (2023)

Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards
by: Mussi, Marco, et al.
Published: (2024)

Trading off rewards and errors in multi-armed bandits
by: Erraqabi, Akram, et al.
Published: (2026)

Meta Flow Maps enable scalable reward alignment
by: Potaptchik, Peter, et al.
Published: (2026)

A statistical perspective on transformers for small longitudinal cohort data
by: Farhadyar, Kiana, et al.
Published: (2026)

Bringing Federated Learning to Space
by: Kim, Grace, et al.
Published: (2025)

Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization
by: Li, Tong, et al.
Published: (2021)

The impact of intrinsic rewards on exploration in Reinforcement Learning
by: Kayal, Aya, et al.
Published: (2025)

LIRE: listwise reward enhancement for preference alignment
by: Zhu, Mingye, et al.
Published: (2024)

Optimistic Q-learning for average reward and episodic reinforcement learning
by: Agrawal, Priyank, et al.
Published: (2024)

Covariance-adapting algorithm for semi-bandits with application to sparse rewards
by: Perrault, Pierre, et al.
Published: (2026)

Scale-free adaptive planning for deterministic dynamics & discounted rewards
by: Bartlett, Peter L., et al.
Published: (2026)

Continuously evolving rewards in an open-ended environment
by: Bailey, Richard M.
Published: (2024)

Generalized Kernelized Bandits: A Novel Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds
by: Metelli, Alberto Maria, et al.
Published: (2025)

EVAL: EigenVector-based Average-reward Learning
by: Adamczyk, Jacob, et al.
Published: (2025)

Streaming Looking Ahead with Token-level Self-reward
by: Zhang, Hongming, et al.
Published: (2025)

Physics-based reward driven image analysis in microscopy
by: Barakati, Kamyar, et al.
Published: (2024)

Episodic Reinforcement Learning with Expanded State-reward Space
by: Liang, Dayang, et al.
Published: (2024)

Machine learning-based optimization workflow of the homogeneity of spunbond nonwovens with human validation
by: Victor, Viny Saajan, et al.
Published: (2024)

Conformal Prediction: a Unified Review of Theory and New Challenges
by: Fontana, Matteo, et al.
Published: (2020)

Leveraging heterogeneous spillover in maximizing contextual bandit rewards
by: Faruk, Ahmed Sayeed, et al.
Published: (2023)

BanditQ: Fair Bandits with Guaranteed Rewards
by: Sinha, Abhishek
Published: (2023)

Transformer models as an efficient replacement for statistical test suites to evaluate the quality of random numbers
by: Goel, Rishabh, et al.
Published: (2024)

TractOracle: towards an anatomically-informed reward function for RL-based tractography
by: Théberge, Antoine, et al.
Published: (2024)