| Main Authors: | Cuonzo, Simone; Deliu, Nina |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.09850 |
Similar Items
Artificial Intelligence-based Decision Support Systems for Precision and Digital Health
by: Deliu, Nina, et al.
Published: (2024)
Reinforcement Learning in Modern Biostatistics: Constructing Optimal Adaptive Interventions
by: Deliu, Nina, et al.
Published: (2022)
Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study
by: Liu, Xueqing, et al.
Published: (2023)
The Interplay between Bayesian Inference and Conformal Prediction
by: Deliu, Nina, et al.
Published: (2025)
Conformal-Style Quantile Analyses for Stochastic Bandits
by: Du, Chengyu, et al.
Published: (2026)
Stochastic Online Conformal Prediction with Semi-Bandit Feedback
by: Ge, Haosen, et al.
Published: (2024)
Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation
by: He, Jianliang, et al.
Published: (2024)
Using Adaptive Bandit Experiments to Increase and Investigate Engagement in Mental Health
by: Kumar, Harsh, et al.
Published: (2023)
Online Conformal Abstention for Factuality Control Under Adversarial Bandit Feedback
by: Lee, Minjae, et al.
Published: (2025)
Preference-centric Bandits: Optimality of Mixtures and Regret-efficient Algorithms
by: Tatlı, Meltem, et al.
Published: (2025)
Unified theory of upper confidence bound policies for bandit problems targeting total reward, maximal reward, and more
by: Kikkawa, Nobuaki, et al.
Published: (2024)
Risk-sensitive Bandits: Arm Mixture Optimality and Regret-efficient Algorithms
by: Tatlı, Meltem, et al.
Published: (2025)
Improving the statistical efficiency of cross-conformal prediction
by: Gasparin, Matteo, et al.
Published: (2025)
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
by: Liu, Shih-Yang, et al.
Published: (2026)
Self-rewarding correction for mathematical reasoning
by: Xiong, Wei, et al.
Published: (2025)
Noise-based reward-modulated learning
by: Fernández, Jesús García, et al.
Published: (2025)
Active teacher selection for reward learning
by: Freedman, Rachel, et al.
Published: (2023)
Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards
by: Mussi, Marco, et al.
Published: (2024)
Trading off rewards and errors in multi-armed bandits
by: Erraqabi, Akram, et al.
Published: (2026)
Meta Flow Maps enable scalable reward alignment
by: Potaptchik, Peter, et al.
Published: (2026)
A statistical perspective on transformers for small longitudinal cohort data
by: Farhadyar, Kiana, et al.
Published: (2026)
Bringing Federated Learning to Space
by: Kim, Grace, et al.
Published: (2025)
Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization
by: Li, Tong, et al.
Published: (2021)
The impact of intrinsic rewards on exploration in Reinforcement Learning
by: Kayal, Aya, et al.
Published: (2025)
LIRE: listwise reward enhancement for preference alignment
by: Zhu, Mingye, et al.
Published: (2024)
Optimistic Q-learning for average reward and episodic reinforcement learning
by: Agrawal, Priyank, et al.
Published: (2024)
Covariance-adapting algorithm for semi-bandits with application to sparse rewards
by: Perrault, Pierre, et al.
Published: (2026)
Scale-free adaptive planning for deterministic dynamics & discounted rewards
by: Bartlett, Peter L., et al.
Published: (2026)
Continuously evolving rewards in an open-ended environment
by: Bailey, Richard M.
Published: (2024)
Generalized Kernelized Bandits: A Novel Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds
by: Metelli, Alberto Maria, et al.
Published: (2025)
EVAL: EigenVector-based Average-reward Learning
by: Adamczyk, Jacob, et al.
Published: (2025)
Streaming Looking Ahead with Token-level Self-reward
by: Zhang, Hongming, et al.
Published: (2025)
Physics-based reward driven image analysis in microscopy
by: Barakati, Kamyar, et al.
Published: (2024)
Episodic Reinforcement Learning with Expanded State-reward Space
by: Liang, Dayang, et al.
Published: (2024)
Machine learning-based optimization workflow of the homogeneity of spunbond nonwovens with human validation
by: Victor, Viny Saajan, et al.
Published: (2024)
Conformal Prediction: a Unified Review of Theory and New Challenges
by: Fontana, Matteo, et al.
Published: (2020)
Leveraging heterogeneous spillover in maximizing contextual bandit rewards
by: Faruk, Ahmed Sayeed, et al.
Published: (2023)
BanditQ: Fair Bandits with Guaranteed Rewards
by: Sinha, Abhishek
Published: (2023)
Transformer models as an efficient replacement for statistical test suites to evaluate the quality of random numbers
by: Goel, Rishabh, et al.
Published: (2024)
TractOracle: towards an anatomically-informed reward function for RL-based tractography
by: Théberge, Antoine, et al.
Published: (2024)