| Main Author: | Shi, Chengchun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.16195 |
Similar Items
Reinforcement Learning from Human Feedback: A Statistical Perspective
by: Liu, Pangpang, et al.
Published: (2026)
Counterfactually Safe Reinforcement Learning
by: Li, Jingyi, et al.
Published: (2026)
Dual Active Learning for Reinforcement Learning from Human Feedback
by: Liu, Pangpang, et al.
Published: (2024)
Sequential Knockoffs for Variable Selection in Reinforcement Learning
by: Ma, Tao, et al.
Published: (2023)
Testing Stationarity and Change Point Detection in Reinforcement Learning
by: Li, Mengbing, et al.
Published: (2022)
Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data
by: Wang, Danyang, et al.
Published: (2024)
Doubly Inhomogeneous Reinforcement Learning
by: Hu, Liyuan, et al.
Published: (2022)
Designing Time Series Experiments in A/B Testing with Transformer Reinforcement Learning
by: Wu, Xiangkun, et al.
Published: (2026)
Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings
by: Shi, C., et al.
Published: (2020)
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
by: Yu, Shuguang, et al.
Published: (2024)
Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning
by: Ye, Kai, et al.
Published: (2025)
Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning
by: Gong, Shijin, et al.
Published: (2026)
From Authors to Reviewers: Leveraging Rankings to Improve Peer Review
by: Wang, Weichen, et al.
Published: (2025)
Semi-pessimistic Reinforcement Learning
by: Zhu, Jin, et al.
Published: (2025)
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
by: Zhu, Jin, et al.
Published: (2023)
Double Fairness Policy Learning: Integrating Action Fairness and Outcome Fairness in Decision-making
by: Bian, Zeyu, et al.
Published: (2026)
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing
by: Wang, Jitao, et al.
Published: (2025)
Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic
by: Zhou, Hongyi, et al.
Published: (2026)
Statistical Test for Feature Selection Pipelines by Selective Inference
by: Shiraishi, Tomohiro, et al.
Published: (2024)
Statistical Inference with Limited Memory: A Survey
by: Berg, Tomer, et al.
Published: (2023)
ReDiF: Reinforced Distillation for Few Step Diffusion
by: Tighkhorshid, Amirhossein, et al.
Published: (2025)
Infrared Spectra Prediction for Diazo Groups Utilizing a Machine Learning Approach with Structural Attention Mechanism
by: Liu, Chengchun, et al.
Published: (2024)
Generalized Fitted Q-Iteration with Clustered Data
by: Hu, Liyuan, et al.
Published: (2025)
Statistical Reinforcement Learning in the Real World: A Survey of Challenges and Future Directions
by: Gazi, Asim H., et al.
Published: (2026)
Statistical Testing Framework for Clustering Pipelines by Selective Inference
by: Miyata, Yugo, et al.
Published: (2026)
Statistical Test for Auto Feature Engineering by Selective Inference
by: Matsukawa, Tatsuya, et al.
Published: (2024)
Off-policy Evaluation in Doubly Inhomogeneous Environments
by: Bian, Zeyu, et al.
Published: (2023)
A Two-armed Bandit Framework for A/B Testing
by: Wang, Jinjuan, et al.
Published: (2025)
Perturbation is All You Need for Extrapolating Language Models
by: Cen, Zetai, et al.
Published: (2026)
Learning Perturbations to Extrapolate Your LLM
by: Cen, Zetai, et al.
Published: (2026)
Statistical Inference for Sequential Feature Selection after Domain Adaptation
by: Loc, Duong Tan, et al.
Published: (2025)
Deep Distributional Learning with Non-crossing Quantile Network
by: Shen, Guohao, et al.
Published: (2025)
Learning U-Statistics with Active Inference
by: Wang, Xiaoning, et al.
Published: (2026)
Detecting LLM-Generated Text with Performance Guarantees
by: Zhou, Hongyi, et al.
Published: (2026)
Unraveling the Interplay between Carryover Effects and Reward Autocorrelations in Switchback Experiments
by: Wen, Qianglin, et al.
Published: (2024)
Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text
by: Zhou, Hongyi, et al.
Published: (2026)
Overcoming Selection Bias in Statistical Studies With Amortized Bayesian Inference
by: Arruda, Jonas, et al.
Published: (2026)
Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning
by: Behnamnia, Armin, et al.
Published: (2025)
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees
by: Zhou, Hongyi, et al.
Published: (2025)
A Survey of In-Context Reinforcement Learning
by: Moeini, Amir, et al.
Published: (2025)