| Main Authors: | Ebihara, Akinori F., Miyagawa, Taiki, Sakurai, Kazuyuki, Imaoka, Hitoshi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.18059 |
Similar Items
Accurate Evaluation of Quickest Changepoint Detectors via Non-parametric Survival Analysis
by: Miyagawa, Taiki, et al.
Published: (2026)
Rethinking the Backbone in Class Imbalanced Federated Source Free Domain Adaptation: The Utility of Vision Foundation Models
by: Kihara, Kosuke, et al.
Published: (2025)
Federated Source-free Domain Adaptation for Classification: Weighted Cluster Aggregation for Unlabeled Data
by: Mori, Junki, et al.
Published: (2024)
ConSol: Sequential Probability Ratio Testing to Find Consistent LLM Reasoning Paths Efficiently
by: Lee, Jaeyeon, et al.
Published: (2025)
Prompt Tuning for Audio Deepfake Detection: Computationally Efficient Test-time Domain Adaptation with Limited Target Dataset
by: Oiso, Hideyuki, et al.
Published: (2024)
Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees
by: Miyagawa, Taiki, et al.
Published: (2024)
TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning
by: Nagle, Alliot, et al.
Published: (2026)
Statistical Early Stopping for Reasoning Models
by: Xie, Yangxinyu, et al.
Published: (2026)
Rethinking Early Stopping: Refine, Then Calibrate
by: Berta, Eugène, et al.
Published: (2025)
Know When to Abstain: Optimal Selective Classification with Likelihood Ratios
by: Heng, Alvin, et al.
Published: (2025)
S2O: Early Stopping for Sparse Attention via Online Permutation
by: Zhang, Yu, et al.
Published: (2026)
Learning to Stop Overthinking at Test Time
by: Bao, Hieu Tran, et al.
Published: (2025)
ESPO: Early-Stopping Proximal Policy Optimization
by: Li, Zihang, et al.
Published: (2026)
Early Stopping for Large Reasoning Models via Confidence Dynamics
by: Hosseini, Parsa, et al.
Published: (2026)
BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning
by: Li, Yuan, et al.
Published: (2026)
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
by: Farebrother, Jesse, et al.
Published: (2024)
Random Policy Enables In-Context Reinforcement Learning within Trust Horizons
by: Chen, Weiqin, et al.
Published: (2024)
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation
by: Dai, Juntao, et al.
Published: (2024)
BEACON: Bayesian Optimal Stopping for Efficient LLM Sampling
by: Wan, Guangya, et al.
Published: (2025)
Don't Waste Your Time: Early Stopping Cross-Validation
by: Bergman, Edward, et al.
Published: (2024)
Optimal Bayesian Stopping for Efficient Inference of Consistent LLM Answers
by: Huang, Jingkai, et al.
Published: (2026)
Learning Intrusion Prevention Policies through Optimal Stopping
by: Hammar, Kim, et al.
Published: (2021)
Optimal Look-back Horizon for Time Series Forecasting in Federated Learning
by: Tang, Dahao, et al.
Published: (2025)
LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
by: Yang, Chenghao, et al.
Published: (2025)
Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning
by: Niu, Xuecheng, et al.
Published: (2024)
Thinking While Listening: Fast-Slow Recurrence for Long-Horizon Sequential Modeling
by: Takashiro, Shota, et al.
Published: (2026)
Intrusion Prevention through Optimal Stopping
by: Hammar, Kim, et al.
Published: (2021)
A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning
by: Yoshihara, Hiroshi, et al.
Published: (2025)
Robust Deepfake Detection for Electronic Know Your Customer Systems Using Registered Images
by: Amada, Takuma, et al.
Published: (2025)
Optimal Sequential Decision-Making in Geosteering: A Reinforcement Learning Approach
by: Muhammad, Ressi Bonti, et al.
Published: (2023)
Scaling Optimal LR Across Token Horizons
by: Bjorck, Johan, et al.
Published: (2024)
Deep Learning-Based Hypoglycemia Classification Across Multiple Prediction Horizons
by: Cinar, Beyza, et al.
Published: (2025)
Cost-optimal Sequential Testing via Doubly Robust Q-learning
by: Zhou, Doudou, et al.
Published: (2026)
The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL
by: Li, Yingru, et al.
Published: (2026)
Personalized Federated Learning via Sequential Layer Expansion in Representation Learning
by: Jang, Jaewon, et al.
Published: (2024)
Horizon Generalization in Reinforcement Learning
by: Myers, Vivek, et al.
Published: (2025)
Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation
by: Shao, Jie-Jing, et al.
Published: (2024)
Dynamic Angle Selection in X-Ray CT: A Reinforcement Learning Approach to Optimal Stopping
by: Wang, Tianyuan, et al.
Published: (2025)
Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context
by: Chaudhry, Faris, et al.
Published: (2026)
Can Machines Learn the True Probabilities?
by: Kim, Jinsook
Published: (2024)