| Main Authors: | Sriraman, Ved, Block, Adam |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.05739 |
Similar Items
Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
by: Huang, Audrey, et al.
Published: (2025)
Best-of-Tails: Bridging Optimism and Pessimism in Inference-Time Alignment
by: Hsu, Hsiang, et al.
Published: (2026)
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
by: Qiu, Jiahao, et al.
Published: (2024)
Variational Best-of-N Alignment
by: Amini, Afra, et al.
Published: (2024)
EMA Without the Lag: Bias-Corrected Iterate Averaging Schemes
by: Block, Adam, et al.
Published: (2025)
AdaBoN: Adaptive Best-of-N Alignment
by: Raman, Vinod, et al.
Published: (2025)
Real-Time Device Reach Forecasting Using HLL and MinHash Data Sketches
by: Muniyappa, Chandrashekar, et al.
Published: (2025)
Learnable Chernoff Baselines for Inference-Time Alignment
by: Madhow, Sunil, et al.
Published: (2026)
Harnesses for Inference-Time Alignment over Execution Trajectories
by: Wang, Boyuan, et al.
Published: (2026)
Dynamic Search for Inference-Time Alignment in Diffusion Models
by: Li, Xiner, et al.
Published: (2025)
DAC-LoRA: Dynamic Adversarial Curriculum for Efficient and Robust Few-Shot Adaptation
by: Umrajkar, Ved
Published: (2025)
Inference-Time Alignment of Diffusion Models with Direct Noise Optimization
by: Tang, Zhiwei, et al.
Published: (2024)
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
by: Chow, Yinlam, et al.
Published: (2024)
Compute Aligned Training: Optimizing for Test Time Inference
by: Ousherovitch, Adam, et al.
Published: (2026)
Reward Shaping for Inference-Time Alignment: A Stackelberg Game Perspective
by: Wang, Haichuan, et al.
Published: (2026)
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
by: Jin, Luozhijie, et al.
Published: (2025)
Best-of-N Jailbreaking
by: Hughes, John, et al.
Published: (2024)
MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking
by: Zhao, Yizhou, et al.
Published: (2025)
Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning
by: Foster, Dylan J., et al.
Published: (2024)
RoBoN: Routed Online Best-of-n for Test-Time Scaling with Multiple LLMs
by: Geuter, Jonathan, et al.
Published: (2025)
Revisiting the Superficial Alignment Hypothesis
by: Raghavendra, Mohit, et al.
Published: (2024)
Majority of the Bests: Improving Best-of-N via Bootstrapping
by: Rakhsha, Amin, et al.
Published: (2025)
GaussMark: A Practical Approach for Structural Watermarking of Language Models
by: Block, Adam, et al.
Published: (2025)
Best of mini-N in-loop Sampling: A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
by: Rho, Hyung Gyu, et al.
Published: (2025)
Active Learning via Regression Beyond Realizability
by: Ganju, Atul, et al.
Published: (2025)
CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning
by: Tang, Yung-Chen, et al.
Published: (2025)
Experience-Guided Adaptation of Inference-Time Reasoning Strategies
by: Stein, Adam, et al.
Published: (2025)
Learning Generative Selection for Best-of-N
by: Toshniwal, Shubham, et al.
Published: (2026)
Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
by: Kanai, Sekitoshi, et al.
Published: (2025)
PITA: Preference-Guided Inference-Time Alignment for LLM Post-Training
by: Bobbili, Sarat Chandra, et al.
Published: (2025)
Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences?
by: Gu, Zhuojun, et al.
Published: (2025)
Best-of-$\infty$ -- Asymptotic Performance of Test-Time LLM Ensembling
by: Komiyama, Junpei, et al.
Published: (2025)
STEB: In Search of the Best Evaluation Approach for Synthetic Time Series
by: Stenger, Michael, et al.
Published: (2025)
MarkovScale: Towards Optimal Sequential Scaling at Inference Time
by: Wang, Youkang, et al.
Published: (2026)
BOND: Aligning LLMs with Best-of-N Distillation
by: Sessa, Pier Giuseppe, et al.
Published: (2024)
GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
by: Yang, Ziwei, et al.
Published: (2024)
Temper and Tilt Lead to SLOP: Reward Hacking Mitigation with Inference-Time Alignment
by: Wang, Ye, et al.
Published: (2026)
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
by: Krishna, Kundan, et al.
Published: (2025)
Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review
by: Uehara, Masatoshi, et al.
Published: (2025)
Optimal Multi-Objective Best Arm Identification with Fixed Confidence
by: Chen, Zhirui, et al.
Published: (2025)