:: Library Catalog

Bibliographic Details
Main Authors:	Sriraman, Ved, Block, Adam
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.05739

Similar Items

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
by: Huang, Audrey, et al.
Published: (2025)

Best-of-Tails: Bridging Optimism and Pessimism in Inference-Time Alignment
by: Hsu, Hsiang, et al.
Published: (2026)

TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
by: Qiu, Jiahao, et al.
Published: (2024)

Variational Best-of-N Alignment
by: Amini, Afra, et al.
Published: (2024)

EMA Without the Lag: Bias-Corrected Iterate Averaging Schemes
by: Block, Adam, et al.
Published: (2025)

AdaBoN: Adaptive Best-of-N Alignment
by: Raman, Vinod, et al.
Published: (2025)

Real-Time Device Reach Forecasting Using HLL and MinHash Data Sketches
by: Muniyappa, Chandrashekar, et al.
Published: (2025)

Learnable Chernoff Baselines for Inference-Time Alignment
by: Madhow, Sunil, et al.
Published: (2026)

Harnesses for Inference-Time Alignment over Execution Trajectories
by: Wang, Boyuan, et al.
Published: (2026)

Dynamic Search for Inference-Time Alignment in Diffusion Models
by: Li, Xiner, et al.
Published: (2025)

DAC-LoRA: Dynamic Adversarial Curriculum for Efficient and Robust Few-Shot Adaptation
by: Umrajkar, Ved
Published: (2025)

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization
by: Tang, Zhiwei, et al.
Published: (2024)

Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
by: Chow, Yinlam, et al.
Published: (2024)

Compute Aligned Training: Optimizing for Test Time Inference
by: Ousherovitch, Adam, et al.
Published: (2026)

Reward Shaping for Inference-Time Alignment: A Stackelberg Game Perspective
by: Wang, Haichuan, et al.
Published: (2026)

Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
by: Jin, Luozhijie, et al.
Published: (2025)

Best-of-N Jailbreaking
by: Hughes, John, et al.
Published: (2024)

MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking
by: Zhao, Yizhou, et al.
Published: (2025)

Is Behavior Cloning All You Need? Understanding Horizon in Imitation Learning
by: Foster, Dylan J., et al.
Published: (2024)

RoBoN: Routed Online Best-of-n for Test-Time Scaling with Multiple LLMs
by: Geuter, Jonathan, et al.
Published: (2025)

Revisiting the Superficial Alignment Hypothesis
by: Raghavendra, Mohit, et al.
Published: (2024)

Majority of the Bests: Improving Best-of-N via Bootstrapping
by: Rakhsha, Amin, et al.
Published: (2025)

GaussMark: A Practical Approach for Structural Watermarking of Language Models
by: Block, Adam, et al.
Published: (2025)

Best of mini-N in-loop Sampling: A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
by: Rho, Hyung Gyu, et al.
Published: (2025)

Active Learning via Regression Beyond Realizability
by: Ganju, Atul, et al.
Published: (2025)

CarBoN: Calibrated Best-of-N Sampling Improves Test-time Reasoning
by: Tang, Yung-Chen, et al.
Published: (2025)

Experience-Guided Adaptation of Inference-Time Reasoning Strategies
by: Stein, Adam, et al.
Published: (2025)

Learning Generative Selection for Best-of-N
by: Toshniwal, Shubham, et al.
Published: (2026)

Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
by: Kanai, Sekitoshi, et al.
Published: (2025)

PITA: Preference-Guided Inference-Time Alignment for LLM Post-Training
by: Bobbili, Sarat Chandra, et al.
Published: (2025)

Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences?
by: Gu, Zhuojun, et al.
Published: (2025)

Best-of-$\infty$ -- Asymptotic Performance of Test-Time LLM Ensembling
by: Komiyama, Junpei, et al.
Published: (2025)

STEB: In Search of the Best Evaluation Approach for Synthetic Time Series
by: Stenger, Michael, et al.
Published: (2025)

MarkovScale: Towards Optimal Sequential Scaling at Inference Time
by: Wang, Youkang, et al.
Published: (2026)

BOND: Aligning LLMs with Best-of-N Distillation
by: Sessa, Pier Giuseppe, et al.
Published: (2024)

GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation
by: Yang, Ziwei, et al.
Published: (2024)

Temper and Tilt Lead to SLOP: Reward Hacking Mitigation with Inference-Time Alignment
by: Wang, Ye, et al.
Published: (2026)

Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
by: Krishna, Kundan, et al.
Published: (2025)

Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review
by: Uehara, Masatoshi, et al.
Published: (2025)

Optimal Multi-Objective Best Arm Identification with Fixed Confidence
by: Chen, Zhirui, et al.
Published: (2025)