| Main Authors: | Mukherjee, Dibyangshu; Kalyanakrishnan, Shivaram |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.18252 |
Similar Items
Howard's Policy Iteration is Subexponential for Deterministic Markov Decision Problems with Rewards of Fixed Bit-size and Arbitrary Discount Factor
by: Mukherjee, Dibyangshu, et al.
Published: (2025)
On-line Learning in Tree MDPs by Treating Policies as Bandit Arms
by: Shah, Anvay, et al.
Published: (2026)
Scaling Inference-Efficient Language Models
by: Bian, Song, et al.
Published: (2025)
A View of the Certainty-Equivalence Method for PAC RL as an Application of the Trajectory Tree Method
by: Kalyanakrishnan, Shivaram, et al.
Published: (2025)
Rao-Blackwellized POMDP Planning
by: Lee, Jiho, et al.
Published: (2024)
Tesserae: Scalable Placement Policies for Deep Learning Workloads
by: Bian, Song, et al.
Published: (2025)
Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
by: Bian, Song, et al.
Published: (2025)
Performative Policy Gradient: Optimality in Performative Reinforcement Learning
by: Basu, Debabrota, et al.
Published: (2025)
Using Common Random Numbers for Simulation-based Planning with Rollouts
by: Yadav, Sandarbh, et al.
Published: (2026)
Evaluating CUDA Tile for AI Workloads on Hopper and Blackwell GPUs
by: Yadav, Divakar Kumar, et al.
Published: (2026)
Learning Optimal and Sample-Efficient Decision Policies with Guarantees
by: Shao, Daqian
Published: (2026)
Learnable Game-theoretic Policy Optimization for Data-centric Self-explanation Rationalization
by: Zhao, Yunxiao, et al.
Published: (2025)
LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models
by: Chang, Tzu-Tao, et al.
Published: (2025)
What Limits Agentic Systems Efficiency?
by: Bian, Song, et al.
Published: (2025)
EcoAlign: An Economically Rational Framework for Efficient LVLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)
Implementing Rational Choice Functions with LLMs and Measuring their Alignment with User Preferences
by: Karnysheva, Anna, et al.
Published: (2025)
ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition
by: Zhao, Muyang, et al.
Published: (2026)
PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
by: Ockerman, Seth, et al.
Published: (2025)
Step-level Optimization for Efficient Computer-use Agents
by: Wei, Jinbiao, et al.
Published: (2026)
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
by: Chen, Hao Mark, et al.
Published: (2025)
Test-time Verification via Optimal Transport: Coverage, ROC, & Sub-optimality
by: Mukherjee, Arpan, et al.
Published: (2025)
Latent Modulated Function for Computational Optimal Continuous Image Representation
by: He, Zongyao, et al.
Published: (2024)
Rationality Check! Benchmarking the Rationality of Large Language Models
by: Zhou, Zhilun, et al.
Published: (2025)
Uncovering the Computational Ingredients of Human-Like Representations in LLMs
by: Studdiford, Zach, et al.
Published: (2025)
Partial Policy Gradients for RL in LLMs
by: Mathur, Puneet, et al.
Published: (2026)
POETS: Uncertainty-Aware LLM Optimization via Compute-Efficient Policy Ensembles
by: Menet, Nicolas, et al.
Published: (2026)
Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning
by: Liu, Zhishuai, et al.
Published: (2024)
Private LLM Inference on Consumer Blackwell GPUs: A Practical Guide for Cost-Effective Local Deployment in SMEs
by: Knoop, Jonathan, et al.
Published: (2026)
Optimal Policy Minimum Bayesian Risk
by: Astudillo, Ramón Fernandez, et al.
Published: (2025)
RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library
by: Wang, Jiapeng, et al.
Published: (2025)
Value Functions for Temporal Logic: Optimal Policies and Safety Filters
by: So, Oswin, et al.
Published: (2026)
Functional Critics Are Essential for Actor-Critic: From Off-Policy Stability to Efficient Exploration
by: Bai, Qinxun, et al.
Published: (2025)
The AI Policy Module: Developing Computer Science Student Competency in AI Ethics and Policy
by: Weichert, James, et al.
Published: (2025)
Are Protein Language Models Compute Optimal?
by: Serrano, Yaiza, et al.
Published: (2024)
RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning
by: Xu, Song, et al.
Published: (2025)
The Function-Representation Model of Computation
by: Ibias, Alfredo, et al.
Published: (2024)
An Optimistic-Robust Approach for Dynamic Positioning of Omnichannel Inventories
by: Harsha, Pavithra, et al.
Published: (2023)
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation
by: Zhao, Runze, et al.
Published: (2025)
Tabula: Efficiently Computing Nonlinear Activation Functions for Secure Neural Network Inference
by: Lam, Maximilian, et al.
Published: (2022)
Rational Inverse Reasoning
by: Zandonati, Ben, et al.
Published: (2025)