| Main Authors: | Agarwal, Mridul, Aggarwal, Vaneet, Quinn, Christopher J., Umrawal, Abhishek |
|---|---|
| Format: | Preprint |
| Published: | 2020 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2011.07687 |
Similar Items
Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
by: Bai, Qinbo, et al.
Published: (2021)
Stochastic $k$-Submodular Bandits with Full Bandit Feedback
by: Nie, Guanyu, et al.
Published: (2024)
A Unified Approach for Maximizing Continuous DR-submodular Functions
by: Pedramfar, Mohammad, et al.
Published: (2023)
CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models
by: Liang, Yihao, et al.
Published: (2026)
Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)
A Resilience Framework for Bi-Criteria Combinatorial Optimization with Bandit Feedback
by: Aggarwal, Vaneet, et al.
Published: (2025)
Content filtering methods for music recommendation: A review
by: Zeng, Terence, et al.
Published: (2025)
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
by: Ganesh, Swetha, et al.
Published: (2025)
Regret Analysis of Unichain Average Reward Constrained MDPs with General Parameterization
by: Satheesh, Anirudh, et al.
Published: (2026)
Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning
by: Ganesh, Swetha, et al.
Published: (2026)
Improving Molecule Generation and Drug Discovery with a Knowledge-enhanced Generative Model
by: Malusare, Aditya, et al.
Published: (2024)
Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach
by: Xu, Yang, et al.
Published: (2025)
Persistent-Transient Policy Evaluation for Markov Chains via Minimal Peripheral Quotients
by: Xu, Yang, et al.
Published: (2026)
Sample Complexity Analysis for Constrained Bilevel Reinforcement Learning
by: Saxena, Naman, et al.
Published: (2026)
A Unified Framework for Analyzing Meta-algorithms in Online Convex Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)
$γ$-weakly $θ$-up-concavity: A Unified Framework for Non-Convex Optimization Beyond DR-Submodular and OSS Functions
by: Pedramfar, Mohammad, et al.
Published: (2026)
From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)
Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback
by: Pedramfar, Mohammad, et al.
Published: (2023)
Sample-Efficient Constrained Reinforcement Learning with General Parameterization
by: Mondal, Washim Uddin, et al.
Published: (2024)
Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
by: Mondal, Washim Uddin, et al.
Published: (2023)
Distributionally Robust Self Paced Curriculum Reinforcement Learning
by: Satheesh, Anirudh, et al.
Published: (2025)
Contrastive Cross-Modal Learning for Infusing Chest X-ray Knowledge into ECGs
by: Punyamoorty, Vineet, et al.
Published: (2025)
Hierarchical Deep Counterfactual Regret Minimization
by: Chen, Jiayu, et al.
Published: (2023)
Oracle-Robust Online Alignment for Large Language Models
by: Li, Zimeng, et al.
Published: (2026)
Variational Offline Multi-agent Skill Discovery
by: Chen, Jiayu, et al.
Published: (2024)
Discrete State Diffusion Models: A Sample Complexity Perspective
by: Srikanth, Aadithya, et al.
Published: (2025)
Efficient $Q$-Learning and Actor-Critic Methods for Robust Average Reward Reinforcement Learning
by: Xu, Yang, et al.
Published: (2025)
Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
by: Ganesh, Swetha, et al.
Published: (2024)
Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
by: Xu, Yang, et al.
Published: (2025)
A Sharper Global Convergence Analysis for Average Reward Reinforcement Learning via an Actor-Critic Approach
by: Ganesh, Swetha, et al.
Published: (2024)
Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds
by: Zhang, Jiefu, et al.
Published: (2026)
BAGEL: Projection-Free Algorithm for Adversarially Constrained Online Convex Optimization
by: Lu, Yiyang, et al.
Published: (2025)
Augmenting generative models with biomedical knowledge graphs improves targeted drug discovery
by: Malusare, Aditya, et al.
Published: (2025)
Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes
by: Ganguly, Bhargav, et al.
Published: (2023)
Multi-Agent Combinatorial-Multi-Armed-Bandit framework for the Submodular Welfare Problem under Bandit Feedback
by: Pokhriyal, Subham, et al.
Published: (2026)
Decentralized Projection-free Online Upper-Linearizable Optimization with Applications to DR-Submodular Optimization
by: Lu, Yiyang, et al.
Published: (2025)
Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention
by: Heuton, Kyle, et al.
Published: (2025)
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
by: Bai, Qinbo, et al.
Published: (2024)
Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms
by: Aggarwal, Vaneet, et al.
Published: (2024)