Library Catalog

Bibliographic Details
Main Authors:	Agarwal, Mridul; Aggarwal, Vaneet; Quinn, Christopher J.; Umrawal, Abhishek
Format:	Preprint
Published:	2020
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2011.07687

Similar Items

Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
by: Bai, Qinbo, et al.
Published: (2021)

Stochastic $k$-Submodular Bandits with Full Bandit Feedback
by: Nie, Guanyu, et al.
Published: (2024)

A Unified Approach for Maximizing Continuous DR-submodular Functions
by: Pedramfar, Mohammad, et al.
Published: (2023)

CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models
by: Liang, Yihao, et al.
Published: (2026)

Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)

A Resilience Framework for Bi-Criteria Combinatorial Optimization with Bandit Feedback
by: Aggarwal, Vaneet, et al.
Published: (2025)

Content filtering methods for music recommendation: A review
by: Zeng, Terence, et al.
Published: (2025)

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
by: Ganesh, Swetha, et al.
Published: (2025)

Regret Analysis of Unichain Average Reward Constrained MDPs with General Parameterization
by: Satheesh, Anirudh, et al.
Published: (2026)

Breaking the Bias Barrier in Concave Multi-Objective Reinforcement Learning
by: Ganesh, Swetha, et al.
Published: (2026)

Improving Molecule Generation and Drug Discovery with a Knowledge-enhanced Generative Model
by: Malusare, Aditya, et al.
Published: (2024)

Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach
by: Xu, Yang, et al.
Published: (2025)

Persistent-Transient Policy Evaluation for Markov Chains via Minimal Peripheral Quotients
by: Xu, Yang, et al.
Published: (2026)

Sample Complexity Analysis for Constrained Bilevel Reinforcement Learning
by: Saxena, Naman, et al.
Published: (2026)

A Unified Framework for Analyzing Meta-algorithms in Online Convex Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)

$γ$-weakly $θ$-up-concavity: A Unified Framework for Non-Convex Optimization Beyond DR-Submodular and OSS Functions
by: Pedramfar, Mohammad, et al.
Published: (2026)

From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization
by: Pedramfar, Mohammad, et al.
Published: (2024)

Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback
by: Pedramfar, Mohammad, et al.
Published: (2023)

Sample-Efficient Constrained Reinforcement Learning with General Parameterization
by: Mondal, Washim Uddin, et al.
Published: (2024)

Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)

Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
by: Mondal, Washim Uddin, et al.
Published: (2023)

Distributionally Robust Self Paced Curriculum Reinforcement Learning
by: Satheesh, Anirudh, et al.
Published: (2025)

Contrastive Cross-Modal Learning for Infusing Chest X-ray Knowledge into ECGs
by: Punyamoorty, Vineet, et al.
Published: (2025)

Hierarchical Deep Counterfactual Regret Minimization
by: Chen, Jiayu, et al.
Published: (2023)

Oracle-Robust Online Alignment for Large Language Models
by: Li, Zimeng, et al.
Published: (2026)

Variational Offline Multi-agent Skill Discovery
by: Chen, Jiayu, et al.
Published: (2024)

Discrete State Diffusion Models: A Sample Complexity Perspective
by: Srikanth, Aadithya, et al.
Published: (2025)

Efficient $Q$-Learning and Actor-Critic Methods for Robust Average Reward Reinforcement Learning
by: Xu, Yang, et al.
Published: (2025)

Order-Optimal Regret with Novel Policy Gradient Approaches in Infinite-Horizon Average Reward MDPs
by: Ganesh, Swetha, et al.
Published: (2024)

Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning
by: Xu, Yang, et al.
Published: (2025)

A Sharper Global Convergence Analysis for Average Reward Reinforcement Learning via an Actor-Critic Approach
by: Ganesh, Swetha, et al.
Published: (2024)

Don't Freeze, Don't Crash: Extending the Safe Operating Range of Neural Navigation in Dense Crowds
by: Zhang, Jiefu, et al.
Published: (2026)

BAGEL: Projection-Free Algorithm for Adversarially Constrained Online Convex Optimization
by: Lu, Yiyang, et al.
Published: (2025)

Augmenting generative models with biomedical knowledge graphs improves targeted drug discovery
by: Malusare, Aditya, et al.
Published: (2025)

Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes
by: Ganguly, Bhargav, et al.
Published: (2023)

Multi-Agent Combinatorial-Multi-Armed-Bandit framework for the Submodular Welfare Problem under Bandit Feedback
by: Pokhriyal, Subham, et al.
Published: (2026)

Decentralized Projection-free Online Upper-Linearizable Optimization with Applications to DR-Submodular Optimization
by: Lu, Yiyang, et al.
Published: (2025)

Decision-aware training of spatiotemporal forecasting models to select a top K subset of sites for intervention
by: Heuton, Kyle, et al.
Published: (2025)

Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
by: Bai, Qinbo, et al.
Published: (2024)

Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms
by: Aggarwal, Vaneet, et al.
Published: (2024)