Staff View: Dual-Mandate Patrols: Multi-Armed Bandits for Green Security :: Library Catalog

Bibliographic Details
Main Authors:	Xu, Lily; Bondi, Elizabeth; Fang, Fei; Perrault, Andrew; Wang, Kai; Tambe, Milind
Format:	Preprint
Published:	2020
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2009.06560

_version_	1866910423450648576
author	Xu, Lily Bondi, Elizabeth Fang, Fei Perrault, Andrew Wang, Kai Tambe, Milind
author_facet	Xu, Lily Bondi, Elizabeth Fang, Fei Perrault, Andrew Wang, Kai Tambe, Milind
contents	Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect from attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions and exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up performance, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz-continuity and decomposition as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
format	Preprint
id	arxiv_https___arxiv_org_abs_2009_06560
institution	arXiv
publishDate	2020
record_format	arxiv
title	Dual-Mandate Patrols: Multi-Armed Bandits for Green Security
topic	Machine Learning
url	https://arxiv.org/abs/2009.06560
