Home > Research > Publications & Outputs > Monotone policies and indexability for bi-direc...
View graph of relations

Monotone policies and indexability for bi-directional restless bandits

Research output: Contribution to Journal/MagazineJournal articlepeer-review

Published

Standard

Monotone policies and indexability for bi-directional restless bandits. / Glazebrook, Kevin; Hodge, D. J.; Kirkbride, Christopher.
In: Advances in Applied Probability, Vol. 45, No. 1, 03.2013, p. 51-85.

Research output: Contribution to Journal/MagazineJournal articlepeer-review

Harvard

APA

Vancouver

Glazebrook K, Hodge DJ, Kirkbride C. Monotone policies and indexability for bi-directional restless bandits. Advances in Applied Probability. 2013 Mar;45(1):51-85. doi: 10.1239/aap/1363354103

Author

Glazebrook, Kevin ; Hodge, D. J. ; Kirkbride, Christopher. / Monotone policies and indexability for bi-directional restless bandits. In: Advances in Applied Probability. 2013 ; Vol. 45, No. 1. pp. 51-85.

Bibtex

@article{979d579596db4f6c919091a4f906a572,
title = "Monotone policies and indexability for bi-directional restless bandits",
abstract = "Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.",
keywords = "asset management, Gittins index, indexability, inventory management, Lagrangian relaxation, machine maintenance, monotone policy, stochastic dynamic programming, restless bandit , Whittle index",
author = "Kevin Glazebrook and Hodge, {D. J.} and Christopher Kirkbride",
year = "2013",
month = mar,
doi = "10.1239/aap/1363354103",
language = "English",
volume = "45",
pages = "51--85",
journal = "Advances in Applied Probability",
issn = "0001-8678",
publisher = "Cambridge University Press",
number = "1",

}

RIS

TY - JOUR

T1 - Monotone policies and indexability for bi-directional restless bandits

AU - Glazebrook, Kevin

AU - Hodge, D. J.

AU - Kirkbride, Christopher

PY - 2013/3

Y1 - 2013/3

N2 - Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.

AB - Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.

KW - asset management

KW - Gittins index

KW - indexability

KW - inventory management

KW - Lagrangian relaxation

KW - machine maintenance

KW - monotone policy

KW - stochastic dynamic programming

KW - restless bandit

KW - Whittle index

U2 - 10.1239/aap/1363354103

DO - 10.1239/aap/1363354103

M3 - Journal article

VL - 45

SP - 51

EP - 85

JO - Advances in Applied Probability

JF - Advances in Applied Probability

SN - 0001-8678

IS - 1

ER -