Research output: Contribution to Journal/Magazine › Journal article › peer-review
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Monotone policies and indexability for bi-directional restless bandits
AU - Glazebrook, Kevin
AU - Hodge, D. J.
AU - Kirkbride, Christopher
PY - 2013/3
Y1 - 2013/3
N2 - Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.
AB - Motivated by a wide range of applications, we consider a development of Whittle's restless bandit model in which project activation requires a state-dependent amount of a key resource, which is assumed to be available at a constant rate. As many projects may be activated at each decision epoch as resource availability allows. We seek a policy for project activation within resource constraints which minimises an aggregate cost rate for the system. Project indices derived from a Lagrangian relaxation of the original problem exist provided the structural requirement of indexability is met. Verification of this property and derivation of the related indices is greatly simplified when the solution of the Lagrangian relaxation has a state monotone structure for each constituent project. We demonstrate that this is indeed the case for a wide range of bidirectional projects in which the project state tends to move in a different direction when it is activated from that in which it moves when passive. This is natural in many application domains in which activation of a project ameliorates its condition, which otherwise tends to deteriorate or deplete. In some cases the state monotonicity required is related to the structure of state transitions, while in others it is also related to the nature of costs. Two numerical studies demonstrate the value of the ideas for the construction of policies for dynamic resource allocation, most especially in contexts which involve a large number of projects.
KW - asset management
KW - Gittins index
KW - indexability
KW - inventory management
KW - Lagrangian relaxation
KW - machine maintenance
KW - monotone policy
KW - stochastic dynamic programming
KW - restless bandit
KW - Whittle index
U2 - 10.1239/aap/1363354103
DO - 10.1239/aap/1363354103
M3 - Journal article
VL - 45
SP - 51
EP - 85
JO - Advances in Applied Probability
JF - Advances in Applied Probability
SN - 0001-8678
IS - 1
ER -