Final published version, 1.89 MB, PDF document
Available under license: None
Research output: Thesis › Doctoral Thesis
TY - BOOK
T1 - On the dynamic allocation of assets subject to failure and replenishment
AU - Ford, Stephen
PY - 2021
Y1 - 2021
N2 - Problems of the dynamic allocation of assets subject to both failure and replenishment are common. We consider a problem inspired by naval search, where unmanned aerial vehicles are required to search an area of ocean for targets. The vehicles will require refuelling or rearming; this is represented by the aspects of failure and replenishment. Similar models can arise from considering problems of search and rescue, environmental monitoring, or project management. We formulate several versions of the problem, initially using the framework of a Markov decision process, bearing in mind trade-offs between real-world fidelity and mathematical tractability. We first consider models where rewards are gained independently from different tasks, before moving on to consider a specific kind of dependence in the rewards. We use a variety of mathematical techniques, including restless bandits, to formulate near-optimal policies for a slew of models. We consider and investigate the various policies through comprehensive computational modelling. For the independent case, we find that a Whittle index policy is extremely close to optimal while being computationally efficient. For the dependent formulation, we create a class of policies guaranteed to contain the optimal, parameterise the space, then choose the best from a limited set of parameters, augmenting with a single step of policy improvement. We close with some thoughts about what we have learned, considerations about applying the results presented in this thesis, and a discussion of intensifications and extensions we did not have time to consider.
AB - Problems of the dynamic allocation of assets subject to both failure and replenishment are common. We consider a problem inspired by naval search, where unmanned aerial vehicles are required to search an area of ocean for targets. The vehicles will require refuelling or rearming; this is represented by the aspects of failure and replenishment. Similar models can arise from considering problems of search and rescue, environmental monitoring, or project management. We formulate several versions of the problem, initially using the framework of a Markov decision process, bearing in mind trade-offs between real-world fidelity and mathematical tractability. We first consider models where rewards are gained independently from different tasks, before moving on to consider a specific kind of dependence in the rewards. We use a variety of mathematical techniques, including restless bandits, to formulate near-optimal policies for a slew of models. We consider and investigate the various policies through comprehensive computational modelling. For the independent case, we find that a Whittle index policy is extremely close to optimal while being computationally efficient. For the dependent formulation, we create a class of policies guaranteed to contain the optimal, parameterise the space, then choose the best from a limited set of parameters, augmenting with a single step of policy improvement. We close with some thoughts about what we have learned, considerations about applying the results presented in this thesis, and a discussion of intensifications and extensions we did not have time to consider.
U2 - 10.17635/lancaster/thesis/1305
DO - 10.17635/lancaster/thesis/1305
M3 - Doctoral Thesis
PB - Lancaster University
ER -