Heuristic and reinforcement learning methods for admission and routing control in a queueing system with multiple heterogeneous facilities

Management Science

Associated organisational units

Research output: Contribution to conference - Without ISBN/ISSN › Abstract

Published

Robert Shone
V.A. Knight
P.R. Harper

Publication date	14/07/2021
<mark>Original language</mark>	English
Event	31st European Conference on Operational Research - Athens, Greece Duration: 11/07/2021 → 14/07/2021 Conference number: 31

Conference

Conference	31st European Conference on Operational Research
Country/Territory	Greece
City	Athens
Period	11/07/21 → 14/07/21

Abstract

Methods for optimising measures of social utility in queueing systems usually require interventions to be made in order to prevent customers from pursuing their own narrow self-interest. In this talk we consider a multiple-facility queueing system with both admission and routing control. The facilities are heterogeneous, with different server capacities, service rates and cost-reward structures, and we define a “socially optimal” policy as one that maximises the long-run average net reward per unit time. Although such policies can theoretically be found using stochastic dynamic programming, examples can be found which show that they may possess surprising and counter-intuitive properties, and thus there are very few general structural properties that can be proved.

It is possible, however, to show that the set of positive recurrent states under a socially optimal policy is contained within the corresponding set of states under a “selfish” policy. This has useful implications for computational algorithms, although we still encounter the “curse of dimensionality” in problems with a large number of facilities. For larger problems, we discuss the use of various heuristic methods, including those derived from optimal solutions to multi-armed bandit problems. We also consider reinforcement learning methods, in which progressive value function estimates are obtained by simulating the random evolution of the system.

Research

Associated organisational units