Home > Research > Publications & Outputs > A Two-Timescale Learning Automata Solution to t...

Electronic data

  • Polling_LA_IEEE_Papadimitriou

    Accepted author manuscript, 813 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Links

Text available via DOI:

View graph of relations

A Two-Timescale Learning Automata Solution to the Nonlinear Stochastic Proportional Polling Problem

Research output: Contribution to Journal/MagazineJournal articlepeer-review

E-pub ahead of print
Close
<mark>Journal publication date</mark>17/09/2024
<mark>Journal</mark>IEEE Transactions on Systems, Man, and Cybernetics: Systems
Number of pages12
Publication StatusE-pub ahead of print
Early online date17/09/24
<mark>Original language</mark>English

Abstract

In this article, we introduce a novel learning automata (LA) solution to the nonlinear stochastic proportional polling (NSPP) problem. The only available solution to this problem in the literature is that given by Nicopolitidis et al. (2003), Obaidat et al. (2002), and Papadimitriou et al. (2002). It was shown to solve a large set of the adaptive resource allocation problems under noisy environments (Nicopolitidis et al., 2003; Obaidat et al., 2002; Papadimitriou and Pomportsis, 2000 and 1999; Nicopolitidis et al., 2004; Obaidat et al., 2001; and Papadimitriou and Pomportsis, 2000). We make a threefold contribution. First, we take a two-timescale approach to the field of LA by estimating the reward probabilities on a faster timescale than the timescale for updating the polling probabilities. Second, by making a not-obvious choice of the objective function, we show that the NSPP problem is indeed an instantiation of the stochastic nonlinear fractional equality knapsack (NFEK) problem, which is a substantial resource allocation problem based on the incomplete and noisy information (Granmo and Oommen, 2010). Third, in contrast to the legacy approach taken by Papadimitriou and Maritsas (1992 and 1996), we show through the extensive experimental results that our solution is remarkably robust to the choice of tuning parameters and that it outperforms the state of the art solution in terms of the Bayesian expected loss.