
Electronic data

  • 2025ArabzadehPhD

    Final published version, 1.52 MB, PDF document

    Embargo ends: 1/06/27

    Available under license: CC BY: Creative Commons Attribution 4.0 International License


Robust and personalised online learning

Research output: Thesis › Doctoral Thesis

Published
Publication date: 2025
Number of pages: 169
Qualification: PhD
Awarding Institution
Supervisors/Advisors
Thesis sponsors
  • UKRI EPSRC
Publisher
  • Lancaster University
Original language: English

Abstract

Over the past decade, multi-armed bandits have attracted significant attention from the online learning and machine learning communities, owing to their broad applicability in both theory and practice. From recommendation systems to adaptive control, the bandit framework effectively tackles the exploration-exploitation trade-off by modelling sequential decision-making problems in a mathematically tractable manner. Meanwhile, the rapid growth of data generated by smartphones, edge devices, and networked sensors has created an urgent need for private and decentralised solutions. Federated learning meets this need by enabling collaborative model training without centralising raw user data, thereby preserving user privacy and mitigating the risks associated with data transfer.
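As a concrete illustration of the exploration-exploitation trade-off mentioned above, the sketch below implements the standard UCB1 rule for the classical finite-armed case. It is background only, not an algorithm from the thesis, and the Bernoulli reward setup is a hypothetical example.

import math
import random

def ucb1(num_arms, horizon, pull):
    # Minimal UCB1: play each arm once, then repeatedly pick the arm with the
    # highest empirical mean plus an exploration bonus (optimism).
    counts = [0] * num_arms      # times each arm was played
    means = [0.0] * num_arms     # empirical mean reward per arm
    for t in range(1, horizon + 1):
        if t <= num_arms:        # initial round: try every arm once
            arm = t - 1
        else:                    # exploit the mean, explore via the confidence radius
            arm = max(range(num_arms),
                      key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)       # observe only the chosen arm's reward
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return means, counts

# Hypothetical usage: three Bernoulli arms with unknown success probabilities.
probs = [0.2, 0.5, 0.7]
means, counts = ucb1(3, 2000, lambda a: 1.0 if random.random() < probs[a] else 0.0)
print(counts)  # the best arm (index 2) should accumulate most of the pulls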
In this thesis, we integrate advanced online learning solutions into a federated environment, focusing on the X-armed bandit problem, a generalisation of the multi-armed bandit to continuous action spaces. We present two major lines of work: one addressing the personalisation challenge by adapting to heterogeneous user distributions, and another ensuring robustness in the face of corrupted or adversarial clients. Our solution employs an optimistic, phase-based approach that enhances efficiency, supported by confidence bounds that guarantee reliable performance. Beyond the federated setting, we also introduce a corruption-robust solution for the centralised version of X-armed bandits, providing theoretical guarantees on performance under adversarial perturbations. Rigorous theoretical analyses confirm the effectiveness of our methods and offer insights into robust, privacy-aware sequential decision-making in distributed environments.
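To make the phase-based, confidence-bound idea concrete, here is a minimal, hypothetical sketch of phased elimination over a discretised continuous action space on [0, 1]. It is not the federated, personalised, or corruption-robust algorithm developed in the thesis; the cell count, phase schedule, and reward function are illustrative assumptions only.

import math
import random

def phased_elimination(pull, num_cells=16, num_phases=6):
    # Discretise the action space [0, 1] into cells and keep a set of active
    # cells; each phase plays every active cell equally, then eliminates cells
    # whose upper confidence bound falls below the best lower confidence bound.
    cells = [(i + 0.5) / num_cells for i in range(num_cells)]  # cell centres
    active = list(range(num_cells))
    means = [0.0] * num_cells
    counts = [0] * num_cells
    for phase in range(1, num_phases + 1):
        for a in active:
            for _ in range(2 ** phase):            # more pulls in later phases
                r = pull(cells[a])
                counts[a] += 1
                means[a] += (r - means[a]) / counts[a]
        # confidence radius shrinks as a cell accumulates samples
        radius = {a: math.sqrt(math.log(num_cells * num_phases) / counts[a]) for a in active}
        best_lcb = max(means[a] - radius[a] for a in active)
        active = [a for a in active if means[a] + radius[a] >= best_lcb]
    best = max(active, key=lambda a: means[a])
    return cells[best]

# Hypothetical usage: noisy rewards peaking at x = 0.7.
print(phased_elimination(lambda x: 1.0 - abs(x - 0.7) + random.gauss(0.0, 0.1)))

Playing in phases and discarding provably suboptimal regions between phases is one common way to keep communication and analysis tractable; the thesis develops this line of reasoning in the federated and adversarially corrupted settings.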