Deep Q-Networks for Aerial Data Collection in Multi-UAV-Assisted Wireless Sensor Networks

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

  • Yousef Emami
  • Bo Wei
  • Kai Li
  • Wei Ni
  • Eduardo Tovar
Publication date: 9/08/2021
Host publication: 2021 International Wireless Communications and Mobile Computing (IWCMC)
ISBN (electronic): 9781728816160
ISBN (print): 9781728186177
Original language: English

Publication series

Name: International Wireless Communications and Mobile Computing (IWCMC)
ISSN (Print): 2376-6492
ISSN (electronic): 2376-6506


Unmanned Aerial Vehicles (UAVs) can collaborate to collect and relay data for ground sensors in remote and hostile areas. In multi-UAV-assisted wireless sensor networks (MA-WSN), the UAVs' movements affect channel conditions and can cause data transmissions to fail; this, together with newly arrived data, gives rise to buffer overflows at the ground sensors. Thus, scheduling data transmission is of utmost importance in MA-WSN to reduce data packet losses resulting from buffer overflows and channel fading. In this paper, we investigate the optimal ground sensor selection at the UAVs to minimize data packet losses. The optimization problem is formulated as a multi-agent Markov decision process, where network states consist of the battery levels and data buffer lengths of the ground sensors, channel conditions, and waypoints of the UAV along its trajectory. In practice, an MA-WSN contains a large number of network states, while up-to-date knowledge of the network states and of other UAVs' sensor selection decisions is not available at each agent. We propose a Multi-UAV Deep Reinforcement Learning based Scheduling Algorithm (MUAIS) to minimize the data packet loss, where the UAVs learn the underlying patterns of the data and energy arrivals at all the ground sensors. Numerical results show that the proposed MUAIS achieves at least 46% and 35% lower packet loss than an optimal solution with a single UAV and an existing non-learning greedy algorithm, respectively.
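
To make the state and action structure described in the abstract concrete, the following is a minimal sketch of how one UAV agent might represent a network state and choose a ground sensor. It is not the paper's algorithm: it uses a tabular Q-learning update as a simple stand-in for the deep Q-network, and every name in it (`SensorState`, `NetworkState`, `select_sensor`, `q_update`), the discretized fields, the reward (assumed here to be the negative number of lost packets), and the values of `alpha`, `gamma`, and `epsilon` are illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SensorState:
    """Per-sensor part of the state (all fields are assumed discretizations)."""
    battery_level: int    # discretized battery level
    buffer_length: int    # number of packets queued at the sensor
    channel_quality: int  # discretized channel condition to the UAV


@dataclass(frozen=True)
class NetworkState:
    """State seen by one UAV agent: all sensors plus its current waypoint."""
    sensors: Tuple[SensorState, ...]
    waypoint: int


# Q-table keyed by (state, index of the selected sensor).
QTable = Dict[Tuple[NetworkState, int], float]


def select_sensor(q: QTable, state: NetworkState, num_sensors: int,
                  epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice of which ground sensor to schedule."""
    if rng.random() < epsilon:
        return rng.randrange(num_sensors)  # explore
    # Exploit: pick the sensor with the highest estimated value.
    return max(range(num_sensors), key=lambda a: q.get((state, a), 0.0))


def q_update(q: QTable, state: NetworkState, action: int, reward: float,
             next_state: NetworkState, num_sensors: int,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """One tabular Q-learning step (stand-in for a DQN gradient step)."""
    best_next = max(q.get((next_state, a), 0.0) for a in range(num_sensors))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


if __name__ == "__main__":
    rng = random.Random(0)
    sensors = (SensorState(1, 2, 1), SensorState(2, 5, 0))
    s0 = NetworkState(sensors, waypoint=0)
    s1 = NetworkState(sensors, waypoint=1)
    q: QTable = {}
    a = select_sensor(q, s0, num_sensors=2, epsilon=0.1, rng=rng)
    # Assumed reward: negative count of packets lost in this step.
    q_update(q, s0, a, reward=-3.0, next_state=s1, num_sensors=2)
    print(a, q[(s0, a)])
```

A table keyed by full network states grows exponentially with the number of sensors, which is consistent with the abstract's remark that an MA-WSN contains a large number of network states; a function approximator such as a deep Q-network replaces the table in that setting.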