Abstract
Sharing a communication network among many users is a widespread challenge. In this paper we address several aspects of this challenge simultaneously: learning unknown stochastic network characteristics and sharing resources with other users, while keeping coordination overhead to a minimum. The proposed solution combines Multi-Armed Bandit learning with a lightweight signalling-based coordination scheme, and ensures convergence to a stable allocation of resources. We consider single-user level algorithms for two scenarios: a fixed but unknown number of users, and a dynamic number of users. Analytic performance guarantees, proving convergence to stable marriage configurations, are presented for both setups. The algorithms are designed from a system-wide perspective rather than focusing on single-user welfare, thereby ensuring maximal resource utilization. An extensive experimental analysis covers convergence to a stable configuration as well as reward maximization. Experiments are carried out over a wide range of setups, demonstrating the advantages of our approach over existing state-of-the-art methods.
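To make the learning component concrete, the following is a minimal sketch of how a single user could learn unknown stochastic channel qualities with a textbook bandit rule. It uses the standard UCB1 index and is not the algorithm proposed in the paper: the coordination and signalling scheme is omitted entirely, and the names `ucb1_select` and `run_single_user`, the Bernoulli reward model, and the success probabilities are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, means, t):
    """Return the index of the channel with the highest UCB1 index."""
    # Sample every channel once before trusting any estimate.
    for k, n in enumerate(counts):
        if n == 0:
            return k
    # Empirical mean plus an exploration bonus that shrinks with more samples.
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_single_user(success_probs, horizon, seed=0):
    """Simulate one user learning Bernoulli channel qualities (hypothetical model)."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    means = [0.0] * len(success_probs)
    for t in range(1, horizon + 1):
        k = ucb1_select(counts, means, t)
        # Reward is 1 if the transmission succeeds on channel k, else 0.
        reward = 1.0 if rng.random() < success_probs[k] else 0.0
        counts[k] += 1
        # Incremental update of the empirical mean reward for channel k.
        means[k] += (reward - means[k]) / counts[k]
    return counts, means


if __name__ == "__main__":
    counts, means = run_single_user([0.1, 0.9], horizon=2000)
    print("samples per channel:", counts)
    print("estimated success rates:", [round(m, 3) for m in means])
```

With a single user there is no contention, so the sketch only covers the learning side. In the multi-user setting described above, each user would additionally need some way to resolve collisions on a shared channel, which is the role the abstract assigns to the signalling-based coordination scheme.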
Original language | English |
---|---|
Article number | 8875003 |
Pages (from-to) | 2192-2207 |
Number of pages | 16 |
Journal | IEEE/ACM Transactions on Networking |
Volume | 27 |
Issue number | 6 |
State | Published - Dec 2019 |
Keywords
- Multi-armed bandits
- multi-user communications
All Science Journal Classification (ASJC) codes
- Software
- Computer Science Applications
- Computer Networks and Communications
- Electrical and Electronic Engineering