Abstract
We consider the problem of dynamic spectrum access (DSA) in cognitive wireless networks consisting of primary users (PUs) and secondary users (SUs), where only partial observations are available at the SUs due to narrowband sensing and transmissions. The network operates in a time-slotted regime, where the traffic patterns of the PUs are modeled as finite-memory Markov chains that are unknown to the SUs. Since observations are partial, both channel sensing and access actions affect the throughput. Focusing on the case in which there is a single SU, our objective is to maximize the SU's long-term throughput. To that end, we develop a novel algorithm that learns both access and sensing policies via deep Q-learning, dubbed Double Deep Q-network for Sensing and Access (DDQSA). To the best of our knowledge, this is the first work that jointly optimizes both sensing and access policies for DSA via deep Q-learning. Next, we consider wireless networks whose access policy implements fixed channel-hopping dynamics, for which we analytically determine the optimal SU sensing and access policy and its associated throughput. We then demonstrate that the proposed DDQSA algorithm can achieve near-optimal performance for the considered network. Our results show that the proposed DDQSA algorithm learns a policy that implements both sensing and channel access, significantly outperforms existing approaches, and can achieve the optimal performance in certain scenarios.
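
As a rough illustration of the double deep Q-learning update that the name DDQSA refers to, the following minimal sketch computes the standard double-DQN bootstrap target for one transition. It is not the authors' implementation: the function names, the discount factor, and the encoding of a joint action as a (channel to sense, transmit or not) pair are all assumptions made for this example, and the Q-values are plain lists standing in for the outputs of an online and a target network.

```python
from typing import Sequence, Tuple


def decode_action(index: int) -> Tuple[int, bool]:
    """Split a joint action index into (channel to sense, transmit flag).

    Hypothetical encoding: index = 2 * channel + transmit_bit.
    """
    channel, transmit_bit = divmod(index, 2)
    return channel, bool(transmit_bit)


def double_dqn_target(
    reward: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.9,
    terminal: bool = False,
) -> float:
    """Standard double-DQN target for a single transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    if terminal:
        return reward
    # Action selection uses the online network's estimates.
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation uses the target network's estimate.
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    # Two channels, so four joint actions under the assumed encoding.
    q_online = [0.2, 0.9, 0.5, 0.1]
    q_target = [0.4, 0.3, 0.8, 0.6]
    print(double_dqn_target(1.0, q_online, q_target, gamma=0.5))
    print(decode_action(3))
```

In this sketch a reward of 1 could stand for a successful SU transmission in a slot, so that the accumulated reward tracks throughput; that reward design is likewise an assumption rather than something stated in the abstract.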
Original language | American English |
---|---|
Pages (from-to) | 4930-4946 |
Number of pages | 17 |
Journal | IEEE Transactions on Wireless Communications |
Volume | 22 |
Issue number | 7 |
State | Published - 1 Jul 2023 |
Keywords
- Cognitive radio networks
- deep reinforcement learning
- dynamic spectrum access
- wireless channels
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Electrical and Electronic Engineering
- Applied Mathematics