TY - JOUR

T1 - Personalized and Energy-Efficient Health Monitoring

T2 - A Reinforcement Learning Approach

AU - Eden, Batchen

AU - Bistritz, Ilai

AU - Bambos, Nicholas

AU - Ben-Gal, Irad

AU - Khmelnitsky, Evgeni

N1 - Publisher Copyright: © 2022 IEEE.

PY - 2023

Y1 - 2023

N2 - We consider a network of controlled sensors that monitor the unknown health state of a patient. We assume that the health state process is a Markov chain with a transition matrix that is unknown to the controller. At each timestep, the controller chooses a subset of sensors to activate, which incurs an energy (i.e., battery) cost. Activating more sensors improves the estimation of the unknown state, which introduces an energy-accuracy tradeoff. Our goal is to minimize the combined energy and state misclassification costs over time. Activating sensors now also provides measurements that can be used to learn the model, improving future decisions. Therefore, the learning aspect is intertwined with the energy-accuracy tradeoff. While Reinforcement Learning (RL) is often used when the model is unknown, it cannot be directly applied in health monitoring since the controller does not know the (health) state. Therefore, the monitoring problem is a partially observable Markov decision process (POMDP) where the cost feedback is also only partially available since the misclassification cost is unknown. To overcome this difficulty, we propose a monitoring algorithm that combines RL for POMDPs and online estimation of the expected misclassification cost based on a Hidden Markov Model (HMM). We show empirically that our algorithm achieves comparable performance with a monitoring system that assumes a known transition matrix and quantizes the belief state. It also outperforms the model-based approach where the estimated transition matrix is used for value iteration. Thus, our algorithm can be useful in designing energy-efficient and personalized health monitoring systems.

KW - POMDP

KW - Reinforcement learning

KW - WBAN

KW - health monitoring

UR - http://www.scopus.com/inward/record.url?scp=85144813447&partnerID=8YFLogxK

DO - 10.1109/LCSYS.2022.3229074

M3 - Article

SN - 2475-1456

VL - 7

SP - 955

EP - 960

JO - IEEE Control Systems Letters

JF - IEEE Control Systems Letters

ER -