Abstract
In this paper we address the problem of decision making within a Markov decision process (MDP) framework where risk and modeling errors are taken into account. Our approach is to minimize a risk-sensitive conditional-value-at-risk (CVaR) objective, as opposed to a standard risk-neutral expectation. We refer to such a problem as a CVaR MDP. Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget. This result, which is of independent interest, motivates CVaR MDPs as a unifying framework for risk-sensitive and robust decision making. Our second contribution is to present an approximate value-iteration algorithm for CVaR MDPs and analyze its convergence rate. To our knowledge, this is the first solution algorithm for CVaR MDPs that enjoys error guarantees. Finally, we present results from numerical experiments that corroborate our theoretical findings and show the practicality of our approach.
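As an illustration of the worst-case reading of CVaR, the following minimal sketch computes the empirical CVaR of a single, static set of equally likely cost samples. It relies on the standard dual representation of CVaR, in which an adversary may reweight each outcome's probability by a factor of at most 1/alpha while keeping the total mass at one. The function name `empirical_cvar` and the sample values are assumptions made for illustration; the sketch covers neither the sequential MDP setting nor the approximate value-iteration algorithm of the paper.

```python
def empirical_cvar(costs, alpha):
    """CVaR at level alpha of equally likely cost samples (higher cost is worse).

    Hypothetical helper for illustration only. It uses the dual form of CVaR:
    maximize the reweighted expected cost, where each sample's probability 1/n
    may be scaled up by at most 1/alpha and the new weights must sum to one.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(costs)
    if n == 0:
        raise ValueError("costs must be non-empty")

    cap = 1.0 / (alpha * n)  # largest weight the adversary may put on one sample
    remaining = 1.0          # probability mass still to be assigned
    total = 0.0
    # The worst-case reweighting is greedy: load the highest costs first.
    for cost in sorted(costs, reverse=True):
        weight = min(cap, remaining)
        total += weight * cost
        remaining -= weight
        if remaining <= 0.0:
            break
    return total


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    print(empirical_cvar(samples, 1.0))   # risk-neutral case: the plain mean
    print(empirical_cvar(samples, 0.5))   # mean of the worst half of the samples
```

With `alpha = 1` the adversary has no budget and the result reduces to the ordinary expectation; as `alpha` shrinks, the weight concentrates on the highest-cost samples, which is the sense in which a smaller `alpha` corresponds to a larger error budget.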
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1522-1530 |
| Number of pages | 9 |
| Journal | Advances in Neural Information Processing Systems |
| Volume | 2015-January |
| State | Published - 2015 |
| Event | 29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada. Duration: 7 Dec 2015 → 12 Dec 2015 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Signal Processing