Logarithmic regret for learning linear quadratic regulators efficiently

Asaf Cassel, Alon Cohen, Tomer Koren

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly, regret that scales only (poly)logarithmically with the number of steps in two scenarios: when only the state transition matrix A is unknown, and when only the stateaction transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound that shows that when the latter condition is violated, square root regret is unavoidable.

Original languageEnglish
Title of host publication37th International Conference on Machine Learning, ICML 2020
EditorsHal Daume, Aarti Singh
Pages1305-1314
Number of pages10
ISBN (Electronic)9781713821120
StatePublished - 2020
Event37th International Conference on Machine Learning, ICML 2020 - Virtual, Online
Duration: 13 Jul 202018 Jul 2020

Publication series

Name37th International Conference on Machine Learning, ICML 2020
VolumePartF168147-2

Conference

Conference37th International Conference on Machine Learning, ICML 2020
CityVirtual, Online
Period13/07/2018/07/20

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

Fingerprint

Dive into the research topics of 'Logarithmic regret for learning linear quadratic regulators efficiently'. Together they form a unique fingerprint.

Cite this