Quantization Enabled Differential Privacy in Bandit Games With Cooperative Players

Yeming Lin, Kun Liu, Ilai Bistritz, Qian Ma, Yuanqing Xia

Research output: Contribution to journal › Article › peer-review

Abstract

This paper addresses the bandit game problem subject to privacy leakage, in which cooperative players aim to learn the optimal action profile that minimizes the global cost. The players do not have closed-form expressions for their payoff functions and receive only feedback on their local costs. We propose a privacy-preserving distributed bandit learning algorithm based on the residual gradient estimator, which uses stochastic quantization with a binary randomized response scheme to mask action profile estimates before communication. The theoretical analysis shows that the algorithm achieves an expected regret of order O(T^{3/4}) and preserves ε_dp-differential privacy for the players.
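The masking step named in the abstract can be illustrated with a generic sketch. The code below is not the authors' algorithm: it shows one textbook way to combine one-bit stochastic quantization with binary randomized response for a single scalar, assuming the value lies in a known interval [lo, hi] and that epsilon > 0. All function names, the interval bounds, and the debiasing step are hypothetical choices made for illustration.

```python
import math
import random


def stochastic_quantize(x: float, lo: float, hi: float, rng: random.Random) -> int:
    """Map x in [lo, hi] to one bit that is 1 with probability (x - lo) / (hi - lo)."""
    x = min(max(x, lo), hi)  # clip so the probability below stays in [0, 1]
    return 1 if rng.random() < (x - lo) / (hi - lo) else 0


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit in binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the bit unchanged with probability e^eps / (1 + e^eps), else flip it."""
    return bit if rng.random() < keep_probability(epsilon) else 1 - bit


def privatize(x: float, epsilon: float, lo: float, hi: float, rng: random.Random) -> int:
    """Quantize x to one bit, then perturb that bit before it is communicated."""
    return randomized_response(stochastic_quantize(x, lo, hi, rng), epsilon, rng)


def debias(reported_bit: int, epsilon: float, lo: float, hi: float) -> float:
    """Receiver-side estimate of x whose expectation equals x (requires epsilon > 0)."""
    p = keep_probability(epsilon)
    # E[reported_bit] = (2p - 1) * q + (1 - p), with q = (x - lo) / (hi - lo).
    q_hat = (reported_bit - (1.0 - p)) / (2.0 * p - 1.0)
    return lo + q_hat * (hi - lo)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    x, eps, n = 0.3, 1.0, 200_000
    mean = sum(debias(privatize(x, eps, 0.0, 1.0, rng), eps, 0.0, 1.0) for _ in range(n)) / n
    print(f"true value {x}, average of debiased reports {mean:.4f}")
```

In this sketch the reported bit equals 1 with probability between 1 - p and p for every input, where p = e^ε / (1 + e^ε), so the ratio of output probabilities for any two inputs is at most e^ε. How the paper's ε_dp guarantee is defined and proved for full action profile estimates is not stated in the abstract.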

Original language: English
Journal: IEEE Transactions on Automatic Control
State: Accepted/In press - 2025

Keywords

  • Bandit games
  • cooperative optimization
  • privacy preservation
  • stochastic quantization

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science Applications
  • Electrical and Electronic Engineering
