TY - GEN
T1 - Admission Control for Games with a Dynamic Set of Players
AU - Bistritz, Ilai
AU - Bambos, Nicholas
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - We consider open games where players arrive according to a Poisson process with rate λ and stay in the game for an exponential random duration with rate μ. The game evolves in continuous time where each player n sets an exponential random clock and updates her action a_n ∈ {0, …, K} when it expires. The players take independent best-response actions that, uninterrupted, can converge to a Nash Equilibrium (NE). This models open multiagent systems such as wireless networks, cloud computing, and online marketplaces. When λ is small, the game spends most of the time in a (time-varying) equilibrium. This equilibrium exhibits predictable behavior and can have performance guarantees by design. However, when λ is too small, the system is under-utilized since not many players are in the game on average. Choosing the maximal λ that the game can support while still spending a target fraction 0 < ρ < 1 of the time at equilibrium requires knowing the reward functions. To overcome that, we propose an online learning algorithm that the gamekeeper uses to adjust the probability θ to admit an incoming player. The gamekeeper only observes whether an action was changed, without observing the action or who played it. We prove that our algorithm learns, with probability 1, a θ* such that the game is at equilibrium for at least a ρ fraction of the time, and no more than ρ + ϵ(μ,ρ) < 1, where we provide an analytic expression for ϵ(μ,ρ). Our algorithm is a black-box method to transfer performance guarantees of distributed protocols from closed systems to open systems.
UR - http://www.scopus.com/inward/record.url?scp=85184813126&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/CDC49753.2023.10383907
DO - 10.1109/CDC49753.2023.10383907
M3 - Conference contribution
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 1219
EP - 1224
BT - 2023 62nd IEEE Conference on Decision and Control, CDC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 62nd IEEE Conference on Decision and Control, CDC 2023
Y2 - 13 December 2023 through 15 December 2023
ER -