Publications

Multiplayer Bandits Without Observing Collision Information

Open Access

Authors: Gábor Lugosi.
Mathematics of Operations Research, Vol. 47, No. 2, 1247–1265, July 2022.

We study multiplayer stochastic multiarmed bandit problems in which the players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward. We consider two feedback models: a model in which the players can observe whether a collision has occurred and a more difficult setup in which no collision information is available. We give the first theoretical guarantees for the second model: an algorithm with a logarithmic regret and an algorithm with a square-root regret that does not depend on the gaps between the means. For the first model, we give the first square-root regret bounds that do not depend on the gaps. Building on these ideas, we also give an algorithm for reaching approximate Nash equilibria quickly in stochastic anticoordination games.
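
The two feedback models described in the abstract can be illustrated with a minimal simulation of a single round. This is only a sketch of the setting, not of the paper's algorithms: the function name `play_round`, the Bernoulli reward assumption, and the example means are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def play_round(means, choices, rng, observe_collisions):
    """Simulate one round of the multiplayer bandit setting.

    means: assumed Bernoulli mean of each arm (illustrative assumption).
    choices: index of the arm pulled by each player in this round.
    observe_collisions: True for the model with collision information,
        False for the harder model without it.
    Returns one (reward, collision_flag) pair per player; the flag is
    None when collision information is not available.
    """
    counts = Counter(choices)
    feedback = []
    for arm in choices:
        collided = counts[arm] > 1
        if collided:
            reward = 0  # colliding players receive zero reward
        else:
            reward = 1 if rng.random() < means[arm] else 0
        flag = collided if observe_collisions else None
        feedback.append((reward, flag))
    return feedback


rng = random.Random(0)
means = [1.0, 1.0, 0.0]   # hypothetical arm means
choices = [0, 0, 2]       # players 0 and 1 collide on arm 0
with_info = play_round(means, choices, rng, observe_collisions=True)
without_info = play_round(means, choices, rng, observe_collisions=False)
```

In the second call, every player sees only a reward. The player on arm 2 (mean 0) and the two colliding players all receive zero and cannot tell a collision apart from an unlucky draw, which is what makes the model without collision information harder.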