Learning Algorithm Using Replicator Mutator-Dynamics in Two-Player Zero-Sum Games

Mitsuki Sakamoto

5:40 PM - 6:00 PM

[2O6-GS-5-02] Learning Algorithm Using Replicator Mutator-Dynamics in Two-Player Zero-Sum Games

Mitsuki Sakamoto¹, 〇Kentaro Toyoshima¹, Kenshi Abe², Atsushi Iwasaki¹ (1. The University of Electro-Communications, 2. CyberAgent, Inc.)

[Online]

Keywords: Agent, Machine Learning

In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games.
The time-averaged strategies of FTRL are guaranteed to converge to a Nash equilibrium, but many variants exhibit limit cycles, i.e., they lack a last-iterate convergence guarantee.
To resolve this issue, we propose mutation-driven FTRL (M-FTRL), an algorithm that uses mutation to perturb action probabilities.
We then investigate the continuous-time dynamics of M-FTRL and provide strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback.
Furthermore, our simulations demonstrate that M-FTRL can converge faster than FTRL and optimistic FTRL under full-information feedback and, surprisingly, exhibits clear convergence under bandit feedback.
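
The contrast between cycling and last-iterate convergence can be illustrated with a minimal sketch. The abstract does not state the update rule, so everything below is an assumption: entropy-regularized FTRL (softmax over accumulated payoffs), a mutation term of the form mu * (c - pi) / pi added to each action's payoff, a uniform reference strategy c, Rock-Paper-Scissors as the test game, and the step size, mutation rate, and starting point. In continuous time this form reduces to a replicator equation plus the mutation term mu * (c - pi). The names `run`, `exploitability`, `mu`, and `eta` are hypothetical.

```python
import numpy as np

# Rock-Paper-Scissors payoffs for the row player (illustrative choice, not from the abstract).
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])


def softmax(z):
    """Map accumulated scores to a mixed strategy (FTRL with entropy regularizer)."""
    e = np.exp(z - z.max())
    return e / e.sum()


def exploitability(A, x, y):
    """Sum of both players' best-response gains; zero exactly at a Nash equilibrium."""
    return float((A @ y).max() - (A.T @ x).min())


def run(A, mu, eta=0.05, steps=20000):
    """Full-information self-play. mu = 0 gives plain FTRL; mu > 0 adds the assumed mutation term."""
    n, m = A.shape
    cx, cy = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # assumed uniform reference strategies
    zx = np.linspace(1.0, -1.0, n)   # start away from equilibrium
    zy = np.linspace(-0.5, 0.5, m)
    for _ in range(steps):
        x, y = softmax(zx), softmax(zy)
        qx = A @ y         # row player's expected payoff per action
        qy = -(A.T @ x)    # column player's expected payoff per action (zero-sum)
        # Mutation term mu * (c - pi) / pi, written as mu * (c / pi - 1).
        # The clip only guards against division by an underflowed probability.
        zx = zx + eta * (qx + mu * (cx / np.maximum(x, 1e-12) - 1.0))
        zy = zy + eta * (qy + mu * (cy / np.maximum(y, 1e-12) - 1.0))
    return softmax(zx), softmax(zy)


plain_x, plain_y = run(RPS, mu=0.0)
mut_x, mut_y = run(RPS, mu=0.1)
expl_plain = exploitability(RPS, plain_x, plain_y)
expl_mutated = exploitability(RPS, mut_x, mut_y)
print("last-iterate exploitability, no mutation:  ", expl_plain)
print("last-iterate exploitability, with mutation:", expl_mutated)
```

In this game the uniform reference strategy coincides with the Nash equilibrium, so the stationary point of the perturbed dynamics is exact. In general the stationary point only approximates an equilibrium, with the gap depending on the mutation rate and the reference strategy.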

Presentation information

[2O6-GS-5] Agents: game theory
