Equilibrium Bandits: Learning Optimal Equilibria of Unknown Dynamics

By Siddharth Chandak and others
Consider a decision-maker that can pick one out of K actions to control an unknown system, for T turns. The actions are interpreted as different configurations or policies. Holding the same action fixed, the system asymptotically converges to a unique equilibrium, as a function of this action. The dynamics of...
February 27, 2023
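
The setting in the abstract can be illustrated with a toy simulation. This is only a sketch under assumptions that do not come from the paper: the state is a single number, each action pulls it toward its own equilibrium by a fixed fraction per turn, and a larger equilibrium value is treated as better. The function name, the update rule, the rate, and the equilibrium values are all hypothetical.

```python
def simulate_fixed_action(equilibrium, x0=0.0, rate=0.5, turns=50):
    """Hold one action fixed and return the state trajectory.

    Assumed (hypothetical) dynamics: x <- x + rate * (equilibrium - x),
    which contracts toward the action's equilibrium when 0 < rate < 1.
    """
    x = x0
    trajectory = [x]
    for _ in range(turns):
        x = x + rate * (equilibrium - x)
        trajectory.append(x)
    return trajectory


# Hypothetical equilibria for K = 3 actions (illustrative values only).
equilibria = [0.2, 0.9, 0.5]

# Holding each action fixed long enough brings the state near its equilibrium.
final_states = [simulate_fixed_action(e)[-1] for e in equilibria]

# Assumption: the action with the largest equilibrium value is the best one.
best_action = max(range(len(equilibria)), key=lambda k: final_states[k])
```

In this toy model the gap to equilibrium halves every turn, so the state observed shortly after switching actions still reflects the previous action. That is why a decision-maker in such a setting may need to hold an action for several turns before judging it.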