Concentration of Contractive Stochastic Approximation and Reinforcement Learning

By Siddharth Chandak and others
Using a martingale concentration inequality, concentration bounds "from time n_0 on" are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
October 26, 2021
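
As a rough illustration of the kind of iteration the abstract refers to, the sketch below runs tabular TD(0) on a small Markov chain. It is not taken from the paper: the two-state chain, the rewards, the discount factor, the step-size schedule, and the function name `td0` are all assumptions chosen for illustration. The update has the stochastic approximation form "current estimate plus step size times (noisy contractive map minus current estimate)", and only the currently visited state is updated, which is the asynchronous setting.

```python
import random


def td0(num_steps=50000, gamma=0.9, seed=0):
    """Tabular TD(0) on a hypothetical 2-state Markov chain (illustrative only)."""
    rng = random.Random(seed)
    # Assumed dynamics: from either state, move to state 0 with probability 0.5.
    p_to_0 = [0.5, 0.5]
    # Assumed rewards: 1 for leaving state 0, 0 for leaving state 1.
    reward = [1.0, 0.0]
    v = [0.0, 0.0]      # value estimates
    visits = [0, 0]     # per-state visit counts, used for the step size
    s = 0
    for _ in range(num_steps):
        s_next = 0 if rng.random() < p_to_0[s] else 1
        visits[s] += 1
        # Assumed step size: polynomially decaying in the local visit count.
        step = visits[s] ** -0.7
        # TD(0) update: move v[s] toward the sampled Bellman target.
        v[s] += step * (reward[s] + gamma * v[s_next] - v[s])
        s = s_next
    return v


if __name__ == "__main__":
    print(td0())
```

For this assumed chain the Bellman equation V = r + gamma P V has the exact solution V(0) = 5.5 and V(1) = 4.5, so the iterates should settle near those values. Concentration bounds of the kind described in the abstract concern how tightly such iterates stay near the fixed point from some time n_0 onward.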