A Concentration Bound for TD(0) with Function Approximation

By Siddharth Chandak and Vivek Borkar, at Stanford University and Indian Institute of Technology Bombay
We derive a concentration bound of the type "for all $n \geq n_0$ for some $n_0$" for TD(0) with linear function approximation. We work with online TD learning with samples from a single sample path of the underlying Markov chain. This makes our analysis significantly different from offline TD learning...
December 16, 2023
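
For readers unfamiliar with the setting, the following is a minimal sketch of online TD(0) with linear function approximation driven by a single sample path of a Markov chain. It is an illustration only, not the authors' exact algorithm or experimental setup: the three-state chain, the rewards, the identity feature matrix, the discount factor, and the step-size schedule $a(n) = (n+1)^{-0.75}$ are all assumptions chosen so that the TD fixed point coincides with the true value function and can be checked directly.

```python
import numpy as np


def td0_linear(P, r, Phi, gamma, n_steps, step_size, seed=0):
    """Online TD(0) with linear function approximation on one sample path.

    P: (S, S) transition matrix, r: (S,) per-state rewards,
    Phi: (S, d) feature matrix, step_size: callable n -> a(n).
    """
    rng = np.random.default_rng(seed)
    n_states, d = Phi.shape
    cum_P = np.cumsum(P, axis=1)   # row-wise CDFs for inverse-CDF sampling
    u = rng.random(n_steps)        # one uniform draw per transition
    theta = np.zeros(d)
    s = 0                          # the single trajectory starts in state 0
    for n in range(n_steps):
        # Sample the next state from row s of P.
        s_next = int(np.searchsorted(cum_P[s], u[n], side="right"))
        s_next = min(s_next, n_states - 1)  # guard against rounding in the CDF
        # Temporal-difference error under the current parameter vector.
        td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
        # Stochastic-approximation update along the current feature vector.
        theta = theta + step_size(n) * td_error * Phi[s]
        s = s_next
    return theta


# Toy chain (assumed for illustration; not taken from the paper).
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
r = np.array([1.0, 0.0, -1.0])
Phi = np.eye(3)   # tabular features, so the TD fixed point is the true value function
gamma = 0.9

theta = td0_linear(P, r, Phi, gamma, n_steps=100_000,
                   step_size=lambda n: (n + 1) ** -0.75)

# Exact value function, solving (I - gamma * P) v = r, for comparison.
v_true = np.linalg.solve(np.eye(3) - gamma * P, r)
max_error = float(np.max(np.abs(Phi @ theta - v_true)))
print("theta:", theta)
print("v_true:", v_true)
print("max error:", max_error)
```

Because every update uses the next transition of the same trajectory, consecutive samples are correlated through the chain, which is the feature that separates this online setting from offline TD learning on independently drawn samples.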