Prospect-theoretic Q-learning

By Vivek Borkar and Siddharth Chandak
We consider a prospect-theoretic version of the classical Q-learning algorithm for discounted-reward Markov decision processes, wherein the controller perceives a distorted and noisy future reward, modeled by a nonlinearity that accentuates gains and underrepresents losses relative to a reference point. We analyze the asymptotic behavior of the scheme...
July 27, 2021
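
The abstract describes the scheme only in words, so the following is a minimal illustrative sketch of a tabular Q-learning update in which the future reward is passed through a distortion and perturbed by noise before it enters the update. It is not the authors' exact recursion. Everything specific here is an assumption made for illustration: the piecewise-linear form of `distort`, its `gain` and `loss` slopes, the reference point `ref`, the Gaussian noise model, the epsilon-greedy exploration, the `1/visits` step size, and the function names themselves. The defaults keep `gamma * gain` below 1 so that the iterates of this toy version stay bounded.

```python
import numpy as np

def distort(x, ref=0.0, gain=1.5, loss=0.5):
    """Hypothetical distortion: amplify gains and shrink losses relative to `ref`.

    The piecewise-linear shape is an assumption; the abstract only says the
    nonlinearity accentuates gains and underrepresents losses.
    """
    d = np.asarray(x, dtype=float) - ref
    return ref + np.where(d >= 0, gain * d, loss * d)

def pt_q_learning(P, R, gamma=0.5, steps=20000, eps=0.1, noise_std=0.1, seed=0):
    """Tabular Q-learning with a distorted, noisy perceived future reward.

    P has shape (n_states, n_actions, n_states) and holds transition
    probabilities; R has shape (n_states, n_actions) and holds rewards.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    visits = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(steps):
        # Epsilon-greedy exploration (an assumption, not taken from the source).
        if rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(Q[s]))
        s_next = int(rng.choice(n_states, p=P[s, a]))
        # The controller perceives a distorted and noisy version of the future value.
        perceived = float(distort(Q[s_next].max())) + noise_std * rng.normal()
        visits[s, a] += 1
        alpha = 1.0 / visits[s, a]  # decreasing step size
        Q[s, a] += alpha * (R[s, a] + gamma * perceived - Q[s, a])
        s = s_next
    return Q

if __name__ == "__main__":
    # A made-up two-state, two-action MDP used only to exercise the sketch.
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.1, 0.9]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    print(pt_q_learning(P, R))
```

With `gain = loss = 1` and `noise_std = 0` the update reduces to ordinary Q-learning, which makes it easy to compare the distorted and undistorted iterates on the same toy problem.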