Robustness and risk management via distributional dynamic programming

By Mastane Achab and Gergely Neu
In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment, modeled as a Markov decision process (MDP). More generally, in distributional reinforcement learning (DRL), the focus is on the whole distribution of the return,...
December 28, 2021