Mdps Reinforcement Learning Prove Always West Policy States S West Better Always East Poli Q37196487

December 7, 2021 by Administrator

MDPs and Reinforcement Learning

Prove that the always-west policy [for all states s, $π(s)$ = West] is better than the always-east policy [for all states s, $π(s)$ = East].

Hint: you can prove it by showing that, for each state, its reward under always-west is higher than its reward under always-east.
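
The state-by-state comparison in the hint can be checked numerically. The sketch below is only an illustration: the corridor layout, the reward of +1 at the west end, the discount factor, and the names `step` and `evaluate` are all hypothetical assumptions, not the MDP from the question. It evaluates each fixed policy by repeatedly applying the Bellman backup and then compares the two value functions state by state.

```python
# Hypothetical 4-state corridor: state 0 is the west end, state 3 the east end.
# Every number here is an illustrative assumption, NOT the MDP from the question.
N_STATES = 4
GAMMA = 0.9          # assumed discount factor
WEST, EAST = -1, +1  # actions encoded as index offsets


def step(state, action):
    """Deterministic move; walls keep the agent in place."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    # Assumed reward: +1 whenever the agent ends the step in the west-end state.
    reward = 1.0 if nxt == 0 else 0.0
    return nxt, reward


def evaluate(action, sweeps=500):
    """Iterative policy evaluation for the policy that always takes `action`."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        # Bellman backup for a fixed policy: V(s) = r + gamma * V(s')
        v = [r + GAMMA * v[n] for n, r in (step(s, action) for s in range(N_STATES))]
    return v


v_west = evaluate(WEST)
v_east = evaluate(EAST)

# "Better" here means: at least as good in every state, strictly better in some.
west_no_worse = all(w >= e for w, e in zip(v_west, v_east))
west_strictly_better_somewhere = any(w > e for w, e in zip(v_west, v_east))

for s in range(N_STATES):
    print(f"state {s}: V_west = {v_west[s]:.3f}, V_east = {v_east[s]:.3f}")
print("always-west dominates:", west_no_worse and west_strictly_better_somewhere)
```

A numerical check like this only suggests the result for one assumed model. The proof itself still has to derive the per-state inequality from the rewards and transitions actually given in the problem.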


Answer