
MDPs and Reinforcement Learning

Prove that the always-west policy [for all states s, π(s) = West] is better than the always-east policy [for all states s, π(s) = East].

Hint: you can prove it by showing that, for each state, its reward under always-west is higher than its reward under always-east.
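
The state-by-state comparison in the hint can be checked numerically with a minimal sketch like the one below. It evaluates both fixed policies and tests whether one is at least as good in every state. The 4-state corridor, its rewards, the discount factor 0.9, and the helper names `evaluate_policy` and `dominates` are all illustrative assumptions, not the MDP from the exercise, so swap in the actual transitions and rewards before drawing any conclusion.

```python
def evaluate_policy(next_state, reward, policy, gamma=0.9, tol=1e-10):
    """Iterative policy evaluation for a deterministic MDP.

    next_state and reward are dicts keyed by (state, action);
    policy maps each state to the action it always takes.
    """
    values = {s: 0.0 for s in policy}
    while True:
        delta = 0.0
        for s, a in policy.items():
            new_v = reward[(s, a)] + gamma * values[next_state[(s, a)]]
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values


def dominates(v_a, v_b):
    """True if v_a >= v_b in every state and strictly greater in at least one."""
    return (all(v_a[s] >= v_b[s] for s in v_a)
            and any(v_a[s] > v_b[s] for s in v_a))


# Hypothetical corridor: states 0..3, West moves left, East moves right,
# and moves into a wall leave the agent where it is.
# Assumed rewards: 10 for landing in state 0, 1 for landing in state 3.
states = [0, 1, 2, 3]
landing_reward = {0: 10.0, 1: 0.0, 2: 0.0, 3: 1.0}
next_state, reward = {}, {}
for s in states:
    for action, target in (("West", max(s - 1, 0)), ("East", min(s + 1, 3))):
        next_state[(s, action)] = target
        reward[(s, action)] = landing_reward[target]

v_west = evaluate_policy(next_state, reward, {s: "West" for s in states})
v_east = evaluate_policy(next_state, reward, {s: "East" for s in states})

for s in states:
    print(s, round(v_west[s], 3), round(v_east[s], 3))
print("always-west dominates:", dominates(v_west, v_east))
```

A numerical check like this only supports the claim for one concrete MDP. The proof itself still needs the per-state inequality argued from the exercise's own rewards and transitions.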



Answer

