Problem 4. Consider the MDP represented by the graph in Figure 1, with discount factor γ ∈ [0, 1). States are represented by circles. The pair on each arrow gives the action to be taken and the probability of the corresponding transition from one state to another, respectively (please note that the representation of the MDP as a graph here