Need help with this week’s assignment? Get detailed, trusted solutions to the NPTEL Reinforcement Learning Week 4 assignment. Our expert-curated answers help you finish your assignment faster while deepening your conceptual clarity.
✅ Subject: Reinforcement Learning
📅 Week: 4
🎯 Session: NPTEL 2025 July-October
🔗 Course Link: Click Here
🔍 Reliability: Verified and expert-reviewed answers
📌 Trusted By: 5000+ Students
For complete and in-depth solutions to all weekly assignments, check out 👉 NPTEL Reinforcement Learning Week 4 Assignment Answers
🚀 Stay ahead in your NPTEL journey with fresh, updated solutions every week!
NPTEL Reinforcement Learning Week 4 Assignment Answers 2025
1. State True/False
The state transition graph for any MDP is a directed acyclic graph.
- True
- False
Answer : See Answers
2. Consider the following statements:
(i) The optimal policy of an MDP is unique.
(ii) We can determine an optimal policy for an MDP using only the optimal value function (v∗), without accessing the MDP parameters.
(iii) We can determine an optimal policy for a given MDP using only the optimal q-value function (q∗), without accessing the MDP parameters.
Which of these statements are false?
- Only (ii)
- Only (iii)
- Only (i), (ii)
- Only (i), (iii)
- Only (ii), (iii)
Answer :
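Regarding statements (ii) and (iii) above, here is a minimal NumPy sketch of the difference (the MDP below is a toy, randomly generated example; all names and numbers are illustrative assumptions, not from the assignment): a greedy policy can be read directly off q∗, whereas extracting a policy from v∗ requires a one-step lookahead through the transition and reward model.

```python
import numpy as np

# Hypothetical toy MDP (illustrative numbers only).
# Shapes: P[s, a, s'] = transition probabilities, R[s, a] = expected rewards.
n_states, n_actions, gamma = 3, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

# Suppose q_star holds the optimal action-value function (placeholder array here).
q_star = rng.normal(size=(n_states, n_actions))

# From q*: the greedy policy needs no MDP parameters at all.
policy_from_q = q_star.argmax(axis=1)

# From v*: a one-step lookahead is required, which uses P and R (the MDP parameters).
v_star = q_star.max(axis=1)
policy_from_v = (R + gamma * P @ v_star).argmax(axis=1)
```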
3. Which of the following statements are true for a finite MDP? (Select all that apply).
- The Bellman equation of a value function of a finite MDP defines a contraction in a Banach space (using the max norm).
- If 0 ≤ γ < 1, then the eigenvalues of γPπ are less than 1.
- We call a normed vector space 'complete' if Cauchy sequences exist in that vector space.
- The sequence defined by vn = rπ + γPπ vn−1 is a Cauchy sequence in a Banach space (using the max norm). (Pπ is a stochastic matrix.)
Answer :
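Relating to the last statement in the list above, here is a small sketch of the iteration vn = rπ + γPπ vn−1 (the policy-specific matrix and rewards below are made-up assumptions): under the max norm, each successive gap shrinks by at least a factor of γ, which is the contraction property behind the Cauchy-sequence claim.

```python
import numpy as np

# Sketch of the iteration v_n = r_pi + gamma * P_pi @ v_{n-1} on an assumed toy policy-MDP.
gamma = 0.9
P_pi = np.array([[0.5, 0.5, 0.0],   # row-stochastic matrix for a fixed policy pi
                 [0.1, 0.6, 0.3],
                 [0.2, 0.2, 0.6]])
r_pi = np.array([1.0, -0.5, 2.0])

v = np.zeros(3)
prev_gap = None
for n in range(20):
    v_next = r_pi + gamma * P_pi @ v
    gap = np.max(np.abs(v_next - v))            # max-norm distance between successive iterates
    if prev_gap is not None:
        assert gap <= gamma * prev_gap + 1e-12  # contraction: the gap shrinks by at least gamma
    prev_gap, v = gap, v_next

print(v)  # approaches the unique fixed point v_pi = (I - gamma * P_pi)^(-1) r_pi
```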
4. Which of the following is a benefit of using RL algorithms for solving MDPs?
- They do not require the state of the agent for solving an MDP.
- They do not require the action taken by the agent for solving an MDP.
- They do not require the state transition probability matrix for solving an MDP.
- They do not require the reward signal for solving an MDP.
Answer :
5. Consider the following equations:

Which of the above are correct?
- Only (i)
- Only (i), (ii)
- Only (ii), (iii)
- Only (i), (iii)
- (i), (ii), (iii)
Answer :
6. What is true about the γ (discount factor) in reinforcement learning?
- Discount factor can be any real number
- The value of γ cannot affect the optimal policy
- The lower the value of γ, the more myopic the agent gets, i.e. the agent maximises rewards that it receives over a shorter horizon
Answer : See Answers
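A quick illustration (toy reward stream, not part of the question) of why a lower γ makes the agent myopic: the discounted return is dominated by roughly the first 1/(1 − γ) steps.

```python
# Assumed example: a constant reward of 1 at every step for 50 steps.
rewards = [1.0] * 50

for gamma in (0.1, 0.5, 0.99):
    ret = sum(gamma ** t * r for t, r in enumerate(rewards))
    # Effective horizon is roughly 1 / (1 - gamma): short when gamma is small.
    print(f"gamma={gamma:<4}  discounted return = {ret:.2f}  (~1/(1-gamma) = {1/(1-gamma):.1f})")
```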
7. Consider the following statements for a finite MDP (I is an identity matrix with dimensions |S|×|S|, where S is the set of all states, and Pπ is a stochastic matrix):
(i) An MDP with stochastic rewards may not have a deterministic optimal policy.
(ii) There can be multiple optimal stochastic policies.
(iii) If 0 ≤ γ < 1, then the rank of the matrix I − γPπ is equal to |S|.
(iv) If 0 ≤ γ < 1, then the rank of the matrix I − γPπ is less than |S|.
Which of the above statements are true?
- Only (ii), (iii)
- Only (ii), (iv)
- Only (i), (iii)
- Only (i), (ii), (iii)
Answer :
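As a sanity check for statements (iii)/(iv), one can verify numerically (with an arbitrary stochastic matrix; the numbers below are assumptions) that for 0 ≤ γ < 1 the matrix I − γPπ is invertible, i.e. has rank |S|.

```python
import numpy as np

# For 0 <= gamma < 1 the eigenvalues of gamma * P_pi lie strictly inside the unit circle,
# so I - gamma * P_pi is invertible and its rank equals |S|.
gamma = 0.9
P_pi = np.array([[0.2, 0.8, 0.0],   # assumed row-stochastic matrix
                 [0.5, 0.0, 0.5],
                 [0.3, 0.3, 0.4]])
M = np.eye(3) - gamma * P_pi

print(np.linalg.matrix_rank(M))                          # 3, i.e. full rank |S|
print(np.max(np.abs(np.linalg.eigvals(gamma * P_pi))))   # strictly less than 1
```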
8. Consider an MDP with 3 states A, B, C. At each state we can go to either of the other two states, i.e. if we are in state A then we can perform 2 actions, going to state B or C. The rewards for the transitions are r(A,B) = −3 (reward if we go from A to B), r(B,A) = −1, r(B,C) = 8, r(C,B) = 4, r(A,C) = 0, r(C,A) = 5, and the discount factor is 0.9. Find the fixed point of the value function for the policy π(A) = B (if we are in state A we choose the action to go to B), π(B) = C, π(C) = A. vπ([A, B, C]) = ? (round to 1 decimal place)
- [20.6, 21.8, 17.6]
- [30.4, 44.2, 32.4]
- [30.4, 37.2, 32.4]
- [21.6, 21.8, 17.6]
Answer :
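A short sketch of how such a fixed point can be computed: for a deterministic policy, write down Pπ and rπ and solve the linear system v = rπ + γPπ v directly (equivalently v = (I − γPπ)⁻¹ rπ). The sketch below uses the transitions and rewards stated in the question.

```python
import numpy as np

# Policy from the question: A -> B, B -> C, C -> A, with gamma = 0.9.
gamma = 0.9
states = ["A", "B", "C"]
P_pi = np.array([[0, 1, 0],    # pi(A) = go to B
                 [0, 0, 1],    # pi(B) = go to C
                 [1, 0, 0]])   # pi(C) = go to A
r_pi = np.array([-3.0,         # r(A, B)
                  8.0,         # r(B, C)
                  5.0])        # r(C, A)

# Fixed point of v = r_pi + gamma * P_pi @ v.
v_pi = np.linalg.solve(np.eye(3) - gamma * P_pi, r_pi)
print(dict(zip(states, np.round(v_pi, 1))))
```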
9. Which of the following is not a valid norm function? (x is a D-dimensional vector)

Answer :
10. For an operator L, which of the following properties must be satisfied by x for it to be a fixed point for L? (Multi-Correct)
- Lx = x
- L²x = x
- ∀λ > 0, Lx = λx
- None of the above
Answer : See Answers


