Introduction to Machine Learning Week 12 NPTEL Assignment Answers 2025

1. Statement 1: Empirical error is always greater than generalisation error.
Statement 2: Training data and test data have different underlying (true) distributions.
Choose the correct option:

  • Statement 1 is true. Statement 2 is true. Statement 2 is the correct reason for statement 1.
  • Statement 1 is true. Statement 2 is true. Statement 2 is not the correct reason for statement 1.
  • Statement 1 is true. Statement 2 is false.
  • Both statements are false.
Answer :- b
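
For reference on question 1, the two quantities being compared have standard textbook definitions (these are general definitions, not text from the assignment):

```latex
% Empirical (training) error: average loss of hypothesis h on the n training samples
\hat{R}_n(h) = \frac{1}{n} \sum_{i=1}^{n} \ell\big(h(x_i), y_i\big)

% Generalisation (true) error: expected loss under the underlying data distribution D
R(h) = \mathbb{E}_{(x,y) \sim D}\big[\, \ell\big(h(x), y\big) \,\big]
```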

3. Which of the following is/are the shortcomings of TD Learning that Q-learning resolves?

  • TD learning cannot provide values for (state, action) pairs, limiting the ability to extract an optimal policy directly
  • TD learning requires knowledge of the reward and transition functions, which is not always available
  • TD learning is computationally expensive and slow compared to Q-learning
  • TD learning often suffers from high variance in value estimation, leading to unstable learning
  • TD learning cannot handle environments with continuous state and action spaces effectively
Answer :- a, d
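
To see why option (a) matters in question 3, here is a minimal sketch (my own illustrative Python, not from the assignment) contrasting the TD(0) state-value update with the Q-learning update; only the Q-table lets you pick a greedy action without knowing the transition function.

```python
# Rough sketch contrasting the two updates; alpha and gamma are
# illustrative values, not taken from the assignment.
alpha, gamma = 0.1, 0.9

def td0_update(V, s, r, s_next):
    # TD(0) stores only state values V(s). To act greedily later you still
    # need the transition function T(s, a) to know where each action leads.
    V[s] += alpha * (r + gamma * V[s_next] - V[s])

def q_learning_update(Q, s, a, r, s_next, actions):
    # Q-learning stores a value per (state, action) pair, so the greedy
    # policy is simply argmax_a Q(s, a), with no model of T required.
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```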

5. The VC dimension of a pair of squares is:

  • 3
  • 4
  • 5
  • 6
Answer :- a
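
For question 5, recall the standard definition behind the question (a general definition, not assignment text): a hypothesis class H shatters a point set S if it can realise every possible labelling of S, and

```latex
\mathrm{VCdim}(H) = \max\{\, |S| : H \text{ shatters } S \,\}
```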

6. What is V(X4) after one application of the given formula?

  • 1
  • 0.9
  • 0.81
  • 0
Answer :- b

7. What is V(X1) after one application of the given formula?

  • -1
  • -0.9
  • -0.81
  • 0
Answer :- d

8. What is V(X1) after V converges?

  • 0.54
  • -0.9
  • 0.63
  • 0
Answer :- d
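
Questions 6 to 8 refer to a value-update formula and a small set of states (X1 to X4) that are not reproduced in this post, so the numbers above cannot be re-derived here. As a hedged illustration only: updates of this kind are usually a Bellman-style backup, and the option values 0.9 and 0.81 (= 0.9^2) are consistent with a discount factor of gamma = 0.9; both the form of the update and the value of gamma are assumptions in the sketch below.

```python
# Illustrative only: the actual formula, rewards, and transitions for
# X1..X4 are given in the assignment and are not reproduced here.
gamma = 0.9  # assumed discount factor, suggested by the 0.9 / 0.81 options

def bellman_backup(V, rewards, successors, s):
    """One application of a deterministic backup:
    V(s) <- max over next states s' of [ r(s, s') + gamma * V(s') ]."""
    return max(rewards[(s, s_next)] + gamma * V[s_next]
               for s_next in successors[s])
```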

10. In games like Chess or Ludo, the transition function T is known to us. But what about Counter-Strike, Mortal Kombat, or Super Mario? In games where we do not know T, we can only query the game simulator with the current state and an action, and it returns the next state. This means we cannot directly take an argmax or argmin of V(T(S,a)) over actions, so learning the value function V alone is not sufficient to construct a policy. Which of these could we do to overcome this? (more than one may apply)

Assume there exists a method to do each option. You have to judge whether doing it solves the stated problem.

  • Directly learn the policy
  • Learn a different function which stores value for state-action pairs (instead of only state like V does)
  • Learn T along with V
  • Run a random agent repeatedly till it wins. Use this as the winning policy
Answer :- a, b
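
For question 10, both correct options avoid ever needing T explicitly. Below is a minimal sketch of option (b): learn Q(s, a) from simulator queries alone, then read off the policy with an argmax over actions. The simulator interface step(s, a) -> (next_state, reward, done), the state/action lists, and all hyperparameters are assumptions made for this example, not part of the assignment.

```python
import random

def q_learning(step, states, actions, episodes=500, max_steps=200,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = random.choice(states)
        for _ in range(max_steps):
            # epsilon-greedy: explore sometimes, otherwise act greedily on Q
            if random.random() < epsilon:
                a = random.choice(actions)
            else:
                a = max(actions, key=lambda a2: Q[(s, a2)])
            s_next, r, done = step(s, a)          # simulator query only, no T
            best_next = max(Q[(s_next, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
            if done:
                break
    # Policy extraction needs no model of T: argmax over the learned Q-table.
    return {s: max(actions, key=lambda a2: Q[(s, a2)]) for s in states}
```

Option (a), directly learning the policy, likewise never requires T; learning T alongside V (option c) or reusing one lucky random rollout (option d) does not reliably solve the stated problem.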