✅ Subject: Deep Learning – IIT Ropar
📅 Week: 7
🎯 Session: NPTEL 2025 July-October
NPTEL Deep Learning – IIT Ropar Week 7 Assignment Answers 2025
1. A weather prediction team fits two models for rainfall prediction:

They observe that Model A's predictions are consistently far from the actual values, while Model B's predictions vary significantly depending on the training dataset.
Which of the following best describes the characteristics of Model A and Model B?
- Model A has high bias and low variance
- Model A has low bias and high variance
- Model B has high bias and low variance
- Model B has low bias and high variance
Answer :
2. Given E[(y − f̂(x))²] = 12.4, Bias² = 4.0, and Variance = 6.1.
What is the value of the irreducible error (σ²)?
- 2.3
- 1.5
- 3.2
- 1.8
Answer :
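The arithmetic behind question 2 can be sketched in a few lines. This assumes the standard decomposition, expected squared error = Bias² + Variance + σ²; the variable names are illustrative and not part of the assignment.

```python
# Bias-variance decomposition (assumed form):
#   E[(y - f_hat(x))^2] = Bias^2 + Variance + sigma^2
expected_sq_error = 12.4
bias_sq = 4.0
variance = 6.1

# Whatever remains after removing Bias^2 and Variance is the irreducible error.
irreducible_error = expected_sq_error - bias_sq - variance
print(round(irreducible_error, 2))
```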
3. In a facial recognition task, the average output of the model across datasets is far from the actual identity. However, model predictions across different training samples are very consistent. Which of the following are true?
- The model has high variance
- The model has high bias
- The model is underfitting
- The model is overfitting
Answer :
4. In a graph of training error and test error vs. model complexity, the test error first decreases and then starts increasing, as shown in the image below.
[Image: training error and test error vs. model complexity]
Which phenomenon best explains this behavior?
- Increase in irreducible error
- Transition from high bias to high variance
- Gradient vanishing
- Increase in training dataset size
Answer :
5. You run a model on 4 different training sets and get the predictions for an input as:
[40, 60, 50, 50]. The true value is 55.
Which of the following is the best estimate of Bias² and Variance?
- Bias² = 0, Variance = 50
- Bias² = 25, Variance = 50
- Bias² = 25, Variance = 62.5
- Bias² = 0, Variance = 62.5
Answer :
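One way to estimate both quantities in question 5 from the four predictions is sketched below. It assumes Bias² is the squared gap between the mean prediction and the true value, and that the variance is the population form (divide by n); both choices are assumptions, not something the question states.

```python
predictions = [40, 60, 50, 50]  # same input, four different training sets
true_value = 55

n = len(predictions)
mean_pred = sum(predictions) / n

# Squared bias: squared gap between the average prediction and the truth.
bias_sq = (mean_pred - true_value) ** 2

# Variance: average squared deviation from the mean prediction (divide by n).
variance = sum((p - mean_pred) ** 2 for p in predictions) / n

print(bias_sq, variance)
```

Dividing by n − 1 instead of n would give a larger variance estimate that does not match any of the listed options.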
6. A robot collects temperature data using sensors. Each recorded value yᵢ is modeled as yᵢ = f(xᵢ) + ϵᵢ, where ϵᵢ ∼ N(0, σ²). You use the predicted values ŷᵢ = f̂(xᵢ) to estimate the model’s accuracy. The true error is the sum of the empirical test error and a small constant.
Given ŷ₁ = 25.5, y₁ = 27; ŷ₂ = 26.0, y₂ = 26; and ŷ₃ = 24.5, y₃ = 25, what is the empirical estimate of the mean squared error?
- 1.25
- 0.75
- 0.833
- 0.5
Answer :
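The empirical mean squared error in question 6 is the average of the squared residuals over the three readings. A minimal sketch, with list names chosen only for illustration:

```python
y_true = [27.0, 26.0, 25.0]   # recorded values y_i
y_pred = [25.5, 26.0, 24.5]   # predicted values y_hat_i

# Squared residual for each reading, then the plain average.
squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(squared_errors)

print(squared_errors)
print(round(mse, 3))
```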
7. In a medical imaging project, you train a deep CNN to detect lung disease. On the training set of 5000 images, error is 3%. On a validation set of 500 unseen images, error rises to 14%. Why is the training error misleading in this case?
- The test error should always be lower than training error
- The training set includes noise that affects generalization
- The model may have memorized the training examples
- The training error does not reflect the true performance on unseen data
Answer :
8. A real estate prediction model is trained with and without regularization. You observe that:
Without L2 regularization: weights = [20.5, -15.2, 8.3]
With L2 regularization: weights = [8.1, -5.2, 3.0]
What is the primary effect L2 regularization has had on the model?
- It removed irrelevant features
- It shrunk the weights toward zero
- It introduced sparsity
- It increased the learning rate
Answer :
9. During the training of a linear model, the gradient of the loss without regularization is:

What additional term is added to the gradient due to L2 regularization?
- αxi
- α|w|
- αw
- −αw
Answer :
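The extra gradient term contributed by an L2 penalty can be checked numerically. The sketch below assumes the penalty is written as (α/2)·‖w‖², a common convention that the question does not spell out; the weight values and the helper names `l2_penalty` and `numerical_grad` are hypothetical.

```python
def l2_penalty(w, alpha):
    # Assumed penalty form: (alpha / 2) * sum of squared weights.
    return 0.5 * alpha * sum(wi * wi for wi in w)

def numerical_grad(w, alpha, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = []
    for i in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        diff = l2_penalty(w_plus, alpha) - l2_penalty(w_minus, alpha)
        grad.append(diff / (2 * eps))
    return grad

w = [2.0, -1.5, 0.5]   # illustrative weights
alpha = 0.1

analytic = [alpha * wi for wi in w]   # alpha * w, component-wise
numeric = numerical_grad(w, alpha)
print(analytic)
print(numeric)
```

If the penalty were written as α·‖w‖² without the factor of one half, the same check would give 2αw instead.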
10. Two regression models are trained on the same dataset:
Model A: Without regularization, Train Error = 2%, Test Error = 14%
Model B: With L2 regularization, Train Error = 5%, Test Error = 7%
Which of the following are true based on this result?
- Model A is overfitting
- Model B generalizes better
- Model B is underfitting
- L2 regularization improved test performance
Answer :
11. You want to control the complexity of your deep regression model. You’re experimenting with different values of α in L2 regularization.
What happens if you increase α too much?
- Model will overfit training data
- Model complexity will increase
- Model may underfit the data
- Weights will remain unchanged
Answer :
12. Consider a linear model trained using Ridge Regression with α=2.
If the original weight update rule is:
w := w − η·∇J
and ∇J = 3, current weight = 5, learning rate η = 0.1
What will be the new weight value after one update with L2 regularization?
- 4.67
- 4.40
- 3.30
- 3.70
Answer :
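A single regularized update for question 12 can be sketched as follows. It assumes the L2 penalty adds α·w to the gradient (that is, a penalty of the form (α/2)·w²); the function name `ridge_step` is hypothetical.

```python
def ridge_step(w, grad_j, alpha, lr):
    # Total gradient = data-loss gradient + alpha * w (assumed penalty form).
    return w - lr * (grad_j + alpha * w)

new_w = ridge_step(w=5.0, grad_j=3.0, alpha=2.0, lr=0.1)
print(round(new_w, 2))
```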
13. You are training a neural network for disease classification using X-ray images. After each epoch, you log both training and validation loss. The logs show:
[Image: training and validation loss logged per epoch]
Based on this information, which epoch is the best candidate for early stopping?
- Epoch 5
- Epoch 10
- Epoch 15
- Epoch 20
Answer :
14. In a model training scenario, you observe that:
Training error keeps decreasing
Validation error decreases initially but then increases
What does this behavior indicate?
- Model is underfitting
- Model is overfitting after some point
- Validation loss helps detect overfitting
- Training should continue as long as training loss decreases
Answer :
15. You use early stopping with patience = 3. That is, training will stop if validation loss doesn’t improve for 3 consecutive epochs.
The validation loss values across epochs are: [0.48, 0.42, 0.39, 0.39, 0.41, 0.44, 0.46]
At which epoch will training stop?
- 4
- 5
- 6
- 7
Answer :
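The patience rule in question 15 can be traced with a short loop. This sketch assumes epochs are numbered from 1 and that only a strictly lower validation loss counts as an improvement; the function name `early_stopping_epoch` is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-indexed epoch at which training stops, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strict improvement; a tie does not reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

val_losses = [0.48, 0.42, 0.39, 0.39, 0.41, 0.44, 0.46]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
print(stop_epoch)
```

If a tie were treated as an improvement (`<=` instead of `<`), the counter would reset at epoch 4 and training would stop one epoch later.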
16. Two models are trained for handwritten digit classification:
Model A: Trained for 50 epochs without early stopping
Model B: Trained with early stopping and stopped at epoch 20
Test set accuracy of Model A is 91% and Model B is 94%.
Which conclusion is most valid?
- Longer training always improves performance
- Early stopping led to underfitting
- Model B avoided overfitting by stopping early
- Model A had better generalization
Answer :
17. You examine a plot of training and validation errors vs. epochs.
Training error steadily decreases
Validation error decreases till epoch 12, then increases
What can you infer?
- Sweet spot is around epoch 12
- The model is perfectly trained
- Early stopping should trigger at or before epoch 12
- Training error always reflects generalization
Answer :
18. A medical startup is building a deep learning model to detect lung infections from chest X-ray images. The dataset contains only 500 labeled X-rays. To improve model generalization, the team applies image augmentation such as rotations (±15°), horizontal flipping, and slight scaling. What are the benefits of applying such augmentations?
- It increases the number of unique training samples
- It helps reduce overfitting
- It introduces label noise
- It simulates real-world variations
Answer :
19. You train a CNN on a small dataset of handwritten digits (only 300 images). You notice:
Train accuracy: 99%
Validation accuracy: 72%
To improve generalization, you try dataset augmentation by applying random cropping and rotations.
What is the expected effect of dataset augmentation in this case?
- Increase in validation accuracy
- Further increase in training accuracy
- Increase in model size
- Decrease in dataset quality
Answer :
20. A team originally has 1,000 labeled images. They apply:
- Horizontal flip
- Random rotation (once per image)
- Random zoom (once per image)
Each augmentation is applied independently and once per image.
How many effective training samples will they have after augmentation?
- 1,000
- 2,000
- 3,000
- 4,000
Answer :
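The sample count in question 20 is simple bookkeeping. The sketch below assumes the original images are kept alongside the augmented copies, which the question implies but does not state outright.

```python
original_images = 1000
augmentations = ["horizontal_flip", "random_rotation", "random_zoom"]

# Each augmentation is applied once per image, producing one new copy each.
augmented_images = original_images * len(augmentations)

# Assumption: the originals stay in the training set.
total_samples = original_images + augmented_images
print(total_samples)
```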
21. You are asked to compare data augmentation and L2 regularization in terms of their impact on model training. Which of the following are true?
- Both reduce overfitting
- Augmentation increases effective data diversity
- L2 regularization reduces dataset size
- Augmentation improves robustness to input variations
Answer :
22. A team uses random vertical flipping and 90° rotation for digit recognition (0–9) from MNIST dataset. Their validation accuracy drops.
What could be the main reason for the drop?
- Vertical flip and rotation introduce class confusion
- Training data is now too large
- The model has too many parameters
- Regularization strength is too low
Answer :
23. A company uses a convolutional neural network (CNN) for land cover classification in satellite images. Each image is of size 256×256. They compare:
Model A: Fully connected network
Model B: CNN with shared filters across image regions
Why is Model B preferred in this case?
- CNN reduces the number of parameters using parameter sharing
- CNN performs well with limited data due to shared weights
- CNN assigns a unique weight to each pixel
- CNN avoids the need for regularization
Answer :
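To get a feel for parameter sharing in question 23, the first-layer parameter counts of the two approaches can be compared. The layer sizes below (100 hidden units, 32 filters of size 3×3, a single input channel) are hypothetical choices for illustration; the question specifies only the 256×256 image size.

```python
height, width, channels = 256, 256, 1   # single channel is an assumption

# Fully connected layer: one weight per (pixel, hidden unit) pair, plus biases.
hidden_units = 100                       # hypothetical
fc_params = height * width * channels * hidden_units + hidden_units

# Convolutional layer: each filter is reused at every image location.
num_filters, kernel = 32, 3              # hypothetical
conv_params = num_filters * (kernel * kernel * channels) + num_filters

print(fc_params, conv_params)
```

The convolutional count does not depend on the image size at all, because the same filter weights are applied at every position.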
24. You train a model on the MNIST dataset. To improve generalization, you add Gaussian noise to the input images during training.
What are the likely benefits of this noise injection?
- The model memorizes training data better
- The model learns more robust features
- It acts as an implicit regularizer
- It forces the model to rely less on exact input patterns
Answer :
25. A team builds three models to predict rainfall:
- Model A: Linear regression
- Model B: Random forest
- Model C: Neural network
They combine predictions using averaging.
What benefit does this ensemble provide?
- It guarantees perfect accuracy
- It reduces model variance
- It increases overfitting
- It avoids using any labeled data
Answer :
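The effect of averaging in question 25 can be illustrated with a small simulation. This is a toy sketch, not a model of the three learners in the question: it assumes each model's error is independent Gaussian noise with unit variance, which real models only approximate.

```python
import random
import statistics

random.seed(0)
true_value = 10.0
trials = 5000

single, ensemble = [], []
for _ in range(trials):
    # Three hypothetical models with independent, unit-variance errors.
    preds = [true_value + random.gauss(0, 1) for _ in range(3)]
    single.append(preds[0])
    ensemble.append(sum(preds) / 3)

var_single = statistics.pvariance(single)
var_ensemble = statistics.pvariance(ensemble)
print(var_single, var_ensemble)
```

Under these assumptions the averaged prediction has roughly one third of the variance of a single model. Correlated errors would shrink that benefit.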
26. You apply bagging to train multiple models on resampled subsets of your crop dataset and average their predictions.
What are advantages of this approach?
- It increases the diversity of learned models
- It helps reduce both bias and variance
- It prevents underfitting
- It gives better results than a single model in most cases
Answer :
27. You use dropout in your deep learning model. During training, dropout randomly turns off neurons.
Which of the following statements best explains why dropout is related to ensemble methods?
- Dropout creates multiple paths, acting like training an ensemble of subnetworks
- Dropout reduces dataset size
- Dropout duplicates models
- Dropout disables backpropagation
Answer :
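The link between dropout and ensembles in question 27 is easier to see when the random masks are printed. Below is a minimal sketch of inverted dropout on a plain list of activations; the function name `dropout`, the keep probability, and the activation values are all illustrative assumptions.

```python
import random

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob."""
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    # Rescale the surviving units so the expected activation is unchanged.
    dropped = [a * m / keep_prob for a, m in zip(activations, mask)]
    return dropped, mask

rng = random.Random(0)
activations = [0.5, 1.2, -0.7, 2.0, 0.3, -1.1]

masks = []
for step in range(4):
    _, mask = dropout(activations, keep_prob=0.5, rng=rng)
    masks.append(mask)
    print(step, mask)
```

Each mask selects a different subset of units, so every training step effectively updates a different subnetwork that shares weights with all the others.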
28. A fintech company is detecting fraud in credit card transactions. They build three models:
- Model A: Random Forest
- Model B: XGBoost
- Model C: Neural Network
They evaluate on the same dataset:
• Individual accuracy (A: 88%, B: 89%, C: 87%)
• Combined via weighted averaging, overall accuracy = 91%
What advantages does this ensemble approach offer?
- Reduces model variance
- Reduces overfitting due to diverse errors
- Ignores high-bias models
- Increases model diversity and robustness
Answer :
Data for questions 29 and 30:
You are tasked with building a neural net to predict heart attack risk. You run models with varying complexity.
[Image: results for Models A, B, and C]
29. Which model is likely overfitting?
- Model A
- Model B
- Model C
- None
Answer :
30. What is the best course of action to improve Model C’s generalization?
- Use early stopping
- Apply L2 regularization
- Increase number of parameters
- Apply dropout in deeper layers
Answer :


