Deep Learning – IIT Ropar Week 6 NPTEL Assignment Answers 2025

Need help with this week’s assignment? Get detailed and trusted solutions for Deep Learning – IIT Ropar Week 6 NPTEL Assignment Answers. Our expert-curated answers help you solve your assignments faster while deepening your conceptual understanding.

✅ Subject: Deep Learning – IIT Ropar
📅 Week: 6
🎯 Session: NPTEL 2025 July-October
🔗 Course Link: Click Here
🔍 Reliability: Verified and expert-reviewed answers
📌 Trusted By: 5000+ Students

For complete and in-depth solutions to all weekly assignments, check out 👉 NPTEL Deep Learning – IIT Ropar Week 6 Assignment Answers

🚀 Stay ahead in your NPTEL journey with fresh, updated solutions every week!

NPTEL Deep Learning – IIT Ropar Week 6 Assignment Answers 2025

1. You’re working in a healthcare startup building a compression system for chest X-ray images using autoencoders. Your team has proposed the following 4 architectures for evaluation. Each diagram shows the number of neurons in the input layer, hidden layer, and output layer.

Your job is to choose the architecture(s) that will best compress the input while still allowing reasonable reconstruction. Based on the diagrams shown, identify which architecture(s) represent an undercomplete autoencoder.

  • A and C
  • B and D
  • A, C, and D
  • Only A
Answer :
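
For reference, “undercomplete” simply means the hidden layer has fewer neurons than the input and output layers, so the network is forced to compress. A minimal PyTorch sketch (the 784 → 64 sizes here are illustrative assumptions, not taken from the diagrams):

```python
import torch.nn as nn

# Undercomplete autoencoder: hidden width (64) < input width (784),
# so the bottleneck must learn a compressed representation.
class UndercompleteAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, input_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))
```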

Data for questions 2 to 5

A smart factory deploys several sensors across machines to collect temperature, vibration, and humidity readings every second. A deep learning engineer proposes to use an autoencoder to compress and reconstruct this data to detect anomalies based on reconstruction error. The input vector 𝑥 from sensors is 100-dimensional. The engineer considers two different autoencoder architectures:

Model A: Hidden layer dimension = 20

Model B: Hidden layer dimension = 120

Each model is trained to minimize the reconstruction error between 𝑥 and its reconstruction x̂.
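
In code, the only difference between the two proposals is the bottleneck width. A minimal PyTorch sketch with layer sizes taken from the scenario (the single hidden layer and the ReLU are assumptions):

```python
import torch.nn as nn

INPUT_DIM = 100  # dimensionality of the sensor vector x in the scenario

def make_autoencoder(hidden_dim):
    # Encoder maps x to h, decoder maps h back to x_hat; training
    # minimizes the reconstruction error between x and x_hat.
    return nn.Sequential(
        nn.Linear(INPUT_DIM, hidden_dim), nn.ReLU(),  # encoder
        nn.Linear(hidden_dim, INPUT_DIM),             # decoder
    )

model_a = make_autoencoder(20)   # 20 < 100: undercomplete
model_b = make_autoencoder(120)  # 120 > 100: overcomplete
```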

2. Which of the following best describes the role of the hidden representation in this autoencoder setup?

  • It stores an exact copy of the input data.
  • It captures a lower or higher-dimensional encoding useful for reconstruction.
  • It applies label information to compress data.
  • It directly computes reconstruction loss.
Answer :

3. What can be said about Model A’s architecture in this scenario?

  • It is an overcomplete autoencoder.
  • It is an undercomplete autoencoder.
  • It risks learning an identity function.
  • It compresses the input data to extract meaningful features.
Answer :

4. What is the major risk of using Model B in this situation?

  • It might fail to train due to insufficient parameters.
  • It enforces sparsity and loses data fidelity.
  • It performs better due to higher capacity.
  • It could trivially copy input to output without learning meaningful representations.
Answer :

5. Suppose the autoencoder trained on normal sensor data starts showing high reconstruction error on new input data. What does this indicate?

  • The autoencoder has overfitted.
  • The sensors are working perfectly.
  • The new input is likely anomalous or different from training distribution.
  • The loss function needs to be changed.
Answer :
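
The detection logic behind question 5 fits in a few lines. A hedged sketch, where `model` and `threshold` are assumed to come from training on normal sensor data:

```python
import torch

def is_anomalous(model, x, threshold):
    # Reconstruct x and compare; a high error suggests the input lies
    # outside the distribution the autoencoder was trained on.
    with torch.no_grad():
        x_hat = model(x)
    error = torch.mean((x - x_hat) ** 2).item()
    return error > threshold
```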

Data for questions 6 to 8

An engineer is designing an autoencoder to compress and reconstruct binary event logs generated by thousands of IoT sensors. Each log vector 𝑥 is a binary sequence of 64 values, indicating presence (1) or absence (0) of events in each time slot. The goal is to reconstruct the input accurately and use reconstruction error to detect communication failures or faulty sensors.
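
For 0/1 inputs like these, the conventional pairing is a sigmoid on the decoder output, so each reconstructed value lies in (0, 1) and can be read as an event probability, together with a Bernoulli-style loss. A minimal PyTorch sketch; the bottleneck width of 16 is an assumption:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 16), nn.ReLU(),     # encoder
    nn.Linear(16, 64), nn.Sigmoid(),  # decoder: outputs in (0, 1)
)
loss_fn = nn.BCELoss()  # binary cross-entropy against the 0/1 targets
```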

6. Which decoder activation function (𝑓) is the most suitable for the above input setting?

  • tanh
  • linear
  • sigmoid
  • ReLU
Answer :

7. Which loss function should the engineer use for training the above autoencoder?

Answer :

8. Why would using squared error loss with sigmoid decoder and binary input be suboptimal?

  • It cannot be used with sigmoid.
  • It gives poor gradient signals for binary classification tasks.
  • It assumes Gaussian distribution, which mismatches binary input assumptions.
  • It leads to unstable training with sigmoid output.
Answer :

Data for questions 9 to 11

A data science team is building an autoencoder to model daily energy consumption patterns for households in a smart city. Each input vector 𝑥 is a 24-dimensional real-valued vector, where each element represents electricity usage (in kWh) during one hour of the day.

The objective is to reconstruct the real-valued signal accurately and detect unusual usage patterns via reconstruction error.
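
For unbounded real-valued targets such as kWh readings, the conventional pairing is a linear decoder output with squared-error loss. A minimal PyTorch sketch; the bottleneck width of 8 is an assumption:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(24, 8), nn.ReLU(),  # encoder: one reading per hour in, 8-d code out
    nn.Linear(8, 24),             # linear decoder: outputs any real value
)
loss_fn = nn.MSELoss()  # squared error suits real-valued reconstruction
```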

9. Which decoder activation function is most suitable for the given input scenario?

  • tanh
  • linear
  • sigmoid
  • ReLU
Answer :

10. Suppose the inputs are normalized but still real-valued. Which loss function best suits the model’s objective?

Answer :

11. Why is binary cross-entropy loss generally not appropriate when training an autoencoder on real-valued input data?

  • It requires outputs to be strictly positive.
  • It assumes binary targets in [0,1], which doesn’t match real-valued ground truth.
  • It leads to more accurate reconstructions for continuous data.
  • It is equivalent to squared loss when using sigmoid outputs.
Answer :

Data for questions 12 to 14

A geospatial analytics company handles large volumes of high-resolution satellite imagery data. Each image is represented as a 2048-dimensional real-valued vector after preprocessing. The team wants to reduce dimensionality for downstream clustering and anomaly detection tasks. Two researchers propose different methods:

Researcher A: Use PCA with 100 components.

Researcher B: Train a linear autoencoder with hidden dimension 100, using mean-centered inputs and squared error loss.
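
A useful background fact here: with mean-centered inputs and squared-error loss, the optimal linear autoencoder’s hidden layer spans the same subspace as the top principal components, though not necessarily with the same orthonormal axes. A toy NumPy sketch of the PCA side, with small dimensions chosen for readability:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
X -= X.mean(axis=0)  # mean-center, as in the scenario

# PCA via SVD: rows of Vt are principal directions, sorted by variance.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
pca_codes = X @ Vt[:3].T  # project onto the top-3 subspace

# A trained linear autoencoder with hidden dim 3 recovers the same
# 3-d subspace, but only up to an invertible linear transform of it.
```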

12. Under the above conditions, what can be said about the representations learned by both methods?

  • PCA and the linear autoencoder will learn identical subspaces
  • Autoencoder will learn more nonlinear features
  • PCA will fail due to lack of labels
  • Autoencoder needs more computation due to gradient descent
Answer :

13. If the dataset is found to lie on a nonlinear manifold, which approach would likely perform better for compression and why?

  • PCA, because it captures all the variance
  • Linear autoencoder, because it approximates PCA
  • Nonlinear autoencoder, because it can learn nonlinear mappings
  • None of the above
Answer :

14. The team now wants to visualize the compressed 2D representation of the data. Researcher A uses PCA with 2 components, while Researcher B trains a nonlinear autoencoder with a 2-dimensional hidden layer. Which of the following is true?

  • PCA will provide better results since it ensures maximum variance in 2D
  • The autoencoder may learn better separability due to nonlinearity
  • Both methods are equivalent for visualization
  • PCA will outperform because autoencoders need labels
Answer :

Data for questions 15 and 16

A financial analytics company is using an autoencoder to compress user transaction histories represented as 512-dimensional real-valued vectors. The goal is to detect fraud based on abnormal reconstructions. To increase model capacity, they build a deep autoencoder with hidden layers wider than the input (e.g., 512 → 1024 → 512 → 1024 → 512).

After training, they notice that the model simply copies input to output, resulting in low reconstruction loss even for fraudulent patterns — defeating the purpose.

15. Why is this trivial copying behavior a problem in overcomplete autoencoders?

  • It leads to large training loss
  • It results in representations that don’t generalize
  • It prevents the decoder from learning the sigmoid function
  • It increases model interpretability
Answer :

16. Which of the following regularization strategies can be used to prevent trivial identity mappings in such overcomplete architectures?

  • Adding noise to inputs
  • Penalizing activations to enforce sparsity
  • Adding dropout to the output layer only
  • Penalizing the Jacobian of hidden activations
Answer :
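
As one concrete instance of the strategies listed above, a sparsity penalty can be added directly to the training objective. A minimal PyTorch sketch; the L1 form and the weight `lam` are assumptions, and KL-divergence penalties are another common choice:

```python
import torch

def sparse_loss(x, x_hat, h, lam=1e-3):
    # Reconstruction term plus an L1 penalty on hidden activations h.
    # Keeping most of h near zero blocks the trivial identity mapping.
    recon = torch.mean((x - x_hat) ** 2)
    return recon + lam * torch.mean(torch.abs(h))
```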

17. A smart city infrastructure team collects air quality readings from thousands of distributed IoT sensors. However, due to packet loss and electromagnetic interference, some sensor readings are partially missing or corrupted. To build a robust model that can learn meaningful patterns despite this corruption, a machine learning engineer trains an autoencoder by randomly corrupting some of the input dimensions and then teaching the model to reconstruct the original uncorrupted input.

What kind of autoencoder is being used in this scenario?

  • Sparse Autoencoder
  • Denoising Autoencoder
  • Contractive Autoencoder
  • Overcomplete Autoencoder
Answer :
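
The training step described in question 17 (corrupt the input, score against the clean target) looks roughly like this. A sketch assuming PyTorch, some autoencoder `model`, and masking noise with an assumed drop probability of 0.3:

```python
import torch

def denoising_step(model, x, drop_prob=0.3):
    # Randomly zero out input dimensions, but compute the loss
    # against the original, uncorrupted x.
    mask = (torch.rand_like(x) > drop_prob).float()
    x_hat = model(x * mask)
    return torch.mean((x - x_hat) ** 2)
```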

Data for questions 18 and 19

A health-tech startup is building an AI-based image enhancement tool for remote clinics. The clinics upload grayscale X-ray images captured using portable scanners. However, due to hardware limitations and network noise, the received images often contain visual distortions and random noise. To address this, the ML team is exploring autoencoder-based solutions for image denoising.

The input images are of size 128 × 128 pixels, grayscale. Each image is flattened into a vector before being fed into the model.

Two different autoencoder architectures are proposed:

  1. A vanilla autoencoder trained on clean images.
  2. A denoising autoencoder (DAE) trained by adding Gaussian noise to the input images and then reconstructing the clean versions.
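
Unlike the masking noise sketched earlier, the DAE in this scenario corrupts images with additive Gaussian noise. A one-function sketch, with the noise scale `sigma` an assumption:

```python
import torch

def corrupt(x, sigma=0.1):
    # Additive Gaussian noise; the training target stays the clean x.
    return x + sigma * torch.randn_like(x)
```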

18. What is the dimension of the input vector xᵢ given to the autoencoder?

Fill in the blank: ________

Answer :

19. Which of the following statements correctly compares the regular and denoising autoencoder approaches for this medical image task?

  • Vanilla autoencoder learns to reconstruct clean inputs, but may fail when noisy inputs are given at test time
  • Denoising autoencoder explicitly learns to remove noise and is better suited for this application
  • Both autoencoders will perform equally well if trained on clean data
  • Denoising autoencoders require label supervision, while vanilla autoencoders do not
Answer :

20. An engineer builds an autoencoder for real-valued inputs but uses a sigmoid decoder activation and binary cross-entropy loss. What issue might arise from this setup?

  • There will be a mismatch between output range and data distribution
  • The model will have high reconstruction accuracy
  • Binary cross-entropy is optimal for real-valued targets
  • There is no issue — this is standard practice
Answer :

21. Which of the following loss functions encourages sparsity in hidden unit activations?

Answer :

22. What is the main idea behind a Sparse Autoencoder?

  • It forces the hidden representation to be close to 1
  • It minimizes the Jacobian of the hidden layer
  • It restricts most hidden units to be inactive (close to 0)
  • It enforces that the encoder is always overcomplete
Answer :

23. Which activation pattern is most likely from a well-trained sparse autoencoder?

  • All hidden units have average activation ≈ 0.5
  • Most hidden units are frequently active across inputs
  • Hidden units always output 1 or 0
  • Only a small subset of hidden units fire for each input
Answer :
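
For context on questions 21 to 23: one common sparsity penalty compares each hidden unit’s average activation against a small target ρ using a KL divergence, which is minimized when most units stay inactive. A minimal PyTorch sketch, assuming sigmoid hidden activations and an assumed target ρ = 0.05:

```python
import torch

def kl_sparsity(h, rho=0.05):
    # Average activation of each hidden unit over the batch; sigmoid
    # activations keep rho_hat inside (0, 1).
    rho_hat = torch.mean(h, dim=0)
    # Sum of KL(rho || rho_hat) over hidden units: small only when
    # most units have average activation near the target rho.
    return torch.sum(rho * torch.log(rho / rho_hat)
                     + (1 - rho) * torch.log((1 - rho) / (1 - rho_hat)))
```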

24. What is the key regularization objective of a Contractive Autoencoder?

  • Penalize the sensitivity of the encoder to small input changes
  • Encourage outputs close to 1
  • Match the hidden activations to a target probability
  • Apply noise to the input during training
Answer :
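
The penalty in question is the squared Frobenius norm of the Jacobian of the hidden activations with respect to the input. A sketch for a single input vector, assuming PyTorch, some callable `encoder`, and an assumed weight `lam`:

```python
import torch

def contractive_penalty(encoder, x, lam=1e-4):
    # Frobenius norm of dh/dx for one input vector x; penalizing it
    # makes the code insensitive to small input perturbations.
    J = torch.autograd.functional.jacobian(encoder, x)
    return lam * torch.sum(J ** 2)
```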

25. Match the Autoencoder variant with the application it is best suited for:

  • A–2, B–1, C–3, D–4
  • A–3, B–2, C–1, D–4
  • A–1, B–4, C–2, D–3
  • A–4, B–1, C–2, D–3
Answer :

26. In a Contractive Autoencoder, the added penalty on the Jacobian of hidden activations with respect to the input encourages which of the following behaviors?

  • Encoder output becomes highly sensitive to input changes
  • Decoder learns to reconstruct noisy images
  • Input features are transformed into binary values
  • Hidden representation becomes invariant to small input perturbations
Answer :

27. Match the following autoencoder setups to their respective types:

  • A–1, B–2, C–3, D–4
  • A–3, B–1, C–4, D–2
  • A–4, B–3, C–2, D–1
  • A–1, B–4, C–3, D–2
Answer :