NPTEL Introduction to Machine Learning Week 10 Assignment Answers 2024
1. The K-means algorithm is not a particularly sophisticated approach for image segmentation tasks. Choose the best explanation from the options below that supports this claim:
- It takes no account of the spatial proximity of different pixels.
- The curse of dimensionality does not affect the performance of K-means algorithm, as it effectively handles high-dimensional data with minimal loss of accuracy.
- The algorithm requires the number of clusters (K) to be specified beforehand.
- Initialization does not affect K-means.
Answer :- a, c
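To see why option (a) holds, here is a minimal sketch of K-means applied to pixel intensities only. The tiny 8-pixel "image", the starting centers, and the helper names assign and kmeans_1d are all made up for illustration. Shuffling the pixel positions leaves every pixel's cluster unchanged, because the algorithm only ever looks at intensity values. The two starting centers also show option (c): K has to be chosen before the run.

```python
def assign(values, centers):
    # Each value goes to its nearest center; pixel position plays no role.
    return [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            for v in values]


def kmeans_1d(values, centers, iters=10):
    """Plain Lloyd iterations on scalar intensities (illustrative helper)."""
    centers = list(centers)  # K is fixed by the number of starting centers
    for _ in range(iters):
        labels = assign(values, centers)
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assign(values, centers), centers


# Hypothetical flattened grayscale image: dark and bright pixels interleaved.
image = [10, 12, 200, 11, 205, 198, 9, 202]

# Move the pixels to different positions with a fixed permutation.
perm = [4, 0, 7, 2, 6, 1, 5, 3]
shuffled = [image[i] for i in perm]

labels, _ = kmeans_1d(image, [0.0, 255.0])
shuffled_labels, _ = kmeans_1d(shuffled, [0.0, 255.0])

# Each pixel keeps its cluster regardless of where it sits in the image.
print(labels)
print(shuffled_labels == [labels[i] for i in perm])
```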
2. The pairwise distances between 6 points are given below. Which of the options shows the hierarchy of clusters created by the single-link clustering algorithm?





Answer :- b
3. For the pairwise distance matrix given in the previous question, which of the following shows the hierarchy of clusters created by the complete-link clustering algorithm?




Answer :- b
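The distance matrix for questions 2 and 3 is not reproduced here, so the sketch below uses a made-up 4-point matrix to show how single link and complete link can build different hierarchies from the same distances. The function name agglomerate and the matrix values are assumptions for illustration only. Single link scores a pair of clusters by their closest points (min), complete link by their farthest points (max).

```python
def agglomerate(dist, linkage):
    """Merge clusters until one remains and return the merge history.

    dist: symmetric matrix (list of lists) of pairwise point distances.
    linkage: min for single link, max for complete link.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster distance = min or max over all cross-cluster pairs.
                d = linkage(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical distances between 4 points (not the matrix from the question).
dist = [
    [0, 10, 21, 33],
    [10, 0, 11, 23],
    [21, 11, 0, 12],
    [33, 23, 12, 0],
]

single = agglomerate(dist, min)
complete = agglomerate(dist, max)

for name, history in (("single", single), ("complete", complete)):
    print(name)
    for left, right, d in history:
        print("  merge", left, right, "at distance", d)
```

On this toy matrix, single link chains the points together one at a time, while complete link pairs points 2 and 3 before joining the two pairs.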
5. Statement 1: CURE is robust to outliers.
Statement 2: Because of multiplicative shrinkage, the effect of outliers is dampened.
- Statement 1 is true. Statement 2 is true. Statement 2 is the correct reason for Statement 1.
- Statement 1 is true. Statement 2 is true. Statement 2 is not the correct reason for Statement 1.
- Statement 1 is true. Statement 2 is false.
- Both statements are false.
Answer :- a
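The shrinkage in Statement 2 can be sketched as follows. This is only the shrinking step, not a full CURE implementation: the cluster, the chosen representative points, and the names centroid, shrink, and alpha are assumptions for illustration. Each representative point is moved a fraction alpha of the way toward the cluster centroid, so a point far from the centroid (a likely outlier) moves further in absolute terms than a point close to it.

```python
def centroid(points):
    # Coordinate-wise mean of the cluster's points.
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def shrink(reps, center, alpha):
    # Move each representative a fraction alpha of the way to the centroid.
    return [tuple(x + alpha * (c - x) for x, c in zip(p, center)) for p in reps]


# Hypothetical cluster with one far-away point at (11, 0).
cluster = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, -1.0), (11.0, 0.0)]
center = centroid(cluster)

# Two representatives picked by hand: a typical point and the outlier.
reps = [(0.0, 0.0), (11.0, 0.0)]
shrunk = shrink(reps, center, alpha=0.5)

for before, after in zip(reps, shrunk):
    moved = abs(after[0] - before[0])
    print(before, "->", after, "moved", moved)
```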
6. Which of the following statements about the Rand Index is true?
- It is insensitive to the permutations of cluster labels
- It is biased towards larger clusters
- It cannot handle overlapping clusters
- It is unaffected by outliers in the data
Answer :- a
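A small from-scratch sketch shows why option (a) holds. The Rand index counts the pairs of points on which two labelings agree (both put the pair together, or both keep it apart), so renaming the cluster labels cannot change it. The function rand_index and the toy labelings are written here for illustration.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which the two labelings agree."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        agree += same_true == same_pred
        total += 1
    return agree / total


truth = [0, 0, 1, 1, 2, 2]

# Same partition as truth, but with the label names swapped around.
renamed = [1, 1, 0, 0, 2, 2]

# An imperfect clustering, and the same clustering with permuted label names.
pred = [0, 0, 0, 1, 2, 2]
mapping = {0: 2, 1: 0, 2: 1}
pred_permuted = [mapping[label] for label in pred]

print(rand_index(truth, renamed))
print(rand_index(truth, pred), rand_index(truth, pred_permuted))
```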
8. Run BIRCH on the input features of the Iris dataset using Birch(n_clusters=5, threshold=2). What is the Rand index obtained?
- 0.68
- 0.71
- 0.88
- 0.98
Answer :- b
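One way to reproduce this experiment is sketched below, assuming scikit-learn is installed and using its bundled copy of Iris with rand_score as the metric (the question does not say which function to use, so that choice is an assumption). With a threshold this large, scikit-learn may find fewer than five subclusters and emit a warning. The exact value can also vary with the library version, so compare the printed number against the options yourself.

```python
from sklearn.cluster import Birch
from sklearn.datasets import load_iris
from sklearn.metrics import rand_score

# Input features X and the true species labels y.
X, y = load_iris(return_X_y=True)

# Parameters taken from the question.
model = Birch(n_clusters=5, threshold=2)
y_pred = model.fit_predict(X)

# Rand index between the true labels and the BIRCH clusters.
ri = rand_score(y, y_pred)
print(round(ri, 2))
```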
9. Run PCA on the Iris dataset input features with n_components=2. Now run DBSCAN using DBSCAN(eps=0.5, min_samples=5) on both the original features and the PCA features. What are the respective numbers of outliers/noisy points detected by DBSCAN?
As an extra, you can plot the PCA features on a 2D plot using matplotlib.pyplot.scatter with parameter c=y_pred (where y_pred is the cluster prediction) to visualise the clusters and outliers.
- 10, 10
- 17, 7
- 21, 11
- 5, 10
Answer :- b
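The experiment can be sketched like this, assuming scikit-learn is installed. The helper count_noise is a name introduced here. DBSCAN marks noisy points with the label -1, so counting those labels gives the number of outliers for each feature set. Run it and compare the two printed counts against the options.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

# Project the four input features onto the first two principal components.
X_pca = PCA(n_components=2).fit_transform(X)


def count_noise(features):
    # DBSCAN assigns the label -1 to points it treats as noise.
    labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(features)
    return int((labels == -1).sum())


noise_original = count_noise(X)
noise_pca = count_noise(X_pca)
print(noise_original, noise_pca)
```

Because PCA here is an orthogonal projection, pairwise distances can only shrink, so the PCA features should never produce more noise points than the original features at the same eps.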