Data Analytics with Python Week 10 NPTEL Assignment Answers 2025

NPTEL Data Analytics with Python Week 10 Assignment Answers 2024

1. Sampling distribution for a goodness of fit test is the

Options:
a. Poisson distribution
b. t distribution
c. normal distribution
d. chi-square distribution

✅ Answer: d
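Explanation: The goodness-of-fit statistic, Σ (O − E)² / E, compares observed and expected frequencies across categories. Under the null hypothesis it approximately follows a chi-square distribution, which is why that distribution is used for categorical data in a goodness-of-fit test.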
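
As a rough illustration of the statistic described above, the sketch below computes Σ (O − E)² / E in plain Python. The die-roll counts and the function name `chi_square_statistic` are hypothetical, invented only for this example.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 120 rolls of a die; a fair die expects 20 per face.
observed = [18, 22, 16, 25, 19, 20]
expected = [20] * 6

stat = chi_square_statistic(observed, expected)
df = len(observed) - 1  # k - 1 degrees of freedom, no parameters estimated

print(f"chi-square statistic = {stat:.2f}, degrees of freedom = {df}")
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom (for 5 degrees of freedom at the 0.05 level, the table value is about 11.07).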


2. Goodness of fit test is always conducted as a

Options:
a. lower-tail test
b. upper-tail test
c. middle test
d. None of these alternatives is correct

✅ Answer: b
Explanation: The test statistic is a sum of squared deviations, so it is never negative and grows as the observed counts move away from the expected counts. Only large values are evidence against the null hypothesis, so the rejection region lies in the upper tail.
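
A minimal sketch of the upper-tail decision follows. It is restricted to 2 degrees of freedom (three categories), an assumption made only because the chi-square upper-tail probability then has the closed form exp(−x/2) and needs no statistics library. The counts and the 0.05 significance level are hypothetical.

```python
import math

def upper_tail_p_value_df2(stat):
    """P(X >= stat) for a chi-square variable with exactly 2 degrees of freedom."""
    if stat < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-stat / 2)

# Hypothetical counts in three categories (both lists sum to 100).
observed = [50, 30, 20]
expected = [40, 40, 20]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = upper_tail_p_value_df2(stat)

alpha = 0.05  # assumed significance level
reject = p_value < alpha  # reject only when the statistic falls far in the upper tail

print(f"statistic = {stat:.2f}, upper-tail p-value = {p_value:.4f}, reject H0: {reject}")
```

For other degrees of freedom the same upper-tail logic applies, but the tail probability would normally come from a chi-square table or a statistics library.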


3. State True or False: The null hypothesis for the chi-square test of independence assumes that all the proportions are equal.

Options:
a. True
b. False

✅ Answer: a
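Explanation: The null hypothesis of the chi-square test of independence states that the two variables are not associated. Under independence, the proportions of one variable are the same at every level of the other, which is the sense in which the proportions are equal across categories.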
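
To make the "equal proportions under independence" idea concrete, here is a small sketch that builds the expected counts for a contingency table as (row total × column total) / grand total and then computes the chi-square statistic. The 2×2 table is hypothetical.

```python
# Hypothetical 2x2 contingency table: rows are groups, columns are responses.
table = [
    [20, 30],
    [30, 20],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Under independence every row has the same column proportions,
# so the expected count is row_total * col_total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

stat = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(table))
    for j in range(len(table[0]))
)
df = (len(table) - 1) * (len(table[0]) - 1)

print("expected counts:", expected)
print(f"chi-square statistic = {stat:.2f}, degrees of freedom = {df}")
```

In this example both rows are expected to split 50/50 between the columns, while the observed rows split 40/60 and 60/40; the statistic measures that gap.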


4. Statistical test conducted to determine whether to reject or not reject a hypothesized probability distribution for a population is known as a

Options:
a. contingency test
b. probability test
c. goodness of fit test
d. None of these alternatives is correct

✅ Answer: c
Explanation: A goodness-of-fit test assesses whether the observed data are consistent with a specified (hypothesized) distribution.


5. What is the minimum number of variables/features required to perform clustering?

Options:
a. 0
b. 1
c. 2
d. 3

✅ Answer: b
Explanation: Even with one feature, clustering can be performed, though clustering is more useful with multiple features.
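
As a simple illustration that one feature is enough, the sketch below splits a single numeric variable into two groups at the largest gap between sorted values. This is a deliberately simple stand-in for a clustering algorithm, not a method taken from the course, and the `ages` data and function name are hypothetical.

```python
def split_at_largest_gap(values):
    """Split 1-D data into two clusters at the widest gap between sorted values."""
    if len(values) < 2:
        raise ValueError("need at least two values to form two clusters")
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1  # first index of the upper cluster
    return ordered[:cut], ordered[cut:]

# Hypothetical single feature: customer ages.
ages = [21, 23, 22, 45, 47, 44]
younger, older = split_at_largest_gap(ages)

print("cluster 1:", younger)
print("cluster 2:", older)
```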


6. Which of the following methods is used for finding the optimal number of clusters in the K-Means algorithm?

Options:
a. Elbow method
b. Manhattan method
c. Euclidean method
d. None of these

✅ Answer: a
Explanation: The elbow method plots the within-cluster sum of squares (WCSS) against the number of clusters k and picks the “elbow”, the point beyond which adding more clusters gives only a small further reduction in WCSS.
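
The following sketch shows the idea on toy data: it runs a tiny one-dimensional k-means for several values of k and prints the within-cluster sum of squares (WCSS) for each. The data, the function `kmeans_1d`, and the deterministic initialisation are assumptions made for a reproducible illustration; in practice a library implementation and a plot would normally be used.

```python
def kmeans_1d(points, k, iterations=20):
    """Tiny Lloyd-style k-means for 1-D data; returns (centroids, wcss)."""
    ordered = sorted(points)
    # Deterministic initialisation (an assumption for reproducibility):
    # take k evenly spaced values from the sorted data.
    if k == 1:
        centroids = [ordered[0]]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in ordered:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]

    wcss = sum(min((p - c) ** 2 for c in centroids) for p in ordered)
    return centroids, wcss

# Hypothetical data with three visible groups.
data = [1, 2, 3, 11, 12, 13, 21, 22, 23]

wcss_by_k = {k: kmeans_1d(data, k)[1] for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(f"k = {k}: WCSS = {wcss:.2f}")
```

On this data WCSS falls sharply up to k = 3 and only slightly afterwards, so the elbow suggests three clusters.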


7. Movie Recommendation systems are an example of

Options:
a. 2 Only
b. 1 and 2
c. 1 and 3
d. 1, 2, 3 and 4

✅ Answer: d
Explanation: Recommendation systems use Classification (user preference), Clustering (user groups), Reinforcement Learning (learning over time), and Regression (rating prediction).


8. How can Clustering (Unsupervised Learning) be used to improve the accuracy of the Linear Regression model (Supervised Learning):

Options:
a. 1 Only
b. 1 and 2
c. 1 and 4
d. 1, 2, 3 and 4

✅ Answer: d
Explanation: All four listed strategies can improve a regression model: fitting separate models per cluster, and adding the cluster ID, the cluster centroid, or the cluster size as an input feature.


9. Let x₁ = (1, 2) and x₂ = (3, 5) be the coordinates of two objects. The Euclidean and Manhattan distances between these two objects are __, respectively

Options:
a. 4.2 and 3
b. 3.15 and 2
c. 3.61 and 5
d. None of these

✅ Answer: c
Explanation:

  • Euclidean = √[(3 − 1)² + (5 − 2)²] = √(4 + 9) = √13 ≈ 3.61
  • Manhattan = |3 − 1| + |5 − 2| = 2 + 3 = 5
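
The two calculations above can be checked with a short script. The helper names `euclidean` and `manhattan` are chosen here for illustration only.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

x1 = (1, 2)
x2 = (3, 5)

d_euclidean = euclidean(x1, x2)
d_manhattan = manhattan(x1, x2)

print(f"Euclidean = {d_euclidean:.2f}")  # sqrt(13), about 3.61
print(f"Manhattan = {d_manhattan}")      # 2 + 3 = 5
```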

10. State true or false: Discriminant Analysis does not require the grouping variable to be known at the beginning

Options:
a. True
b. False

✅ Answer: b
Explanation: Discriminant Analysis is supervised and requires prior knowledge of the grouping (class) variable to build a classification model.