Data Analytics with Python Week 8 NPTEL Assignment Answers 2025

NPTEL Data Analytics with Python Week 8 Assignment Answers 2024

1. For categorical data with ‘n’ categories, the number of dummy variables will be________
a. n
b. n-1
c. n+1
d. 2n
Answer: b
Explanation: In dummy variable encoding, one category is dropped and serves as the reference level. This avoids perfect multicollinearity with the intercept (the dummy variable trap), so ‘n’ categories need (n−1) dummy variables.
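
As a rough illustration of the (n−1) rule, here is a minimal pure-Python sketch. The helper name dummy_encode and the sample colour list are made up for this example; in practice pandas.get_dummies with drop_first=True does the same job.

```python
def dummy_encode(values, drop_first=True):
    """Encode category labels as 0/1 columns, optionally dropping the first level."""
    levels = sorted(set(values))                 # the n distinct categories
    kept = levels[1:] if drop_first else levels  # dropped level becomes the reference
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows


# Hypothetical data: 3 categories, so 2 dummy columns are expected.
colors = ["red", "green", "blue", "green"]
columns, encoded = dummy_encode(colors)
print(columns)   # ['green', 'red']
print(encoded)   # [[0, 1], [1, 0], [0, 0], [1, 0]]
```

The row of all zeros stands for the reference category ("blue" here), which is why no separate column is needed for it.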


2. In estimation of regression parameters
a. The likelihood function is a function of only 𝜎
b. The values of 𝛽₀ .. 𝛽ₙ and 𝜎 should be such that they maximize the likelihood function.
c. Both (a) and (b)
d. None of the above
Answer: b
Explanation: In regression, parameters are estimated using maximum likelihood estimation (MLE) by maximizing the likelihood function with respect to both β’s and σ.
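
The idea can be checked numerically for simple linear regression with normal errors. The sketch below assumes one predictor and uses made-up data; the function names log_likelihood and mle_fit are illustrative only. It uses the standard closed-form estimates (the least squares slope and intercept, and σ² = RSS/n) and confirms that moving any parameter away from them lowers the log-likelihood.

```python
import math


def log_likelihood(x, y, b0, b1, sigma):
    """Normal log-likelihood of y = b0 + b1*x + error, error ~ N(0, sigma^2)."""
    n = len(x)
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return -n / 2 * math.log(2 * math.pi * sigma ** 2) - rss / (2 * sigma ** 2)


def mle_fit(x, y):
    """Closed-form maximum likelihood estimates for simple linear regression."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / n)   # the MLE divides by n, not n - 2
    return b0, b1, sigma


# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, sigma = mle_fit(x, y)
best = log_likelihood(x, y, b0, b1, sigma)

# Any perturbation of the estimates gives a smaller log-likelihood.
print(best > log_likelihood(x, y, b0 + 0.1, b1, sigma))
print(best > log_likelihood(x, y, b0, b1, sigma * 1.5))
```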


3. In logistic regression, the null hypothesis tested is:
a. H₀: β = 0
b. H₀: β ≠ 0
c. H₀: μ = 0
d. H₀: μ ≠ 0
Answer: a
Explanation: For each coefficient, the null hypothesis is β = 0, meaning the predictor has no effect on the log-odds of the outcome. Rejecting it indicates that the coefficient is significantly different from zero.


4. In logistic regression,
a. The graph doesn’t follow S shape curve
b. The dependent variable is categorical
c. The estimated value of dependent variable is not probability
d. None of the above
Answer: b
Explanation: Logistic regression is used when the dependent variable is categorical, often binary (0/1). The model output is interpreted as a probability.


5. State true or false: G statistic is used to check the individual significance of the independent variables
a. True
b. False
Answer: a
Explanation: The G statistic is a likelihood ratio statistic. It compares the log-likelihood of the model with and without a predictor, so it can be used to test whether that predictor significantly improves the model.


6. What is the range of values as output from the sigmoid function?
a. 0 to 1
b. -1 to 1
c. -1 to 0
d. None of these
Answer: a
Explanation: The sigmoid (logistic) function maps any real value to a range between 0 and 1, making it ideal for probability estimation.
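
A small sketch of the sigmoid function, σ(z) = 1 / (1 + e^(−z)), makes the range easy to check. The branch on the sign of z is a common way to avoid overflow for large negative inputs; the function name is only illustrative.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)          # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)


for z in (-10, -1, 0, 1, 10):
    print(z, sigmoid(z))      # every value lies between 0 and 1; sigmoid(0) is 0.5
```

Mathematically the output is strictly between 0 and 1; in floating point, very large |z| rounds to exactly 0.0 or 1.0.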


7. State True or False: The method of least squares is used to predict the population parameters with any probability distribution.
a. True
b. False
Answer: b
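Explanation: Least squares estimates coincide with maximum likelihood estimates only when the errors are normally distributed. For other distributions, such as the binary outcome in logistic regression, maximum likelihood estimation is used instead, so least squares is not applicable to every probability distribution.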


8. Suppose you have been given a fair coin and you want to find out the odds of getting heads. Which of the following options is true for such a case?
a. Odds will be 0
b. Odds will be 0.5
c. Odds will be 1
d. None of these
Answer: c
Explanation: Probability of heads = 0.5, tails = 0.5 → odds = 0.5 / 0.5 = 1.
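
The same arithmetic as a short sketch, where odds = p / (1 − p). The function name odds is chosen for this example.

```python
def odds(p):
    """Odds in favour of an event that has probability p (requires 0 <= p < 1)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


print(odds(0.5))    # 1.0  (fair coin: heads and tails equally likely)
print(odds(0.75))   # 3.0  (three times as likely to occur as not)
```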


9. Large values of the log-likelihood statistic indicate:
a. That there are a greater number of explained vs. unexplained observations.
b. That the statistical model fits the data well.
c. That as the predictor variable increases, the likelihood of the outcome occurring decreases.
d. That the statistical model is a poor fit for the data.
Answer: b
Explanation: A higher (less negative) log-likelihood means the model explains the data better, indicating good fit.
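
To see why a larger log-likelihood signals a better fit, the sketch below compares two sets of predicted probabilities for the same binary outcomes. The outcomes and both probability lists are invented for illustration, and the function name is hypothetical.

```python
import math


def bernoulli_log_likelihood(y, p):
    """Sum of log P(observed outcome) for binary outcomes y and predicted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1 - pi)
               for yi, pi in zip(y, p))


y = [1, 0, 1, 1, 0]                      # observed outcomes (made up)
good = [0.9, 0.1, 0.8, 0.7, 0.2]         # probabilities close to the outcomes
poor = [0.5, 0.5, 0.5, 0.5, 0.5]         # uninformative probabilities

ll_good = bernoulli_log_likelihood(y, good)
ll_poor = bernoulli_log_likelihood(y, poor)
print(ll_good, ll_poor)                  # both negative; ll_good is closer to zero
```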


10. The logit function (given as l(x)) is the log of odds function. What could be the range of logit function in the domain x = [0,1]?
a. (– ∞ , ∞)
b. (0,1)
c. (0 , ∞)
d. (– ∞, 0)
Answer: a
Explanation: The logit function is log(p/(1−p)). As p approaches 0 the logit tends to −∞, and as p approaches 1 it tends to +∞, so its range is (−∞, ∞).
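
A brief sketch of the logit function shows how the output grows without bound in both directions as p nears 0 or 1. The function name logit is used here for illustration.

```python
import math


def logit(p):
    """Log of odds, log(p / (1 - p)); defined only for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1 - p))


for p in (0.001, 0.2, 0.5, 0.8, 0.999):
    print(p, logit(p))   # negative below 0.5, zero at 0.5, positive above 0.5
```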