NPTEL Data Analytics with Python Week 6 Assignment Answers 2024
1. In regression analysis, which of the following is not a required assumption about the error term?
a. The expected value of the error term is one
b. The variance of the error term is the same for all values of X
c. The values of the error term are independent
d. The error term is normally distributed
Answer: a
Explanation: In regression, the expected value of the error term is assumed to be zero, not one. The other options (constant variance, independence, and normality of the errors) are standard assumptions of classical linear regression.
2. A regression analysis between sales (Y in $1000) and advertising (X in dollars) resulted in the following equation. Which statement does the equation imply?
Y = 30,000 + 5X
a. Increase of $5 in advertising is associated with an increase of $5,000 in sales
b. Increase of $1 in advertising is associated with an increase of $5 in sales
c. Increase of $1 in advertising is associated with an increase of $35,000 in sales
d. Increase of $1 in advertising is associated with an increase of $5,000 in sales
Answer: d
Explanation: The slope 5 means that each $1 increase in X (advertising) is associated with a 5-unit increase in Y. Because Y is measured in units of $1000, 5 units of Y equal $5,000 in sales. Option b ignores the units of Y, and option c wrongly adds the intercept to the slope.
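The unit conversion can be checked with a short sketch. The function name and the base advertising level of $100 are illustrative assumptions; only the equation Y = 30,000 + 5X and the units come from the question.

```python
def predicted_sales_dollars(advertising_dollars):
    """Predicted sales in dollars for Y = 30,000 + 5X, where Y is in $1000 units."""
    y_thousands = 30_000 + 5 * advertising_dollars  # Y in thousands of dollars
    return y_thousands * 1_000                      # convert to dollars

# Effect of one extra dollar of advertising at an arbitrary (hypothetical) spend level
base = 100
change = predicted_sales_dollars(base + 1) - predicted_sales_dollars(base)
print(change)  # 5000
```

Because the model is linear, the change is the same at any base level of advertising.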
3. In a regression and correlation analysis if R² = 1, then
a. SSE = SST
b. SSE = 1
c. SSR = SSE
d. SSR = SST
Answer: d
Explanation: R² = SSR / SST. If R² = 1, it means all variability in Y is explained by X, i.e., SSR = SST and SSE = 0.
4. SSE (Sum of Squares for Error) can never be
a. Larger than SST
b. Smaller than SST
c. Equal to 1
d. Equal to zero
Answer: a
Explanation: For a least-squares fit with an intercept, SST = SSR + SSE, and both SSR and SSE are non-negative, so SSE cannot be larger than SST. It can be zero (perfect fit), equal to 1, or smaller than SST.
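The identity SST = SSR + SSE used in questions 3 and 4 can be verified numerically. The following is a minimal sketch in plain Python; the helper `sums_of_squares` and both data sets are hypothetical and chosen only for illustration.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for a least-squares line with an intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)              # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)          # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
    return sst, ssr, sse

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy_ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical data with scatter
exact_ys = [3.0 + 2.0 * x for x in xs]  # points lying exactly on a line

for label, ys in (("noisy", noisy_ys), ("exact", exact_ys)):
    sst, ssr, sse = sums_of_squares(xs, ys)
    print(f"{label}: SST={sst:.4f} SSR={ssr:.4f} SSE={sse:.4f} R2={ssr / sst:.4f}")
```

For the noisy data SSE is positive but smaller than SST. For the exact line SSE is zero, so SSR equals SST and R² is 1.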
5. In question no. 6, when testing the hypothesis of slope, we will:
a. Accept the null hypothesis
b. Reject the null hypothesis
c. Can’t state any conclusion
d. None of the above
Answer: b
Explanation: The 95% confidence interval for the slope found in question 6, (0.045, 0.138), does not contain 0, so the null hypothesis (slope = 0) is rejected at the 5% significance level.
6. In question 6, determine a 95% confidence interval for B1 to test the hypotheses
a. (0.045, 0.138)
b. (0.055, 0.148)
c. (0.065, 0.158)
d. (0.075, 0.138)
Answer: a
Explanation: A 95% confidence interval for the slope has the form b1 ± t(0.025, n - 2) × SE(b1), computed from the question's data. The interval (0.045, 0.138) does not include 0, so the slope is significantly different from zero at the 5% level. Note that all four options exclude 0, so choosing among them requires the actual calculation.
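The confidence-interval calculation can be sketched as follows. The data set below is hypothetical (it is not the assignment's data, which is not given here), and the critical value 2.306 is the standard t-table value for 95% confidence with 8 degrees of freedom, matching the assumed n = 10.

```python
import math

def slope_confidence_interval(xs, ys, t_crit):
    """Confidence interval for the slope: b1 +/- t_crit * SE(b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))   # residual standard error
    se_b1 = s / math.sqrt(sxx)     # standard error of the slope
    return b1 - t_crit * se_b1, b1 + t_crit * se_b1

# Hypothetical data (n = 10), not the assignment's data set
xs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ys = [1.2, 2.1, 2.8, 4.3, 4.9, 6.2, 6.8, 8.1, 9.0, 9.7]

T_CRIT = 2.306  # t-table value for 95% confidence, n - 2 = 8 degrees of freedom
lower, upper = slope_confidence_interval(xs, ys, T_CRIT)
slope_is_significant = not (lower <= 0 <= upper)
print(f"95% CI for slope: ({lower:.3f}, {upper:.3f}); significant: {slope_is_significant}")
```

If 0 lies outside the interval, the test of slope = 0 at the 5% level rejects the null hypothesis, which is the link between questions 5 and 6.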
7. State TRUE or FALSE –
Statement: The variance of error is same for all values of the independent variable
a. True
b. False
Answer: a
Explanation: This is the homoscedasticity assumption of regression analysis: the error variance is assumed to be constant across all values of X.
8. Which of the following is possible for the coefficient of determination (R²)?
a. It can be larger than 1
b. It is less than one
c. It can be less than -1
d. None of these alternatives is correct
Answer: b
Explanation: For a least-squares regression with an intercept, R² = SSR / SST lies between 0 and 1 (0% to 100% of the variability explained). It cannot be greater than 1 or negative, so of the options given only (b) is possible.