NPTEL Business Intelligence & Analytics Week 5 Assignment Answers 2025
1. Which of the following options best represents key dimensions of data quality?
- Accuracy, Timeliness, Availability
- Timeliness, Availability, Consistency
- Availability, Consistency, Accuracy
- Timeliness, Accuracy, Consistency
Answer :-
2. Which data preprocessing method divides numerical data into finite intervals and assigns labels?
- Normalization
- Discretization
- Compression
- Scaling
Answer :-
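As an illustration of the preprocessing step described in question 2, the sketch below divides numerical values into a fixed set of intervals and gives each value the label of its interval. The function name `bin_values`, the age cut points, and the labels are hypothetical choices made for this example only.

```python
def bin_values(values, edges, labels):
    """Label each value with the interval [edges[i], edges[i+1]) it falls in.

    The last interval also includes its upper edge. Values outside all
    intervals get None. (Hypothetical helper, written for illustration.)
    """
    if len(labels) != len(edges) - 1:
        raise ValueError("need exactly one label per interval")
    result = []
    for v in values:
        for i, label in enumerate(labels):
            lo, hi = edges[i], edges[i + 1]
            is_last = i == len(labels) - 1
            if lo <= v < hi or (is_last and v == hi):
                result.append(label)
                break
        else:
            # No interval matched this value.
            result.append(None)
    return result


# Assumed example data: ages grouped into three labelled intervals.
ages = [5, 17, 30, 64, 80]
age_groups = bin_values(ages, edges=[0, 18, 65, 120],
                        labels=["child", "adult", "senior"])
```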
3. What does a high Variance Inflation Factor (VIF) value signify in regression analysis?
- Low multicollinearity
- High multicollinearity
- No multicollinearity
- Perfect independence
Answer :-
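To make the quantity in question 3 concrete: the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The sketch below covers only the two-predictor case, where R_j^2 equals the squared Pearson correlation between the two predictors. The function names and the sample data are assumptions made for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors.

    With a single other predictor, R_j^2 is the squared correlation.
    """
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)


# Assumed sample data: one nearly collinear pair, one uncorrelated pair.
x1 = [1, 2, 3, 4, 5]
x2_collinear = [1.1, 2.0, 2.9, 4.2, 5.0]
x2_unrelated = [2, -1, -2, -1, 2]

vif_collinear = vif_two_predictors(x1, x2_collinear)
vif_unrelated = vif_two_predictors(x1, x2_unrelated)
```

Here the nearly collinear pair gives a VIF far above 1, while the uncorrelated pair gives a VIF of 1.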
4. Which term describes a systematic error that consistently skews measurements away from the true or actual value?
- Bias
- Noise
- Precision
- Accuracy
Answer :-
5. Which technique ensures that every data record is systematically included in the test set exactly once across multiple iterations, while the remaining data is used for training?
- Holdout Method
- Random Subsampling
- Cross-validation
- Bootstrap
Answer :-
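The scheme described in question 5 can be sketched as a k-fold split: the records are partitioned into k folds, and each fold serves as the test set once while the rest is used for training. This is a minimal version without shuffling, and the generator name `k_fold_splits` is a hypothetical choice for this example.

```python
def k_fold_splits(n_records, k):
    """Yield (train_indices, test_indices) for each of k folds.

    Folds are contiguous and differ in size by at most one record.
    (Hypothetical helper; real workflows usually shuffle first.)
    """
    indices = list(range(n_records))
    fold_sizes = [n_records // k + (1 if i < n_records % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Assumed example: 10 records split into 3 folds.
splits = list(k_fold_splits(10, 3))
all_test_indices = sorted(i for _, test in splits for i in test)
```

Collecting the test indices across all folds gives back every record exactly once, which is the property the question describes.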
6. Identify the pair of techniques that effectively reduce the dimensionality or size of a dataset without losing significant information:
- Data cleaning and Data sampling
- Data compression and Data sampling
- Data compression and Data cleaning
- Data integration and Data cleaning
Answer :-
7. A team in India’s Prime Volleyball League wants to predict the revenue generated from ticket sales for an upcoming match based on factors like venue capacity and past ticket sales. Which method should they use?
- Regression analysis
- Classification
- Clustering
- Reinforcement Learning
Answer :-
8. Multicollinearity in regression analysis is generally associated with all of the following except:
- Increased standard errors of regression coefficients.
- Challenges in isolating the effects of individual predictors.
- Enhanced statistical power (reduced p-values) for coefficient tests.
- Instability in coefficient estimates under data perturbations.
Answer :-
9. In a regression analysis, which metric quantifies the proportion of the total variance in the dependent variable explained by the model?
- Adjusted R-squared
- Mean squared error
- Root mean squared error
- Coefficient of determination
Answer :-
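For reference, the proportion of variance explained that question 9 refers to is commonly computed as 1 - SS_res / SS_tot. The sketch below assumes plain Python lists of observed and predicted values; the function name and the sample numbers are illustrative only.

```python
def r_squared(actual, predicted):
    """Proportion of variance in `actual` explained by `predicted`.

    Computed as 1 - SS_res / SS_tot.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Assumed sample data.
observed = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(observed, [1.0, 2.0, 3.0, 4.0])
# Predicting the mean for every record explains none of the variance.
mean_only_fit = r_squared(observed, [2.5, 2.5, 2.5, 2.5])
```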
10. In a linear regression model, the method of Ordinary Least Squares aims to:
- Minimize the squared differences between observed and predicted values
- Maximize the squared differences between observed and predicted values
- Minimize the absolute differences between observed and predicted values
- Maximize the absolute differences between observed and predicted values
Answer :-
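As a sketch of how Ordinary Least Squares works for a single predictor, the closed-form slope and intercept below are the values that make the sum of squared differences between observed and predicted values as small as possible. The function names and the data are assumptions made for this example.

```python
def ols_fit(x, y):
    """Closed-form OLS slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(x, y, slope, intercept):
    """Sum of squared differences between observed and predicted y."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))


# Assumed sample data lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
slope, intercept = ols_fit(xs, ys)

# Any other line gives a larger sum of squared errors on this data.
sse_fitted = sum_squared_errors(xs, ys, slope, intercept)
sse_other = sum_squared_errors(xs, ys, slope + 0.5, intercept)
```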
11. Which combination best describes the sources of prediction error in a machine learning model, such as ŷ = f(x)?
- Test data variation, Model overfitting, and Regularization error
- Reducible error, Irreducible error, and Test data variation
- Data normalization, Model generalization, and Reducible error
- Regularization, Overfitting, and Feature selection
Answer :-
12. What does high variance in a machine learning model indicate?
- The model is highly sensitive to small variations in the input features.
- The model is robust to small changes in the input data.
- The model perfectly captures the relationship between inputs and outputs.
- The model has difficulty capturing any correlation between inputs and outputs.
Answer :-
13. Identify the incorrect statement regarding collinearity in a regression context:
- High collinearity can lead to unstable coefficient estimates.
- A high degree of correlation among predictors indicates collinearity.
- Collinearity increases the significance of predictors as shown by higher t-statistics.
- Correlation analysis of predictors can help identify potential collinearity.
Answer :-
14. A weather prediction model consistently underestimates the amount of rainfall because it oversimplifies the relationship between atmospheric features and precipitation. This is an example of a model with:
- High variance
- High bias
- Overfitting tendencies
- Insufficient regularization
Answer :-
15. The metric that calculates the average squared difference between the predicted and actual values is known as _________.
- Least Squared Error (LSE)
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
- Root Mean Squared Error (RMSE)
Answer :-
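The error metrics listed in question 15 differ only in how the per-record differences are aggregated. The sketch below computes three of them on the same assumed sample data so they can be compared side by side; the function names are illustrative.

```python
import math


def mean_squared_error(actual, predicted):
    """Average of the squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(actual, predicted))


# Assumed sample data: the differences are 1, 0 and -2.
actual = [3, 5, 2]
predicted = [2, 5, 4]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```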