ST3189 - Linear Regression Quiz

1. In the context of linear regression, what does the term 'least squares' refer to?

2. What is the primary assumption about the relationship between predictors and the response in a simple linear regression model?

3. In the linear model \(Y = \beta_0 + \beta_1 X + \epsilon\), what does \(\beta_1\) represent?

4. The Residual Sum of Squares (RSS) is defined as:

5. What does an R-squared (\(R^2\)) value of 0.75 signify?
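A quick reference sketch for the two questions above (standard notation, with \(\hat{y}_i\) the fitted value for observation \(i\) and \(\bar{y}\) the sample mean of the response):
\[
\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad \mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}},
\]
so \(R^2\) is the proportion of the variability in the response explained by the model.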

6. In multiple linear regression, what is the null hypothesis for the F-statistic?

7. What problem does the term 'multicollinearity' describe?

8. Which of the following is a potential solution for dealing with non-linearity in a linear regression model?

9. What is an interaction effect in a regression model?

10. What is the purpose of a prediction interval?

11. In the context of linear regression, what is a 'loss function'?

12. What is the primary goal of the 'least squares' approach in linear modelling?

13. In vector/matrix notation for linear regression, \(y = Xw + \epsilon\), what does the matrix \(X\) represent?

14. What is the primary difference between a parametric and a non-parametric statistical learning approach?

15. What is the main disadvantage of a very flexible (complex) statistical learning model?

16. The mean squared error (MSE) is composed of which two quantities?

17. What is the 'bias-variance trade-off'?
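As a revision note for the two questions above (standard decomposition, with \(\hat{f}\) the fitted model and \(x_0\) a test point), the expected test MSE splits as
\[
E\big[(y_0 - \hat{f}(x_0))^2\big] = \mathrm{Var}\big(\hat{f}(x_0)\big) + \big[\mathrm{Bias}\big(\hat{f}(x_0)\big)\big]^2 + \mathrm{Var}(\epsilon),
\]
where the last term is the irreducible error; increasing model flexibility typically lowers bias but raises variance.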

18. In the context of linear regression, what is the purpose of the 't-statistic'?

19. What is a 'dummy variable' used for in linear regression?

20. What does the 'Residual Standard Error' (RSE) measure?
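For reference (with \(n\) observations and \(p\) predictors), \(\mathrm{RSE} = \sqrt{\mathrm{RSS}/(n - p - 1)}\): an estimate of the standard deviation of the error term \(\epsilon\), expressed in the units of the response.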

21. What is the primary purpose of using subset selection methods like best subset selection or forward stepwise selection?

22. Which of the following is a drawback of best subset selection?

23. How does forward stepwise selection work?

24. What is the main idea behind shrinkage methods like Ridge and Lasso regression?

25. What is the key difference between the penalty used in Ridge regression and Lasso regression?

26. A key advantage of Lasso regression over Ridge regression is that:
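For reference on the shrinkage questions above (assuming \(p\) predictors and a tuning parameter \(\lambda \ge 0\)), the two estimators minimise
\[
\mathrm{RSS}(\beta) + \lambda \sum_{j=1}^{p} \beta_j^2 \quad (\text{ridge}), \qquad \mathrm{RSS}(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \quad (\text{lasso});
\]
the \(\ell_1\) penalty can set coefficients exactly to zero, which is why the lasso also performs variable selection.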

27. In Bayesian linear regression, what is the role of the prior distribution?

28. The posterior distribution in Bayesian inference is proportional to:

29. Ridge regression can be viewed as a Bayesian linear regression model with what kind of prior on the coefficients?

30. Lasso regression is equivalent to finding the MAP estimate under which prior distribution for the coefficients?
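A brief sketch of the Bayesian connection behind the two questions above (assuming Gaussian noise with variance \(\sigma^2\)): the negative log-posterior is
\[
-\log p(\beta \mid y) = \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert^2 - \log p(\beta) + \text{const},
\]
so an i.i.d. Gaussian prior on the \(\beta_j\) contributes a \(\sum_j \beta_j^2\) penalty (ridge as the MAP estimate), while an i.i.d. Laplace (double-exponential) prior contributes a \(\sum_j |\beta_j|\) penalty (lasso as the MAP estimate).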

31. What is a 'confidence interval' in frequentist inference?

32. What is a 'credible interval' in Bayesian inference?

33. What is the primary purpose of using basis functions (e.g., polynomials, splines) in a linear model?

34. What is a 'spline' in the context of regression?

35. What is a key advantage of using a natural cubic spline over a regular cubic spline?

36. What is the main idea behind Generalized Additive Models (GAMs)?

37. What is a potential problem with fitting a linear regression model when the error terms \(\epsilon_i\) are correlated?

38. What is an 'outlier' in a regression context?

39. What is a 'high leverage' point?

40. The Gauss-Markov theorem states that under certain assumptions, the least squares estimator is:

41. What does 'homoscedasticity' mean in the context of linear regression?

42. What is the primary purpose of examining residual plots in regression diagnostics?

43. If you fit a multiple linear regression model and the p-value for the F-statistic is very small, but the p-values for individual coefficients are large, what is a likely cause?

44. What is the 'curse of dimensionality'?

45. In the formula for the least squares estimators \(\hat{\beta} = (X^T X)^{-1}X^T y\), what must be true about the matrix \(X^T X\)?
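As a reminder of where this formula comes from: minimising \(\mathrm{RSS}(\beta) = \lVert y - X\beta \rVert^2\) gives the normal equations
\[
X^{T} X \hat{\beta} = X^{T} y,
\]
which have the unique solution \(\hat{\beta} = (X^T X)^{-1} X^T y\) only when \(X^T X\) is invertible, i.e. when \(X\) has full column rank (no predictor is an exact linear combination of the others, and \(n \ge p + 1\)).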

46. Which of these is NOT a standard assumption of the classical linear regression model?

47. What is the primary difference between a confidence interval and a prediction interval for a given value of X?

48. What does the 'Adjusted R-squared' do that the regular R-squared does not?
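For reference: \(\text{Adjusted } R^2 = 1 - \dfrac{\mathrm{RSS}/(n - p - 1)}{\mathrm{TSS}/(n - 1)}\), which, unlike \(R^2\), does not automatically increase when an additional predictor is added.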

49. Which method is considered a 'greedy' approach to subset selection?

50. The elastic net penalty is a combination of which two other penalties?
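For reference (one common parameterisation, with tuning parameters \(\lambda_1, \lambda_2 \ge 0\)), the elastic net minimises
\[
\mathrm{RSS}(\beta) + \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2,
\]
i.e. it combines the lasso (\(\ell_1\)) and ridge (\(\ell_2\)) penalties.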