STAT 361 (Fall 2021)
Assignment 3
The assignment is due on Nov. 04 (Thursday) at 23:00 (Kingston, Ontario time). Please submit to
Crowdmark.
Guidelines for Preparing Solutions
For questions that need R coding, please include only the important R output and the necessary results in
the main text of your solutions. Present them in a clear and concise fashion (for example, tabulate models
and output).
Describe and discuss your important explorations and findings.
Put long code and output in an Appendix, at the end of EACH problem.
These Appendix sections will NOT be marked, but will be checked as evidence of your independent work.
Prepare your assignment solutions so that they are easy for the readers (in this case, TAs) to follow, without
having to search everywhere for your answers in lengthy code and output.
- Consider the multiple regression model $Y = X\beta + \varepsilon$, where $\varepsilon \sim \mathrm{MVN}_n(0, \sigma^2 I)$. See the descriptions of model
forms (1) and (2) in Chapter 4.
(a) Show that the residual vector $r = (I - P)Y$, where $P = X(X^T X)^{-1}X^T$, and show that $I - P$ is also a
projection matrix.
(b) Let $U = (\hat{\beta}, r)^T$. Find the joint distribution of the random vector $U$. It may be helpful to notice that.
(c) Show that $\hat{\beta}$ and $r$ are independent.
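The identities in this problem can be sanity-checked numerically before proving them. The sketch below is only an illustration under assumptions: the design matrix `X` and response `Y` are arbitrary simulated values (not from the course), and the object names are made up for this example. A numerical check is not a proof, but it can catch algebra slips.

```r
# Simulated inputs (assumption): any full-rank design matrix would do.
set.seed(1)
n <- 8
k <- 3
X <- cbind(1, matrix(rnorm(n * (k - 1)), n, k - 1))
Y <- rnorm(n)

P <- X %*% solve(crossprod(X)) %*% t(X)  # P = X (X'X)^{-1} X'
M <- diag(n) - P                         # I - P

# Residuals from lm() versus (I - P) Y.
r_lm <- unname(residuals(lm(Y ~ X - 1)))
r_proj <- as.vector(M %*% Y)
same_resid <- isTRUE(all.equal(r_lm, r_proj))

# Projection matrix properties: idempotent and symmetric.
idempotent <- isTRUE(all.equal(M %*% M, M))
symmetric <- isTRUE(all.equal(M, t(M)))

# (X'X)^{-1} X' (I - P) is the zero matrix; this is the algebraic fact
# that makes the covariance between beta-hat and r vanish.
A <- solve(crossprod(X)) %*% t(X)
cross_zero <- max(abs(A %*% M)) < 1e-10

c(same_resid, idempotent, symmetric, cross_zero)
```

The last line prints four logical flags, one per check.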
Hint: For (b) and (c), properties of the multivariate normal distribution may be useful.
- Consider the “Savings.txt” data posted. It is an economic dataset collected in 48 different countries. The
variable “sr” is the ratio of savings (aggregate personal saving divided by disposable income). The variables
“pop15” and “pop75” are the percentages of the population under 15 and over 75, respectively. The variable “dpi”
is disposable income (per capita, in dollars), while the variable “ddpi” is the rate (percent) of change in
disposable income (per capita).
(a) Draw a scatter plot matrix for all the variables involved. Comment on the possible relationships between
variables, focusing on those that appear interesting to you.
(b) Fit a simple linear regression model with disposable income (“dpi”) as the response and the percentage of the population
under 15 as the only covariate. Describe the model clearly. Report and interpret the fitted model:
is there a significant association between the variables, and is this what you expect?
(c) Fit a regression model with the ratio of savings ($Y$, “sr”) as the response and all other variables as
covariates. Describe the model clearly, then report and discuss the fit of the model. Interpret the estimated
coefficient for the rate of change in disposable income.
(d) Is it reasonable to drop the covariate disposable income (“dpi”) from the model in (c)? Support your
answer with a test, and describe the test procedure and results clearly. Also calculate a confidence interval for
the regression coefficient of this covariate.
Added Note: Test at level 0.05, and construct a 95% confidence interval.
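The mechanics of parts (c) and (d) might look like the following sketch. It runs on a synthetic stand-in data frame of random numbers that only reuses the column names, so its output says nothing about the real data. The `read.table()` call in the comment assumes a whitespace-delimited file with a header row, which the assignment does not confirm. All object names are made up for this example.

```r
# Synthetic stand-in (assumption): random values, NOT the real Savings.txt.
# With the real file, something like the following may work instead:
#   savings <- read.table("Savings.txt", header = TRUE)
set.seed(361)
n <- 48
savings <- data.frame(
  pop15 = runif(n, 20, 50),
  pop75 = runif(n, 0.5, 5),
  dpi   = runif(n, 100, 4000),
  ddpi  = runif(n, 0, 10)
)
savings$sr <- 10 + rnorm(n, sd = 3)

# Full model from (c).
full <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = savings)

# t-test of H0: coefficient of dpi equals 0, read from the coefficient table.
dpi_row <- summary(full)$coefficients["dpi", ]
p_value <- unname(dpi_row["Pr(>|t|)"])
reject_at_5pct <- p_value < 0.05

# Equivalent partial F-test: model without dpi versus the full model.
reduced <- update(full, . ~ . - dpi)
f_test <- anova(reduced, full)

# 95% confidence interval for the dpi coefficient.
ci_dpi <- confint(full, "dpi", level = 0.95)
```

For a single coefficient the squared t statistic equals the partial F statistic, so the two tests give the same p-value. The 95% confidence interval contains 0 exactly when the test fails to reject at level 0.05.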
(e) Based on the model in (c), obtain a 95% prediction interval for the ratio of savings of a country with
$x = (20, 3.2, 2200, 2.1)^T$
for “pop15”, “pop75”, “dpi”, “ddpi” respectively.
- Four objects are weighed two at a time on a spring balance. Denote the four unknown weights by $\beta_1, \dots, \beta_4$.
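Six observations are made and are expressed in these forms:
$Y_1 = \beta_1 + \beta_2 + \varepsilon_1,$
$Y_2 = \beta_1 + \beta_3 + \varepsilon_2,$
$Y_3 = \beta_1 + \beta_4 + \varepsilon_3,$
$Y_4 = \beta_2 + \beta_3 + \varepsilon_4,$
$Y_5 = \beta_2 + \beta_4 + \varepsilon_5,$
$Y_6 = \beta_3 + \beta_4 + \varepsilon_6.$
Assume that $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$, $i = 1, \dots, 6$.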
(a) Find expressions for the least squares estimators $\hat{\beta}_1, \dots, \hat{\beta}_4$ (specify the expressions in terms of $Y_1, \dots, Y_6$).
(b) Find an expression for $\mathrm{Cov}(\hat{\beta})$ (specify the matrix entries, which may involve $\sigma^2$).
(c) Find expressions for the residuals (specify the expressions in terms of $Y_1, \dots, Y_6$).
(d) Create a small data set for this study, with $(Y_1, \dots, Y_6) = (5, 8, 6, 7, 10, 9)$. Use the lm() function in R to fit
the data. Check the results for (a), (b) and (c): does the output from the lm() fit agree with the corresponding
results calculated for this data set from the expressions you derived above?
(e) Explain how you will construct a 95% confidence interval for $\beta_1 + \beta_2$. We can still use the $t_{n-k}$ distribution.
Find the confidence interval for the given data.
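One possible way to set up parts (d) and (e) in R is sketched below. The design matrix is read off the six observation equations, and the model is fitted without an intercept because the equations contain no constant term. The object names (`X`, `fit`, `a`, and so on) are illustrative choices, and the interval uses the usual linear-combination formula with $n - k = 6 - 4 = 2$ degrees of freedom. This is a hedged sketch, not a prescribed solution.

```r
# Observations given in part (d).
y <- c(5, 8, 6, 7, 10, 9)

# Row i marks which two objects are on the balance for Y_i.
X <- rbind(
  c(1, 1, 0, 0),
  c(1, 0, 1, 0),
  c(1, 0, 0, 1),
  c(0, 1, 1, 0),
  c(0, 1, 0, 1),
  c(0, 0, 1, 1)
)
colnames(X) <- paste0("b", 1:4)

# No-intercept fit with lm().
fit <- lm(y ~ X - 1)

# The same quantities from the matrix formulas.
XtX_inv <- solve(crossprod(X))                # (X'X)^{-1}
beta_hat <- XtX_inv %*% crossprod(X, y)       # (X'X)^{-1} X'y
res <- y - X %*% beta_hat                     # residuals
s <- summary(fit)$sigma                       # residual standard error

agree_coef <- isTRUE(all.equal(unname(coef(fit)), as.vector(beta_hat)))
agree_res <- isTRUE(all.equal(unname(residuals(fit)), as.vector(res)))
agree_cov <- isTRUE(all.equal(unname(vcov(fit)), unname(s^2 * XtX_inv)))

# 95% confidence interval for beta_1 + beta_2 = a' beta, a = (1, 1, 0, 0).
a <- c(1, 1, 0, 0)
est <- sum(a * beta_hat)
se <- s * sqrt(drop(t(a) %*% XtX_inv %*% a))
ci <- est + c(-1, 1) * qt(0.975, df = fit$df.residual) * se
```

With only two residual degrees of freedom the $t$ quantile is large, so the interval will be wide relative to the standard error.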