11. Unusual and Influential Data – 11.1 Outlier, Leverage, and Influence

By | May 22, 2013
library(car)
data(Davis)
# Dummy-code sex: 0 = male (M), 1 = female (F)
davis <- cbind(Davis, "sex_cd" = ifelse(Davis$sex == 'M', 0, 1))

head(davis)

reg<-lm(repwt~weight*sex_cd, data=davis)
summary(reg)
> summary(reg)

Call:
lm(formula = repwt ~ weight * sex_cd, data = davis)

Residuals:
     Min       1Q   Median       3Q      Max
-29.2230  -2.3247  -0.1325   2.0741  15.5783

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.35864    3.27719   0.415    0.679
weight         0.98982    0.04260  23.236   <2e-16 ***
sex_cd        39.96412    3.92932  10.171   <2e-16 ***
weight:sex_cd -0.72536    0.05598 -12.957   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.661 on 179 degrees of freedom
  (17 observations deleted due to missingness)
Multiple R-squared: 0.8874,     Adjusted R-squared: 0.8856
F-statistic: 470.4 on 3 and 179 DF,  p-value: < 2.2e-16
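
Because sex_cd is 0 for men and 1 for women, the interaction model above implies one fitted line per sex. From the coefficient table, the line for men has intercept 1.359 and slope 0.990, while the line for women has intercept 1.359 + 39.964 = 41.323 and slope 0.990 - 0.725 = 0.264. The following is a minimal sketch of that arithmetic; the name lines_by_sex is introduced here for illustration only.

```r
library(car)  # attaches carData, which provides the Davis data set

# Same dummy coding as in the post: 0 = male, 1 = female
davis <- cbind(Davis, sex_cd = ifelse(Davis$sex == "M", 0, 1))
reg <- lm(repwt ~ weight * sex_cd, data = davis)

b <- coef(reg)

# Men: sex_cd = 0, so only the baseline terms apply.
# Women: sex_cd = 1, so the sex_cd and interaction terms are added.
lines_by_sex <- rbind(
  M = c(intercept = b[["(Intercept)"]],
        slope     = b[["weight"]]),
  F = c(intercept = b[["(Intercept)"]] + b[["sex_cd"]],
        slope     = b[["weight"]] + b[["weight:sex_cd"]])
)
print(round(lines_by_sex, 3))
```

The much flatter slope for women is what the broken line in panel (a) of the figure shows.
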
par(mfrow=c(1,2))
plot(davis$weight, davis$repwt, type="n", xlab="Measured Weight(Kg)", ylab="Reported Weight(Kg)", main="(a)")
points(davis[davis$sex=="M",]$weight,davis[davis$sex=="M",]$repwt, pch="M", col="red")
points(davis[davis$sex=="F",]$weight,davis[davis$sex=="F",]$repwt, pch="F", col="blue")
abline(lm(davis[davis$sex=="M",]$repwt~davis[davis$sex=="M",]$weight), col="red")
abline(lm(davis[davis$sex=="F",]$repwt~davis[davis$sex=="F",]$weight), col="blue", lty=2)
# Interactive: click once on the plot to place each label
text(locator(1), "Male")
text(locator(1), "Female")

# Panel (b): reported weight on the x-axis, measured weight on the y-axis
plot(davis$repwt, davis$weight, type="n", xlab="Reported Weight(Kg)", ylab="Measured Weight(Kg)", main="(b)")
points(davis[davis$sex=="M",]$repwt,davis[davis$sex=="M",]$weight, pch="M", col="red")
points(davis[davis$sex=="F",]$repwt,davis[davis$sex=="F",]$weight, pch="F", col="blue")
abline(lm(davis[davis$sex=="M",]$weight~davis[davis$sex=="M",]$repwt), col="red")
abline(lm(davis[davis$sex=="F",]$weight~davis[davis$sex=="F",]$repwt), col="blue", lty=2)

Figure 11.2 Regression for Davis’s data on reported and measured weight for women (F) and men (M). Panel (a) shows the least-squares linear regression line for each group (the solid line for men, the broken line for women) for the regression of reported on measured weight. The outlying observation has a large impact on the fitted line for women. Panel (b) shows the fitted regression lines for the regression of measured on reported weight; here, the outlying observation makes little difference to the fit, and the least-squares lines for men and women are nearly the same.
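
The impact of the outlying observation in panel (a) can be checked numerically. The sketch below is one way to do it, not the method used in the post: it fits the women-only regression of reported on measured weight, computes the usual diagnostics (hat-values for leverage, studentized residuals for outlyingness, Cook's distance for influence), and refits without the most influential case. Picking the case with the largest Cook's distance is an assumption made here for illustration, as are the names women, infl, suspect, fit_all and fit_drop.

```r
library(car)  # attaches carData, which provides the Davis data set

# Women with a non-missing reported weight
women <- subset(Davis, sex == "F" & !is.na(repwt))

# Regression of reported on measured weight, as in panel (a)
fit_all <- lm(repwt ~ weight, data = women)

# Leverage, outlyingness and influence for every observation
infl <- data.frame(
  hat      = hatvalues(fit_all),
  rstudent = rstudent(fit_all),
  cooks    = cooks.distance(fit_all)
)

# Assumption: treat the case with the largest Cook's distance as the suspect
suspect <- rownames(infl)[which.max(infl$cooks)]
print(infl[suspect, ])

# Refit without the suspect case and compare the two fitted lines
fit_drop <- lm(repwt ~ weight, data = women[rownames(women) != suspect, ])
print(rbind(all = coef(fit_all), dropped = coef(fit_drop)))
```

Comparing the two rows of coefficients shows how far a single case moves the line for women. Repeating the exercise with weight ~ repwt gives the corresponding check for panel (b).
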

 
