SDM4 in R: Inferences about Means (Chapter 20)

Chapter 20: Inferences about Means

Section 20.1: The Central Limit Theorem

Let’s begin by reproducing the figure on the bottom of page 519.

mu <- 1309
sd <- 15.7
xpnorm(c(mu-3*sd, mu-2*sd, mu-sd, mu+sd, mu+2*sd, mu+3*sd), mean=mu, sd=sd)

## 
## If X ~ N(1309, 15.7), then 
## 
##  P(X <= 1262) = P(Z <= -3) = 0.00135
##      P(X <= 1278) = P(Z <= -2) = 0.02275
##      P(X <= 1293) = P(Z <= -1) = 0.15866
##      P(X <= 1325) = P(Z <=  1) = 0.84134
##      P(X <= 1340) = P(Z <=  2) = 0.97725
##      P(X <= 1356) = P(Z <=  3) = 0.99865
##  P(X >  1262) = P(Z >  -3) = 0.99865
##      P(X >  1278) = P(Z >  -2) = 0.97725
##      P(X >  1293) = P(Z >  -1) = 0.84134
##      P(X >  1325) = P(Z >   1) = 0.15866
##      P(X >  1340) = P(Z >   2) = 0.02275
##      P(X >  1356) = P(Z >   3) = 0.00135

## [1] 0.00135 0.02275 0.15866 0.84134 0.97725 0.99865

Section 20.2: Gosset’s t

Figure 20.1 (page 521) displays a normal curve (dashed green curve) and a t-model with 2 degrees of freedom (solid blue curve).

plotDist("norm", lty=2, col="green", lwd=2)
plotDist("t", params=2, lty=1, lwd=2, col="blue", add=TRUE)

We can reproduce the calculations for the Farmed salmon example (pages 523-524) using summary statistics:

n <- 150; ybar <- 0.0913; s = 0.0495
tstar <- qt(0.975, df=n-1); tstar

## [1] 1.98

ybar + c(-tstar, tstar)*s/sqrt(n)

## [1] 0.0833 0.0993

or directly:

Salmon <- read.csv("http://nhorton.people.amherst.edu/sdm4/data/Farmed_Salmon.csv")
favstats(~ Mirex, data=Salmon)

##  min    Q1 median    Q3   max   mean     sd   n missing
##    0 0.056  0.079 0.135 0.194 0.0913 0.0495 150       3

histogram(~ Mirex, width=0.01, center=0.01/2, data=Salmon)

t.test(~ Mirex, data=Salmon)

## 
##  One Sample t-test
## 
## data:  Salmon$Mirex
## t = 20, df = 100, p-value <2e-16
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  0.0833 0.0993
## sample estimates:
## mean of x 
##    0.0913

We note that the distribution of measurements is not particularly normal.

Section 20.4: A hypothesis test for the mean

We can carry out the one-sided test outlined on page 530:

tval <- (.0913-0.08)/0.0040; tval

## [1] 2.83

1-xpt(tval, df=149)

## [1] 0.00269

SDM4 in R: Inferences about Means (Chapter 20)

Nicholas Horton (nhorton@amherst.edu)

January 2, 2017

Introduction and background

Chapter 20: Inferences about Means

Section 20.1: The Central Limit Theorem

Section 20.2: Gosset’s t

Section 20.4: A hypothesis test for the mean