\documentclass[11pt]{article} \usepackage[margin=1in,bottom=.5in,includehead,includefoot]{geometry} \usepackage{hyperref} \usepackage{language} \usepackage{alltt} \usepackage{fancyhdr} \pagestyle{fancy} \fancyhf{} %% Now begin customising things. See the fancyhdr docs for more info. \chead{} \lhead[\sf \thepage]{\sf \leftmark} \rhead[\sf \leftmark]{\sf \thepage} \lfoot{} \cfoot{Introduction to the Practice of Statistics using R: Chapter 5} \rfoot{} \newcounter{myenumi} \newcommand{\saveenumi}{\setcounter{myenumi}{\value{enumi}}} \newcommand{\reuseenumi}{\setcounter{enumi}{\value{myenumi}}} \pagestyle{fancy} \def\R{{\sf R}} \def\Rstudio{{\sf RStudio}} \def\RStudio{{\sf RStudio}} \def\term#1{\textbf{#1}} \def\tab#1{{\sf #1}} \usepackage{relsize} \newlength{\tempfmlength} \newsavebox{\fmbox} \newenvironment{fmpage}[1] { \medskip \setlength{\tempfmlength}{#1} \begin{lrbox}{\fmbox} \begin{minipage}{#1} \vspace*{.02\tempfmlength} \hfill \begin{minipage}{.95 \tempfmlength}} {\end{minipage}\hfill \vspace*{.015\tempfmlength} \end{minipage}\end{lrbox}\fbox{\usebox{\fmbox}} \medskip } \newenvironment{boxedText}[1][.98\textwidth]% {% \begin{center} \begin{fmpage}{#1} }% {% \end{fmpage} \end{center} } \newenvironment{boxedTable}[2][tbp]% {% \begin{table}[#1] \refstepcounter{table} \begin{center} \begin{fmpage}{.98\textwidth} \begin{center} \sf \large Box~\expandafter\thetable. #2 \end{center} \medskip }% {% \end{fmpage} \end{center} \end{table} % need to do something about exercises that follow boxedTable } \newcommand{\cran}{\href{http://www.R-project.org/}{CRAN}} \title{Introduction to the Practice of Statistics using R: \\ Chapter 5} \author{ Nicholas J. Horton\thanks{Department of Mathematics, Amherst College, nhorton@amherst.edu} \and Ben Baumer } \date{\today} \begin{document} \maketitle \tableofcontents %\parindent=0pt <>= opts_chunk$set( dev="pdf", tidy=FALSE, fig.path="figures/", fig.height=4, fig.width=5, out.width=".57\\textwidth", fig.keep="high", fig.show="hold", fig.align="center", prompt=TRUE, # show the prompts; but perhaps we should not do this comment=NA ) options(continue=" ") @ <>= print.pval = function(pval) { threshold = 0.0001 return(ifelse(pval < threshold, paste("p<", sprintf("%.4f", threshold), sep=""), ifelse(pval > 0.1, paste("p=",round(pval, 2), sep=""), paste("p=", round(pval, 3), sep="")))) } @ <>= require(mosaic) trellis.par.set(theme=col.mosaic()) # get a better color scheme for lattice set.seed(123) # this allows for code formatting inline. Use \Sexpr{'function(x,y)'}, for exmaple. knit_hooks$set(inline = function(x) { if (is.numeric(x)) return(knitr:::format_sci(x, 'latex')) x = as.character(x) h = knitr:::hilight_source(x, 'latex', list(prompt=FALSE, size='normalsize')) h = gsub("([_#$%&])", "\\\\\\1", h) h = gsub('(["\'])', '\\1{}', h) gsub('^\\\\begin\\{alltt\\}\\s*|\\\\end\\{alltt\\}\\s*$', '', h) }) showOriginal=FALSE showNew=TRUE @ \section*{Introduction} This document is intended to help describe how to undertake analyses introduced as examples in the Sixth Edition of \emph{Introduction to the Practice of Statistics} (2009) by David Moore, George McCabe and Bruce Craig. More information about the book can be found at \url{http://bcs.whfreeman.com/ips6e/}. This file as well as the associated \pkg{knitr} reproducible analysis source file can be found at \url{http://www.math.smith.edu/~nhorton/ips6e}. This work leverages initiatives undertaken by Project MOSAIC (\url{http://www.mosaic-web.org}), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the \pkg{mosaic} package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignette (\url{http://cran.r-project.org/web/packages/mosaic/vignettes/MinimalR.pdf}). To use a package within R, it must be installed (one time), and loaded (each session). The package can be installed using the following command: <>= install.packages('mosaic') # note the quotation marks @ The {\tt \#} character is a comment in R, and all text after that on the current line is ignored. Once the package is installed (one time only), it can be loaded by running the command: <>= require(mosaic) @ This needs to be done once per session. We also set some options to improve legibility of graphs and output. <>= trellis.par.set(theme=col.mosaic()) # get a better color scheme for lattice options(digits=3) @ The specific goal of this document is to demonstrate how to replicate the analysis described in Chapter 5: Sampling Distributions. \section{Sampling distributions for counts and proportions} Calculations with the binomial distribution can be undertaken using the {\tt pbinom()} and {\tt dbinom()} functions. For example, the results from Figure 5.1 (page 317) can be reproduced. <<>>= dbinom(10, size=150, prob=0.08) pbinom(10, size=150, prob=0.08) @ The table Figure 5.2 (page 318) can be reproduced using the following command: <<>>= cbind(0:9, dbinom(0:9, size=15, p=0.08)) @ And the calculation on page 318 using the command: <<>>= sum(dbinom(0:1, size=15, prob=0.08)) @ or using {\tt pbinom()}: <<>>= pbinom(1, size=15, prob=0.08) @ Example 5.9 (pages 318-319) can be calculated: <<>>= 1 - pbinom(4, size=12, prob=0.25) @ Using R, there is little need for the normal approximation to the binomial. As an example, the probability of interest in Example 5.11 (page 321) can be calculated using the command: <<>>= 1 - pbinom(1449, size=2500, prob=0.6) @ Example 5.13 (page 324) uses the normal approximation nonetheless: <<>>= xpnorm(-2.04, mean=0, sd=1) @ A similar calculation is done in Example 5.14 (page 325): <<>>= xpnorm(-0.60, mean=0, sd=1) @ We can compare this to the exact calculation: <<>>= pbinom(10, size=150, prob=0.08) @ The approximation isn't great (which is a good reason not to use it). The binomial probability formula (Example 5.16, page 329) can be used to calculate probabilities, or {\tt dbinom()} and {\tt pbinom()} can do the trick: <<>>= dbinom(0, size=15, prob=0.08) dbinom(1, size=15, prob=0.08) dbinom(0, size=15, prob=0.08) + dbinom(1, size=15, prob=0.08) pbinom(1, size=15, prob=0.08) @ \section{Sampling distributions for a sample mean} Similar approaches are used for sampling distributions for a sample mean, using {\tt xpnorm()}. For instance, Example 5.24 (page 343) can be found using: <<>>= xpnorm(0, mean=10, sd=12.8) @ or <<>>= xpnorm(-0.78, mean=0, sd=1) @ \end{document}