--- title: "SDM4 in R: Multifactor Analysis of Variance (Chapter 27)" author: "Nicholas Horton (nhorton@amherst.edu), Patrick Frenett, and Sarah McDonald" date: "June 13, 2018" output: pdf_document: fig_height: 2.8 fig_width: 7 html_document: fig_height: 3 fig_width: 5 word_document: fig_height: 4 fig_width: 6 --- ```{r, include = FALSE} # Don't delete this chunk if you are using the mosaic package # This loads the mosaic and dplyr packages require(mosaic) options(digits = 5) ``` ```{r, include = FALSE} # knitr settings to control how R chunks work. require(knitr) opts_chunk$set( tidy = FALSE, # display code as typed size = "small" # slightly smaller font for code ) ``` ## Introduction and background This document is intended to help describe how to undertake analyses introduced as examples in the Fourth Edition of *Stats: Data and Models* (2014) by De Veaux, Velleman, and Bock. More information about the book can be found at http://wps.aw.com/aw_deveaux_stats_series. This file as well as the associated R Markdown reproducible analysis source file used to create it can be found at http://nhorton.people.amherst.edu/sdm4. This work leverages initiatives undertaken by Project MOSAIC (http://www.mosaic-web.org), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum. In particular, we utilize the `mosaic` package, which was written to simplify the use of R for introductory statistics courses. A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignettes (http://cran.r-project.org/web/packages/mosaic). A paper describing the mosaic approach was published in the *R Journal*: https://journal.r-project.org/archive/2017/RJ-2017-024. ## Chapter 27: Multifactor Analysis of Variance `bwplot` gives the two boxplots from figure 27.1 seen on page 782. ```{r} Darts <- read.csv("http://nhorton.people.amherst.edu/sdm4/data/Ch29_Darts.csv") gf_boxplot(Accuracy ~ Distance, data = Darts) gf_boxplot(Accuracy ~ Hand, data = Darts) ``` ### Section 27.1: A Two-Factor ANOVA Model The `summary` function gives a numerical summary of the linear model seen on page 784. ```{r} lmDarts <- lm(Accuracy ~ Distance + Hand, data = Darts) summary(aov(lmDarts)) ``` ### Section 27.2: Assumptions and Conditions Figures 27.2 and 27.3 showing hand and distance against accuracy (as well as adjusted hand and adjusted distance). Note that the scales on the adjusted plots are centered around 0 rather than the mean as we are working with residuals. ```{r} lmHand <- lm(Accuracy ~ Hand, data = Darts) lmDistance <- lm(Accuracy ~ Distance, data = Darts) gf_boxplot(Accuracy ~ Hand, data = Darts) gf_boxplot(resid(lmDistance) ~ Hand, data = Darts) gf_boxplot(Accuracy ~ Distance, data = Darts) gf_boxplot(resid(lmHand) ~ Distance, data = Darts) mod1 <- lm(Accuracy ~ Distance + Hand + Distance*Hand, data = Darts) plotModel(mod1) ``` ### Section 27.3: Interactions Shown below is the interaction plot (figure 27.9) and ANOVA summary on page 797. ```{r} mod2 <- lm(Accuracy ~ Hand + Distance + Hand*Distance, data = Darts) plotModel(mod2) lmDarts <- lm(Accuracy ~ Distance*Hand, data = Darts) summary(aov(lmDarts)) ```