---
title: "IS5 in R: Paired Samples and Blocks (Chapter 18)"
author: "Nicholas Horton (nhorton@amherst.edu)"
date: "2025-01-20"
date-format: iso
format: pdf
toc: true
editor: source
---

```{r}
#| label: setup
#| include: false
library(mosaic)   
library(tidyverse)
```


## Introduction and background 

This document is intended to help describe how to undertake analyses introduced as examples in the Fifth Edition of *Intro Stats* (2018) by De Veaux, Velleman, and Bock.
This file as well as the associated Quarto reproducible analysis source file used to create it can be found at http://nhorton.people.amherst.edu/is5.

This work leverages initiatives undertaken by Project MOSAIC (http://www.mosaic-web.org), an NSF-funded effort to improve the teaching of statistics, calculus, science and computing in the undergraduate curriculum.
In particular, we utilize the `mosaic` package, which was written to simplify the use of R for introductory statistics courses.
A short summary of the R needed to teach introductory statistics can be found in the mosaic package vignettes (https://cran.r-project.org/web/packages/mosaic).
A paper describing the mosaic approach was published in the *R Journal*: https://journal.r-project.org/archive/2017/RJ-2017-024.

We begin by loading packages that will be required for our analyses.

```{r}
library(mosaic)
library(tidyverse)
library(tidyr) # for the pivot_longer() function
```

## Chapter 18: Paired Samples and Blocks

```{r}
#| message: false
#| warning: false
Dexterity <- read_csv("http://nhorton.people.amherst.edu/is5/data/Dexterity.csv") |>
  janitor::clean_names()
```

By default, the `read_csv()` function prints the variable names.
These messages have been suppressed using the `message: false` code chunk option to save space and improve readability. 
Here we use the `clean_names()` function from the `janitor` package to sanitize the names of the columns (which would otherwise contain special characters or whitespace). 

```{r}
Dexterity |>
  select(age_months, dominant_6, non_dominant_2, gender) |>
  head(n = 7)
```

### Section 18.1: Paired Data

```{r}
#| warning: false
# Figure 18.1
Dexterity |>
  select(dominant_6, non_dominant_2) |>
  rename(dominant = dominant_6, non_dominant = non_dominant_2) |>
  tidyr::pivot_longer(
    dominant:non_dominant,
    names_to = "hand_type", 
    values_to = "speed"
  ) |>
  gf_boxplot(speed ~ hand_type, fill = ~ hand_type) |>
  gf_labs(x = "", y = "Speed (cyl/sec)") +
  ylim(0.2, 0.9) +
  guides(fill = FALSE)
```

The `pivot_longer()` function is used to reshape the data from wide to long format.
This is a powerful and flexible function to convert datasets to the format needed to display or model them.

#### Example 18.1: Identifying Paired Data

We begin by creating the data set on page 586.

```{r}
WorkWeek <- bind_rows(
  data.frame(name = "Jeff", fiveday = 2798, fourday = 2914),
  data.frame(name = "Betty", fiveday = 7724, fourday = 6112),
  data.frame(name = "Roger", fiveday = 7505, fourday = 6177),
  data.frame(name = "Tom", fiveday = 838, fourday = 1102),
  data.frame(name = "Aimee", fiveday = 4592, fourday = 3281),
  data.frame(name = "Greg", fiveday = 8107, fourday = 4997),
  data.frame(name = "Larry G.", fiveday = 1228, fourday = 1695),
  data.frame(name = "Tad", fiveday = 8718, fourday = 6606),
  data.frame(name = "Larry M.", fiveday = 1097, fourday = 1063),
  data.frame(name = "Leslie", fiveday = 8089, fourday = 6392),
  data.frame(name = "Lee", fiveday = 3807, fourday = 3362)
)
WorkWeek
```

Looking at pairwise differences in `Dexterity`.

```{r}
Dexterity |>
  select(dominant_6, non_dominant_2) |>
  mutate(difference = dominant_6 - non_dominant_2) |>
  head(n = 18)
```

### Section 18.2: The Paired *t*-Test

#### Example 18.2: Checking Assumptions and Conditions

We can display the distribution of the differences (see page 588).

```{r}
WorkWeek <- WorkWeek |>
  mutate(difference = fiveday - fourday)
gf_histogram(~ difference, data = WorkWeek, binwidth = 2000, center = 1000) |>
  gf_labs(x = "Difference (mi)", y = "# of Workers")
```

#### Example 18.3: Doing a Paired *t*-Test

It is straightforward to carry out the paired t-test.

```{r}
t.test(~ difference, data = WorkWeek)
```

or do the same by "hand" within R:

```{r}
nwork <- nrow(WorkWeek)
nwork # number of pairs
dwork <- mean(~ difference, data = WorkWeek)
dwork # mean of differences
swork <- sd(~ difference, data = WorkWeek)
swork # SD of differences

sework <- swork / (nwork^.5)
sework # SE of differences

twork <- (dwork - 0) / sework
twork # t stat

2 * xpt(twork, df = nwork - 1, lower.tail = FALSE)
```

The `xpt()` function finds the p-value and plots it on a graph to visualize it.
Here, the visualization shows a one-sided test, but in the book, it is two sided.

### Section 18.3: Confidence Intervals for Matched Pairs

We begin by reading the data.

```{r}
#| message: false
Couples <- read_csv("http://nhorton.people.amherst.edu/is5/data/Couples.csv") |>
  filter(wAge != "*") |>
  mutate(wAge = as.numeric(wAge))
# table on page 592
Couples |>
  select(wAge, hAge) |>
  mutate(difference = hAge - wAge) |>
  head(n = 7)
```

#### Step-By-Step Example: A Paired *t*-Interval

We replicate the example from page 593.

```{r}
DexData <- Dexterity |>
  select(dominant_6, non_dominant_2) |>
  mutate(difference = dominant_6 - non_dominant_2) |>
  filter(dominant_6 < 1)
# For some reason, the book has removed one observation where dominant_6 = 1,
# but has kept the count of children at 93 instead of 92
gf_histogram(~ difference, data = DexData, binwidth = .05, center = .025) |>
  gf_labs(x = "Dominant-Non-dominant", y = "# Subjects")
```

Here we display the calculations using the `t.test()` function and then by hand.


```{r}
df_stats(~ difference, data = DexData)
t.test(~ difference, data = DexData)
```

```{r}
ndex <- nrow(DexData) + 1 # the book kept n at 93 for some reason
ndex # number of pairs (children)
ddex <- mean(~ difference, data = DexData)
ddex # mean difference
sdex <- sd(~ difference, data = DexData)
sdex # standard deviation of the differences
sedex <- sdex / (ndex^.5)
sedex # standard error of the differences
df <- ndex - 1
df # this matches the book, but it should be 91
tstats <- qt(p = c(.025, .975), df = df)
tstats
medex <- tstats * sedex
medex # margin of error of the differences

ddex + medex

# Or, if you don't want to go through all those calculations:
t.test(~ difference, data = DexData, df = df)
```

#### Effect Size  

#### Example 18.4: Looking at Effect Size with a Paired *t* Confidence Interval

We can verify the calculations from the example.

```{r}
tstats <- qt(p = c(.025, .975), df = nwork - 1)
tstats

me <- tstats * sework
me # margin of error

dwork + me # confidence interval
```

### Section 18.4: Blocking

#### What's Independent?

#### Random Matters: A Bootstrapped Paired Data Confidence Interval and Hypothesis Test

Our usual approach to bootstrapping works here.

```{r}
set.seed(2345)
num_sim <- 5000

# What does do() do?
mean(~ difference, data = resample(DexData)) # One mean of a random resample
mean(~ difference, data = resample(DexData)) # Another mean of a random resample

do(2) * mean(~ difference, data = resample(DexData)) # Calculates two means

# We need num_sim means
DexBoots <- do(num_sim) * mean(~ difference, data = resample(DexData))
```

For more information about `resample()`, refer to the resampling vignette: https://cran.r-project.org/web/packages/mosaic/vignettes/Resampling.html


```{r}
#| warning: false
qdata(~ mean, p = c(.025, .975), data = DexBoots)
DexBoots <- DexBoots |>
  mutate(interval = ifelse(
    mean > 0.0245 & mean < 0.0783, 
    "Within 95% Confidence",
    "Outside 95% Confidence"
  ))

# Figure 18.4, page 597
gf_histogram(
  ~ mean, 
  fill = ~interval, 
  data = DexBoots, 
  binwidth = 0.002, 
  center = 0.001
) |>
  gf_vline(xintercept = 0.0245) |>
  gf_vline(xintercept = 0.0783) |>
  gf_labs(x = "Differences of Bootstrapped Means", y = "# of Trials") +
  guides(fill = FALSE)

# Figure 18.5
gf_histogram(~ (mean - ddex), data = DexBoots, binwidth = .002, center = .001) |>
  gf_vline(xintercept = ddex) |>
  gf_vline(xintercept = -ddex) |>
  gf_labs(x = "Differences of Bootstrapped Means - Actual Difference", y = "# of Trials")
```

```{r}
df_stats(~ (mean - ddex), data = DexBoots)
```

With `df_stats()`, we can see that our minimum is within the interval, but our maximum isn't.

```{r}
DexBoots |>
  filter((mean - ddex) > ddex)
```

Like the book, there is one instance (out of 5,000), so we estimate the P-value as 1/5,000 (the book says 50,000, which is incorrect), or .0002.