---
title: "ggplot2 example: NASA weather (sample solution)"
author: "Nicholas Horton (nhorton@amherst.edu)"
date: "June 12, 2021"
output:
  pdf_document:
    fig_height: 7
    fig_width: 8
  html_document:
    fig_height: 3
    fig_width: 5
  word_document:
    fig_height: 3
    fig_width: 5
---

```{r, include=FALSE}
library(tidyverse)
library(knitr)
opts_chunk$set(
  tidy=FALSE,     # display code as typed
  size="small"    # slightly smaller font for code
)
```

## Introduction

This document gives you an opportunity to practice generating visualizations with `ggplot2`.

Let's begin by exploring the dataset.

```{r message = FALSE}
library(tidyverse)
library(nasaweather)
```

```{r}
glimpse(storms)
```

We see that there are nearly 3,000 rows. Each row is a six-hourly update recording the date, location, pressure, wind speed, type, and name of a storm.

### Simple scatterplot

First, create a scatterplot of `wind` (y-axis) against `pressure` (x-axis) using the `geom_point()` function.

```{r}
ggplot(
  data = storms,
  aes(x = pressure, y = wind)
) +
  geom_point()
```

### Distinguishing type of storm

Next, recreate the scatterplot of `wind` against `pressure`, but add color to distinguish the `type` of storm (remember that this is specified in the call to `aes()`).

```{r}
ggplot(
  data = storms,
  aes(x = pressure, y = wind, color = type)
) +
  geom_point()
```

### Adjusting alpha levels

The plot is okay, but many points are plotted on top of one another (overplotting). Regenerate the plot after adding the option `alpha = 0.50` to the `geom_point()` call.

```{r}
ggplot(
  data = storms,
  aes(x = pressure, y = wind, color = type)
) +
  geom_point(alpha = 0.5)
```

### Jittering points

It's easier to see now where there are multiple points at a given location. By jittering points we can add a bit of random noise, which makes overlapping points easier to tell apart. Regenerate the plot by specifying `jitter(wind)` instead of `wind`.
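Before regenerating the plot, it may help to see what `jitter()` does to a handful of tied values. The sketch below uses made-up wind speeds (`wind_example` is a hypothetical vector, not taken from `storms`) and fixes the random seed only so the output is reproducible.

```{r}
set.seed(2021)  # arbitrary seed, only so the noise is reproducible

# Hypothetical tied wind speeds (knots), similar to the repeated values in storms
wind_example <- c(35, 35, 35, 40, 40)

# jitter() adds a small amount of uniform random noise to each value
jittered <- jitter(wind_example)

# Compare the original and jittered values side by side
tibble::tibble(wind_example, jittered)
```

The tied values are now slightly different from one another, which is what separates overlapping points in the scatterplot.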
```{r}
ggplot(
  data = storms,
  aes(x = pressure, y = jitter(wind), color = type)
) +
  geom_point(alpha = 0.5)
```

### Cleaning up labels

It's important to spend some time improving our plot so that it can stand on its own. One important aspect is the axis labels. Check out the documentation for the `labs()` function and the `storms` dataset to find the units for wind and pressure, then specify x, y, and title labels.

```{r}
ggplot(
  data = storms,
  aes(x = pressure, y = jitter(wind), color = type)
) +
  geom_point(alpha = 0.5) +
  labs(
    x = "air pressure at storm center (millibars)",
    y = "maximum sustained wind speed (knots)",
    title = "Association between pressure and wind speed"
  )
```

### Challenge

The following plot demonstrates a more advanced use of `ggplot2`. It uses data from the `nasaweather` package and the `geom_path()` function to plot the path of each tropical storm in the `storms` data table. Can you find a way to improve this display?

```{r}
bbox <- storms %>%
  select(lat, long) %>%
  map(range) %>%
  bind_rows()

base_map <- ggplot(data = map_data("world"), aes(x = long, y = lat)) +
  geom_path(aes(group = group), color = "black", size = 0.1) +
  lims(x = pull(bbox, long), y = pull(bbox, lat))

storms <- storms %>%
  mutate(the_date = lubridate::ymd(paste(year, month, day)))

base_map +
  geom_path(
    data = storms,
    aes(color = name, alpha = 0.1),
    arrow = arrow(length = unit(0.05, "inches"))
  ) +
  facet_wrap(~year) +
  theme(legend.position = "none")
```
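One possible direction is sketched below; it is not the only or the intended answer. It makes three changes: it sets `alpha` as a fixed value outside `aes()` rather than mapping a constant, it zooms with `coord_quickmap()` instead of `lims()` so that coastline points outside the window are not dropped, and it adds labels. The names `lat_range`, `long_range`, and `improved_map` are introduced here for illustration. The sketch assumes the `maps` package is installed (needed by `map_data()`) and ggplot2 3.4.0 or later for the `linewidth` argument; on older versions, `size` plays that role.

```{r}
library(tidyverse)
library(nasaweather)

# Range of storm positions, used to zoom the map
lat_range  <- range(nasaweather::storms$lat)
long_range <- range(nasaweather::storms$long)

improved_map <- ggplot(data = map_data("world"), aes(x = long, y = lat)) +
  # Coastlines in a lighter color so the storm paths stand out
  geom_path(aes(group = group), color = "grey50", linewidth = 0.1) +
  # One path per storm; alpha is a fixed setting, not an aesthetic mapping
  geom_path(
    data = nasaweather::storms,
    aes(group = name, color = name),
    alpha = 0.6,
    arrow = arrow(length = unit(0.05, "inches"))
  ) +
  # Zoom without discarding map data, with an approximate map aspect ratio
  coord_quickmap(xlim = long_range, ylim = lat_range) +
  facet_wrap(~year) +
  labs(
    x = "longitude",
    y = "latitude",
    title = "Paths of tropical storms, by year"
  ) +
  theme(legend.position = "none")

improved_map
```

Other reasonable improvements include coloring by `type` (with a legend) instead of by `name`, or labeling the most intense storms directly.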