library(ggplot2)
library(fivethirtyeight)
Intro to ggplot2
R packages
To use R packages, they need to be installed (once per computer) and loaded (once per R session). If you haven’t already, you can install all the packages I think we might need for the semester by copy-pasting the code below into your Console.
install.packages(c("tidyverse", "babynames", "broom", "coefplot",
                   "cowplot", "devtools", "drat", "fueleconomy",
                   "fivethirtyeight", "formatR", "gapminder",
                   "GGally", "ggforce", "ggraph", "ggrepel",
                   "ggridges", "graphlayouts", "gridExtra",
                   "here", "hexbin", "interplot", "janitor",
                   "margins", "mgcv", "maps", "mapproj",
                   "nycflights13", "RColorBrewer", "rmarkdown",
                   "sf", "skimr", "usethis", "viridis", "viridisLite"))
A bunch of red text will scroll by, which means it’s working! Installing all the packages may take a few minutes; you’ll know the packages have finished installing when you see the R prompt (>) return in your Console.
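If you would rather not reinstall packages you already have, something like the sketch below checks first. The two package names are just the ones loaded in this document; swap in whatever subset of the list above you need.

```r
# Packages this document loads; extend the vector as needed.
wanted <- c("ggplot2", "fivethirtyeight")

# Names of packages already installed on this computer.
already_installed <- rownames(installed.packages())

# Keep only the ones that are not installed yet.
not_installed <- setdiff(wanted, already_installed)

# Install what is missing (does nothing if everything is present).
if (length(not_installed) > 0) {
  install.packages(not_installed)
}
```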
Setup
By convention, the first chunk in a Quarto document includes the R packages you want to load. Remember, in order to use an R package you have to run some library() code every session. You can run it by either pressing the green triangle button (I call it the play button), using the Run menu up at the top of this document, or putting your cursor on the code and pressing Command+Return (Mac) or Ctrl+Enter (PC). Execute these lines of code to load the packages we’ll need for today.
Loading data: Bechdel test
We’re going to start by playing with data collected by the website FiveThirtyEight on movies and the Bechdel test.
To begin, let’s just preview our data. There are a couple ways to do that. One is just to type the name of the data and execute it like a piece of code.
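# Load the dataset from the fivethirtyeight package
data("bechdel")
# Typing the name of the data prints a preview
bechdel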
Notice that you can page through to see more of the dataset.
Sometimes, people prefer to see their data in a more spreadsheet-like format, and RStudio provides a way to do that. If you click on the name of the dataset in the Environment, you will see it pop up in the same window as your Quarto doc!
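A few other ways to peek at the data, sketched with base R functions. Clicking the dataset name in the Environment does roughly what View() does, so that call is included here only for interactive sessions.

```r
library(fivethirtyeight)
data("bechdel")

# First ten rows only
head(bechdel, 10)

# One line per variable: name, type, and the first few values
str(bechdel)

# Spreadsheet-style viewer; skipped when the document is rendered
if (interactive()) {
  View(bechdel)
}
```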
Starting to visualize data
Consider
What relationship do you expect to see between movie budget (budget) and domestic gross (domgross)?
Your Turn 1: scatterplot
Run the code on the slide to make a graph. Pay strict attention to spelling, capitalization, and parentheses!
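The slide itself is not reproduced in this document. Assuming it plots the two variables from the Consider question, the code would look something like this sketch:

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# Scatterplot: one point per movie, budget on x, domestic gross on y
ggplot(data = bechdel) +
  geom_point(mapping = aes(x = budget, y = domgross))
```

R may warn that some rows were removed because of missing values; the graph is still drawn from the remaining rows.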
Your Turn 2: boxplots
Replace this scatterplot with one that draws boxplots. Use the cheatsheet. Try your best guess.
ggplot(data = bechdel) + geom_point(aes(x = clean_test, y = budget))
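One possible answer, in case you get stuck: swapping the geom function is enough, and the mappings can stay the same.

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# Same data and mappings as the scatterplot, but drawn as boxplots:
# one box of budget values per clean_test category
ggplot(data = bechdel) +
  geom_boxplot(mapping = aes(x = clean_test, y = budget))
```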
Your Turn 3: histogram
Make a histogram of the budget variable from bechdel.
Your Turn 4: histogram binwidth
Try to find a better binwidth for budget.
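A sketch of one way to set it. The binwidth argument is measured in the units of the x variable, so dollars here; the value of 10 million is only a starting guess, not a recommended answer.

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# Histogram of budget with bins that are each $10 million wide.
# Try larger and smaller values to see how the shape changes.
ggplot(data = bechdel) +
  geom_histogram(mapping = aes(x = budget), binwidth = 1e7)
```

Leaving binwidth out makes geom_histogram() fall back to 30 bins and print a message suggesting you pick a value yourself.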
Your Turn 5: barchart
Make a barchart of clean_test.
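One possible answer: geom_bar() counts the rows in each category for you, so only an x mapping is needed.

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# Bar chart: bar height is the number of movies in each clean_test category
ggplot(data = bechdel) +
  geom_bar(mapping = aes(x = clean_test))
```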
Your Turn 6: facets
Run the code on the slide to make a graph.
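The slide code is not included in this document. As an assumption about what a faceted graph might look like here, this sketch splits the budget and domestic gross scatterplot into one panel per Bechdel test result:

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# facet_wrap() draws the same scatterplot once for each value of clean_test
ggplot(data = bechdel) +
  geom_point(mapping = aes(x = budget, y = domgross)) +
  facet_wrap(vars(clean_test))
```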
Your Turn 7: faceting categorical variables
With your partner, brainstorm how you could use faceting to show the relationship between two categorical variables.
Rendering
Let’s try “Rendering” our document. To render, click the Render button at the top of this document (it has a blue arrow icon next to it). In this class, we will be rendering to HTML, but you can also render to PDF or Word.
Takeaways
You can use this code template to make thousands of graphs with ggplot2.
#| eval: false
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>)) +
  <FACET_FUNCTION>(vars(<VARIABLES>))
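To make the template concrete, here is one way to fill in every placeholder with today's data. The particular choices (a histogram of budget, split by the binary pass/fail column) are just an illustration; any dataset, geom, mapping, and facet variable can go in the same slots.

```r
library(ggplot2)
library(fivethirtyeight)
data("bechdel")

# <DATA>           -> bechdel
# <GEOM_FUNCTION>  -> geom_histogram
# <MAPPINGS>       -> x = budget
# <FACET_FUNCTION> -> facet_wrap
# <VARIABLES>      -> binary
ggplot(data = bechdel) +
  geom_histogram(mapping = aes(x = budget), binwidth = 1e7) +
  facet_wrap(vars(binary))
```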