class: center, middle, inverse, title-slide

# Data Visualization
## Making your first plot
### Andrew Irwin, a.irwin@dal.ca
### Math & Stats, Dalhousie University
### 2021-01-13 (updated: 2021-01-01)

---
class: middle

# Use the ggplot2 library

Do this once (if you haven't done it already):

```
install.packages("tidyverse")
```

Add this line to every R markdown document:

```r
library(tidyverse)
```

---
class: middle

# Get some data

```r
str(mtcars)
```

```
## 'data.frame': 32 obs. of 11 variables:
##  $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num 160 160 108 258 360 ...
##  $ hp  : num 110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num 2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num 16.5 17 18.6 19.4 17 ...
##  $ vs  : num 0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num 1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
```

---
# First plot

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg)) +
  geom_point()
```

--

.pull-left[
The "pipe" symbol (`%>%`) is function composition. `f(g(x))` can be written `x %>% g %>% f`.

`aes` is a function to define aesthetic associations between features of your plot and variables in the dataset.

Parts of a ggplot are added together with `+`.

The kind of plot is called its "geometry". A scatterplot is `geom_point`.
]

--

.pull-right[
<img src="05-making-your-first-plot_files/figure-html/fig0-1.png" width="100%" style="display: block; margin: auto;" />
]

---
# What if you forget ... ?

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg))
```

--

.pull-left[
Try "forgetting" other parts of the code to see what goes wrong.
]

.pull-right[
<img src="05-making-your-first-plot_files/figure-html/fig1-1.png" width="672" height="80%" style="display: block; margin: auto;" />
]

---
# Add some colour

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg,
*            color = factor(cyl))) +
  geom_point()
```

--

.pull-left[
`cyl` is a number, so I must turn it into a categorical variable (factor) to get a discrete colour scale.
Try using `cyl` instead of `factor(cyl)`.
]

.pull-right[
<img src="05-making-your-first-plot_files/figure-html/fig2-1.png" width="768" height="80%" style="display: block; margin: auto;" />
]

---
# Make the text larger

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
*  theme(text = element_text(size = 28))
```

---
# Make the text larger

<img src="05-making-your-first-plot_files/figure-html/fig3-1.png" width="80%" height="80%" style="display: block; margin: auto;" />

---
# Make the symbols larger

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg, color = factor(cyl))) +
*  geom_point(size=3) +
  theme(text = element_text(size = 28))
```

--

I'm setting the size of all the symbols, not connecting a variable to the size aesthetic. So don't use `aes`.

---
# Make the symbols larger

<img src="05-making-your-first-plot_files/figure-html/fig3b-1.png" width="80%" height="80%" style="display: block; margin: auto;" />

---
# Customize the labels

```r
mtcars %>%
  ggplot(aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size=3) +
  theme(text = element_text(size = 28)) +
*  labs(x = "Car mass (x 1000 pounds)",
*       y = "Fuel consumption (mpg)",
*       color = "Number of\nCylinders",
*       title = "Car road test results",
*       caption = "from Motor Trend magazine")
```

---
# Customize the labels

<img src="05-making-your-first-plot_files/figure-html/fig4-1.png" width="80%" height="80%" style="display: block; margin: auto;" />

---
class: middle

# Summary

* Start with data
* Pipe (`%>%`) into `ggplot`
* Define the aesthetics: `aes(x = ..., y = ..., color = ..., shape = ...)`
* Define the geometry: `geom_point` shown here, but there are lots more
* Customize text

---
class: middle

# Suggested reading

* Course notes: Making your first plot
* Healy. Section 2.6. Make your first figure
* R4DS.
Chapter 3: Data visualization

---
class: middle, inverse

# Task and Assignment

* Try these plotting commands on your own
* Assignment 1: Your first plotting exercises
* Do Task 4 following Lesson 6 on git and GitHub first

---
# Datasets to experiment with

* mtcars, iris and many other well-known datasets in the datasets package
* penguins in the palmerpenguins package
* gapminder in the gapminder package (but see the website too: [Gapminder](https://gapminder.org))
* diamonds in the ggplot2 package
* flights in the nycflights13 package

Use `str(mtcars)` (or another dataset) and `View(mtcars)` to look at the data.

Use `?mtcars` to get documentation for the dataset, or use the search in the "Help" panel.
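---
# Try the recipe on another dataset

Here is a minimal sketch of the same steps (data, pipe, aesthetics, geometry, labels) applied to `iris` from the datasets package. It assumes the tidyverse is installed as described earlier. The choice of variables, the object name `p`, and the label text are illustrative assumptions, not part of the course notes.

```r
library(tidyverse)

# Look at the data first: four numeric measurements and one factor (Species)
str(iris)

# Species is already a factor, so no factor() call is needed for a discrete colour scale
p <- iris %>%
  ggplot(aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point(size = 3) +                      # fixed symbol size, so it stays outside aes()
  labs(x = "Sepal length (cm)",               # illustrative label text
       y = "Petal length (cm)",
       color = "Species",
       title = "Iris measurements")

# Printing the object draws the plot
p
```

Storing the plot in a variable is optional; piping straight into `ggplot` without an assignment, as in the earlier slides, draws the plot immediately.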