Learning Goals

At the end of this exercise, you will be able to:
1. Make barplots, scatterplots, and boxplots using ggplot.
2. Use aesthetics to improve readability of plots.

Load the libraries

library("tidyverse")
library("janitor")

Review

We have already learned the basics of ggplot, but let’s review the plot types we have learned thus far. We will use the mammal life history data to practice. The data are from: S. K. Morgan Ernest. 2003. Life history characteristics of placental non-volant mammals. Ecology 84:3402.

life_history <- read_csv("data/mammal_lifehistories_v2.csv", na="-999") %>% 
  clean_names()

Bar Plots

Recall that geom_bar counts the number of observations in each level of a categorical variable for you, so it needs only an x aesthetic. geom_col plots the values you give it, so use it when you have already counted the observations (or otherwise have a defined x and y) and want to plot those values.

Make two bar plots showing the number of observations for each order of mammals using both geom types.

geom_bar

life_history %>% 
  ggplot(aes(x=order))+
  geom_bar()+
  coord_flip()

geom_col

life_history %>% 
  count(order, sort=T)
## # A tibble: 17 × 2
##    order              n
##    <chr>          <int>
##  1 Rodentia         665
##  2 Carnivora        197
##  3 Artiodactyla     161
##  4 Primates         156
##  5 Insectivora       91
##  6 Cetacea           55
##  7 Lagomorpha        42
##  8 Xenarthra         20
##  9 Perissodactyla    15
## 10 Macroscelidea     10
## 11 Pholidota          7
## 12 Scandentia         7
## 13 Sirenia            5
## 14 Hyracoidea         4
## 15 Dermoptera         2
## 16 Proboscidea        2
## 17 Tubulidentata      1
life_history %>% 
  count(order, sort=T) %>% 
  ggplot(aes(x=order, y=n))+
  geom_col()+
  coord_flip()
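
To check that the two approaches agree, here is a minimal sketch using a small made-up data frame (the `toy` object and its values are hypothetical, not taken from the mammal data). It builds both plots and uses `layer_data()` to pull out the bar heights that ggplot2 computed for each one.

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data: one row per observation
toy <- tibble(order = c("Rodentia", "Rodentia", "Carnivora",
                        "Primates", "Rodentia", "Carnivora"))

# geom_bar() counts the rows in each order for us
p_bar <- toy %>%
  ggplot(aes(x = order)) +
  geom_bar()

# geom_col() plots counts that we computed ourselves with count()
p_col <- toy %>%
  count(order) %>%
  ggplot(aes(x = order, y = n)) +
  geom_col()

# Bar heights computed for the first (and only) layer of each plot
bar_heights <- layer_data(p_bar)$y
col_heights <- layer_data(p_col)$y

# Compare the two sets of heights
all(sort(bar_heights) == sort(col_heights))
```

If the heights match, the choice between the two geoms comes down to whether the counting has already been done.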

What if we wanted a bar plot of the mean mass for each order?

life_history %>% 
  group_by(order) %>% 
  summarize(mean_mass=mean(mass, na.rm=T)) %>% 
  arrange(desc(mean_mass))
## # A tibble: 17 × 2
##    order          mean_mass
##    <chr>              <dbl>
##  1 Cetacea         9830457.
##  2 Proboscidea     3342500 
##  3 Sirenia         1169400 
##  4 Perissodactyla   694487.
##  5 Artiodactyla     115843.
##  6 Tubulidentata     60000 
##  7 Carnivora         43382.
##  8 Pholidota          7980 
##  9 Xenarthra          7238.
## 10 Primates           5145.
## 11 Hyracoidea         3031.
## 12 Lagomorpha         1702.
## 13 Dermoptera         1000 
## 14 Rodentia            637.
## 15 Scandentia          389.
## 16 Insectivora         133.
## 17 Macroscelidea       124.
life_history %>% 
  group_by(order) %>% 
  summarize(mean_mass=mean(mass, na.rm=T)) %>% 
  arrange(desc(mean_mass)) %>% 
  ggplot(aes(x=order, y=mean_mass))+
  geom_col()+
  coord_flip()

There are a few problems here. First, the mean mass axis (the y aesthetic, drawn horizontally because of coord_flip()) is labeled in scientific notation. We can fix this by adjusting the options for the session.

options(scipen=999) #cancels scientific notation for the session
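
As a rough illustration of what scipen does, the sketch below formats one large number under the default setting and under scipen=999, then restores the previous option. It also shows a per-plot alternative, which is an addition rather than part of the original lesson: `scales::label_comma()` formats a single axis without changing session options. The three-row `toy` data frame is hypothetical, with values rounded from the summary table above.

```r
library(ggplot2)

# Session-wide: a large scipen makes R prefer fixed notation everywhere
old <- options(scipen = 0)      # R's default setting
default_fmt <- format(1e10)     # "1e+10"
options(scipen = 999)
fixed_fmt <- format(1e10)       # "10000000000"
options(old)                    # restore whatever was set before

# Per-plot alternative: format only this axis using the scales package
toy <- data.frame(order = c("Cetacea", "Carnivora", "Rodentia"),
                  mean_mass = c(9830457, 43382, 637))

p <- ggplot(toy, aes(x = order, y = mean_mass)) +
  geom_col() +
  coord_flip() +
  scale_y_continuous(labels = scales::label_comma())
```

The session option is convenient for a whole notebook, while the scale-level label keeps the formatting attached to the plot itself.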

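Next, the y-axis is not on a log scale. Mean mass ranges from about 124 g (Macroscelidea) to nearly 10 million g (Cetacea), so on a linear scale the bars for most orders are too short to compare. We can fix this by adding scale_y_log10().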

life_history %>% 
  group_by(order) %>% 
  summarize(mean_mass=mean(mass, na.rm=T)) %>% 
  arrange(desc(mean_mass)) %>% 
  ggplot(aes(x=order, y=mean_mass))+
  geom_col()+
  coord_flip()+
  scale_y_log10()

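Lastly, we can order the bars by mean mass to make the plot more readable. By default, ggplot arranges a character variable alphabetically, and arrange() does not change that. We do this using reorder, which sorts the orders by the value of mean_mass.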

life_history %>% 
  group_by(order) %>% 
  summarize(mean_mass=mean(mass, na.rm=T)) %>% 
  arrange(desc(mean_mass)) %>% 
  ggplot(aes(x=reorder(order, mean_mass), y=mean_mass))+
  geom_col()+
  coord_flip()+
  scale_y_log10()
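
To see what reorder is doing, here is a small base R sketch. The vectors `orders` and `mean_mass` are hypothetical, holding three values rounded from the summary table above. reorder returns a factor whose levels are sorted by the second argument, and ggplot draws discrete axes in level order.

```r
# Hypothetical three-row summary (values rounded from the table above)
orders    <- c("Cetacea", "Rodentia", "Primates")
mean_mass <- c(9830457, 637, 5145)

# Without reorder(), the categories are placed alphabetically
default_levels <- levels(factor(orders))
default_levels   # "Cetacea" "Primates" "Rodentia"

# reorder() sorts the levels by mean_mass, smallest first
ascending_levels <- levels(reorder(orders, mean_mass))
ascending_levels   # "Rodentia" "Primates" "Cetacea"

# Negating the sorting variable reverses the order
descending_levels <- levels(reorder(orders, -mean_mass))
descending_levels   # "Cetacea" "Primates" "Rodentia"
```

In the plot, the same idea reads as `aes(x=reorder(order, -mean_mass), y=mean_mass)` if the opposite ordering is preferred.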

Scatterplots

Scatter plots allow for comparisons of two continuous variables. Make a scatterplot that compares gestation time and wean mass.

life_history %>% 
  ggplot(aes(x=gestation, y=wean_mass))+
  geom_point(na.rm=T)+
  scale_y_log10()+
  geom_smooth(method=lm, se=F, na.rm=T)
## `geom_smooth()` using formula = 'y ~ x'
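
One detail worth noting: scale_y_log10() transforms the data before geom_smooth() fits the model, so the line is a linear fit of log10(wean_mass) on gestation, not of wean_mass itself. The sketch below illustrates this with made-up data (the `toy` values are hypothetical and constructed to be exactly linear on the log scale) by comparing the smoothed line to a model fit by hand with lm().

```r
library(ggplot2)

# Hypothetical data: log10(wean_mass) rises by exactly 0.01 per unit of gestation
toy <- data.frame(gestation = c(20, 60, 100, 200, 300))
toy$wean_mass <- 10^(1 + 0.01 * toy$gestation)

p <- ggplot(toy, aes(x = gestation, y = wean_mass)) +
  geom_point() +
  scale_y_log10() +
  geom_smooth(method = "lm", se = FALSE)

# The same model fit by hand: a straight line in log10(wean_mass)
fit <- lm(log10(wean_mass) ~ gestation, data = toy)
coef(fit)

# Points along the smoothed line (layer 2), which are stored on the log10 scale
smooth_line <- layer_data(p, 2)
by_hand <- unname(predict(fit, data.frame(gestation = smooth_line$x)))
all.equal(smooth_line$y, by_hand)
```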

Boxplots

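Box plots are used to visualize the distribution of a continuous variable: the median, the quartiles, and any outliers. So, on the x-axis we have a categorical variable and on the y-axis the continuous variable whose spread we want to compare. Make a box plot that compares mass across orders.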

life_history %>% 
  ggplot(aes(x=order, y=log10(mass)))+ #another way of scaling
  geom_boxplot(na.rm=T)+
  coord_flip()
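
The two ways of scaling give the same boxes but different axis labels. The sketch below, which uses a hypothetical `toy` data frame, compares log10(mass) inside aes() with scale_y_log10(). In both cases the box statistics are computed on log10 values. The difference is that the first labels the axis in log units, while the second labels it in the original units (grams).

```r
library(ggplot2)

# Hypothetical masses (g) for two orders
toy <- data.frame(
  order = rep(c("Carnivora", "Rodentia"), each = 5),
  mass  = c(1000, 5000, 20000, 80000, 300000,
            10, 30, 100, 400, 2000)
)

# Option 1: transform the variable inside aes(); axis shows log10 units
p_aes <- ggplot(toy, aes(x = order, y = log10(mass))) +
  geom_boxplot()

# Option 2: transform the scale; axis shows the original units
p_scale <- ggplot(toy, aes(x = order, y = mass)) +
  geom_boxplot() +
  scale_y_log10()

# The medians computed for the boxes, both on the log10 scale
medians_aes   <- layer_data(p_aes)$middle
medians_scale <- layer_data(p_scale)$middle
all.equal(medians_aes, medians_scale)
```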

Aesthetics: Labels

Now that we have practiced scatter plots, bar plots, and box plots we need to learn how to adjust their appearance to suit our needs. Let’s start with labeling x and y axes.

names(life_history)
##  [1] "order"        "family"       "genus"        "species"      "mass"        
##  [6] "gestation"    "newborn"      "weaning"      "wean_mass"    "afr"         
## [11] "max_life"     "litter_size"  "litters_year"

Is there a relationship between mass and litter size? That is, do larger mammals have more offspring per litter?

life_history %>% 
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=T)+
  scale_x_log10()+
  geom_smooth(method=lm, se=F, na.rm=T)
## `geom_smooth()` using formula = 'y ~ x'

The plot looks clean, but it is incomplete. A reader unfamiliar with the data might have a difficult time interpreting the default axis labels, which are just the variable names. To add a title and custom axis labels, we use the labs() function.

life_history %>% 
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=T)+
  scale_x_log10()+
  geom_smooth(method=lm, se=F, na.rm=T)+
  labs(title="Mass vs. Litter Size",
       x="Mass (g)",
       y="Litter Size")
## `geom_smooth()` using formula = 'y ~ x'

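We can adjust the plot further by specifying the size and face of the text. We do this using theme(). The rel() function sets the size relative to the theme's base font size, so rel(2) makes the title twice as large and keeps it in proportion if the base size changes. Adding hjust controls the horizontal position of the title: 0 is left-aligned, 0.5 is centered, and 1 is right-aligned.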

life_history %>% 
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=T)+
  scale_x_log10()+
  geom_smooth(method=lm, se=F, na.rm=T)+
  labs(title="Mass vs. Litter Size",
       x="Mass (g)",
       y="Litter Size")+
  theme(plot.title=element_text(size=rel(2), hjust=.5))
## `geom_smooth()` using formula = 'y ~ x'
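
The example above sets only the size and position of the title. As a sketch of how the face and other text elements might be adjusted in the same way, the code below uses a hypothetical `toy` data frame and adds `face="bold"` along with relative sizes for the axis titles and axis text. The specific sizes are illustrative choices, not recommendations from the lesson.

```r
library(ggplot2)

# Hypothetical data standing in for the mammal life history table
toy <- data.frame(mass = c(20, 500, 4000, 60000),
                  litter_size = c(6, 4, 2, 1))

p <- ggplot(toy, aes(x = mass, y = litter_size)) +
  geom_point() +
  scale_x_log10() +
  labs(title = "Mass vs. Litter Size",
       x = "Mass (g)",
       y = "Litter Size") +
  theme(
    # Title: twice the base size, bold, centered
    plot.title = element_text(size = rel(2), face = "bold", hjust = 0.5),
    # Axis titles and tick labels: slightly larger than the base size
    axis.title = element_text(size = rel(1.2)),
    axis.text  = element_text(size = rel(1.1))
  )
```

Other values for face include "plain", "italic", and "bold.italic".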

That’s it! Let’s take a break and then move on to part 2!
