At the end of this exercise, you will be able to:
1. Review how to make bar plots, scatter plots, and box plots using ggplot.
2. Use aesthetics to improve readability of plots.
library("tidyverse")
library("janitor")
We have already learned the basics of ggplot, but let’s review the plot types we have learned thus far. We will use the mammal life history data to practice. The data are from: S. K. Morgan Ernest. 2003. Life history characteristics of placental non-volant mammals. Ecology 84:3402.
life_history <- read_csv("data/mammal_lifehistories_v2.csv", na="-999") %>%
  clean_names()
Recall that geom_bar is used when you want ggplot to count the
number of observations in each level of a categorical variable, so it
needs only an x aesthetic. geom_col is used when you have already
counted the observations (or otherwise have a defined x and y) and
want to plot those values as they are.
Make two bar plots showing the number of observations for each order of mammals using both geom types.
geom_bar
life_history %>%
  ggplot(aes(x=order))+
  geom_bar()+
  coord_flip()
geom_col
life_history %>%
  count(order, sort=TRUE)
## # A tibble: 17 × 2
## order n
## <chr> <int>
## 1 Rodentia 665
## 2 Carnivora 197
## 3 Artiodactyla 161
## 4 Primates 156
## 5 Insectivora 91
## 6 Cetacea 55
## 7 Lagomorpha 42
## 8 Xenarthra 20
## 9 Perissodactyla 15
## 10 Macroscelidea 10
## 11 Pholidota 7
## 12 Scandentia 7
## 13 Sirenia 5
## 14 Hyracoidea 4
## 15 Dermoptera 2
## 16 Proboscidea 2
## 17 Tubulidentata 1
life_history %>%
  count(order, sort=TRUE) %>%
  ggplot(aes(x=order, y=n))+
  geom_col()+
  coord_flip()
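As a quick check that the two geoms draw the same bars, the counts that geom_bar() computes internally can be compared with the output of count(). This is a minimal sketch on a small made-up tibble (`toy` is hypothetical and not part of the mammal data). It uses layer_data(), a ggplot2 helper that returns the data a layer actually plots.

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data standing in for the `order` column
toy <- tibble(order = c("Rodentia", "Rodentia", "Carnivora",
                        "Primates", "Rodentia", "Carnivora"))

# geom_bar() counts the rows in each category itself
p_bar <- ggplot(toy, aes(x = order)) +
  geom_bar()

# geom_col() plots counts that were computed beforehand
toy_counts <- count(toy, order)
p_col <- ggplot(toy_counts, aes(x = order, y = n)) +
  geom_col()

# Bar heights used by each plot
bar_heights <- layer_data(p_bar)$count
col_heights <- layer_data(p_col)$y
```

If the two vectors match, both plots show the same bars. The only difference is where the counting happens.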
What if we wanted a bar plot of the mean mass for each order?
life_history %>%
  group_by(order) %>%
  summarize(mean_mass=mean(mass, na.rm=TRUE)) %>%
  arrange(desc(mean_mass))
## # A tibble: 17 × 2
## order mean_mass
## <chr> <dbl>
## 1 Cetacea 9830457.
## 2 Proboscidea 3342500
## 3 Sirenia 1169400
## 4 Perissodactyla 694487.
## 5 Artiodactyla 115843.
## 6 Tubulidentata 60000
## 7 Carnivora 43382.
## 8 Pholidota 7980
## 9 Xenarthra 7238.
## 10 Primates 5145.
## 11 Hyracoidea 3031.
## 12 Lagomorpha 1702.
## 13 Dermoptera 1000
## 14 Rodentia 637.
## 15 Scandentia 389.
## 16 Insectivora 133.
## 17 Macroscelidea 124.
life_history %>%
  group_by(order) %>%
  summarize(mean_mass=mean(mass, na.rm=TRUE)) %>%
  arrange(desc(mean_mass)) %>%
  ggplot(aes(x=order, y=mean_mass))+
  geom_col()+
  coord_flip()
There are a few problems here. First, the mass axis is labeled in scientific notation. (Because of coord_flip(), the y aesthetic is drawn along the horizontal axis.) We can fix this by adjusting the options for the session.
options(scipen=999) #cancels scientific notation for the session
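Changing options() affects every number printed for the rest of the session. An alternative that touches only one plot is to format the axis labels directly. The sketch below assumes the scales package (installed alongside ggplot2) and uses made-up values in a hypothetical tibble `toy_mass`.

```r
library(ggplot2)
library(dplyr)

# Hypothetical large values, for illustration only
toy_mass <- tibble(order = c("A", "B", "C"),
                   mean_mass = c(1200000, 3400000, 9800000))

p <- ggplot(toy_mass, aes(x = order, y = mean_mass)) +
  geom_col() +
  coord_flip() +
  # label_comma() returns a function that formats numbers with commas
  scale_y_continuous(labels = scales::label_comma())

# The formatter can also be called on its own to preview the labels
fmt <- scales::label_comma()
fmt(c(1200000, 9800000))
```

This keeps the change local to the plot, which may be preferable when other output in the same session should still use the default number formatting.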
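Next, the mean masses span several orders of magnitude (from about
124 for Macroscelidea to nearly 10 million for Cetacea), so on a linear
scale most of the bars are too short to compare. We can fix this by
putting the axis on a log scale with scale_y_log10().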
life_history %>%
  group_by(order) %>%
  summarize(mean_mass=mean(mass, na.rm=TRUE)) %>%
  arrange(desc(mean_mass)) %>%
  ggplot(aes(x=order, y=mean_mass))+
  geom_col()+
  coord_flip()+
  scale_y_log10()
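Lastly, the bars are still ordered alphabetically rather than by mean
mass, which makes them hard to compare. arrange() sorts the table, but it
does not change the order in which ggplot draws a categorical axis. We
fix this by using reorder() inside aes().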
life_history %>%
  group_by(order) %>%
  summarize(mean_mass=mean(mass, na.rm=TRUE)) %>%
  arrange(desc(mean_mass)) %>%
  ggplot(aes(x=reorder(order, mean_mass), y=mean_mass))+
  geom_col()+
  coord_flip()+
  scale_y_log10()
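To see what reorder() does on its own, it can help to look at the factor levels it produces, since ggplot draws a categorical axis in level order. The sketch below uses three orders with mean masses rounded from the summary table above. The vectors `orders` and `mean_mass` are stand-ins created for illustration.

```r
# Three orders and their (rounded) mean masses, for illustration
orders <- c("Cetacea", "Rodentia", "Primates")
mean_mass <- c(9830457, 637, 5145)

# Without reorder(), the levels (and so the axis) are alphabetical
levels(factor(orders))

# reorder() sorts the levels by the second argument, smallest first
levels(reorder(orders, mean_mass))

# Negating the values reverses the order, largest first
levels(reorder(orders, -mean_mass))
```

Passing `-mean_mass` is one simple way to flip the bar order when the largest group should be drawn first.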
Scatter plots allow for comparisons of two continuous variables. Make a scatterplot that compares gestation time and wean mass.
life_history %>%
  ggplot(aes(x=gestation, y=wean_mass))+
  geom_point(na.rm=TRUE)+
  scale_y_log10()+
  geom_smooth(method=lm, se=FALSE, na.rm=TRUE)
## `geom_smooth()` using formula = 'y ~ x'
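Box plots are used to visualize the distribution of a continuous variable within each level of a categorical variable. The categorical variable goes on the x-axis, and for each category the box shows the median and quartiles of the y values, with whiskers and points marking the rest of the range. Make a box plot that compares mass across orders.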
life_history %>%
  ggplot(aes(x=order, y=log10(mass)))+ # another way of scaling: log10() transforms the data itself, not the axis
  geom_boxplot(na.rm=TRUE)+
  coord_flip()
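The two ways of log-scaling used so far, log10() inside aes() and scale_y_log10(), can be compared side by side. The sketch below uses a made-up tibble (`toy` and its values are hypothetical). In both cases the box statistics are computed on the log10 values. The visible difference is the axis: the first shows log10 values, while the second is labeled in the original units.

```r
library(ggplot2)
library(dplyr)

# Hypothetical masses for two made-up groups
toy <- tibble(order = rep(c("A", "B"), each = 5),
              mass = c(10, 20, 50, 100, 200,
                       1000, 2000, 5000, 10000, 20000))

# Option 1: transform the data; the axis shows log10(mass)
p_transformed <- ggplot(toy, aes(x = order, y = log10(mass))) +
  geom_boxplot() +
  coord_flip()

# Option 2: transform the scale; the axis is labeled in original units
p_scaled <- ggplot(toy, aes(x = order, y = mass)) +
  geom_boxplot() +
  scale_y_log10() +
  coord_flip()

# Medians drawn by each plot, both on the log10 scale
medians_transformed <- layer_data(p_transformed)$middle
medians_scaled <- layer_data(p_scaled)$middle
```

Transforming the scale is often easier for readers, because the tick labels stay in the units of the data.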
Now that we have practiced scatter plots, bar plots, and box plots, we need to learn how to adjust their appearance to suit our needs. Let’s start with labeling the x and y axes.
names(life_history)
## [1] "order" "family" "genus" "species" "mass"
## [6] "gestation" "newborn" "weaning" "wean_mass" "afr"
## [11] "max_life" "litter_size" "litters_year"
Is there a relationship between mass and litter size; i.e. do larger mammals have more offspring?
life_history %>%
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=TRUE)+
  scale_x_log10()+
  geom_smooth(method=lm, se=FALSE, na.rm=TRUE)
## `geom_smooth()` using formula = 'y ~ x'
The plot looks clean, but it is incomplete. A reader unfamiliar with
the data might have a difficult time interpreting the default axis
labels, which are just the column names. To add a title and custom axis
labels, we use the labs() function.
life_history %>%
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=TRUE)+
  scale_x_log10()+
  geom_smooth(method=lm, se=FALSE, na.rm=TRUE)+
  labs(title="Mass vs. Litter Size",
       x="Mass (g)",
       y="Litter Size")
## `geom_smooth()` using formula = 'y ~ x'
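We can adjust the plot further by specifying the size and face of the
text. We do this using theme() together with element_text(), which
accepts arguments such as size, face, and hjust. rel() sets a size
relative to the theme's base text size, so rel(2) makes the title twice
as large while keeping it in proportion if the base size changes. hjust
controls the horizontal position of the title: 0 is left-aligned, 0.5 is
centered, and 1 is right-aligned.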
life_history %>%
  ggplot(aes(x=mass, y=litter_size))+
  geom_point(na.rm=TRUE)+
  scale_x_log10()+
  geom_smooth(method=lm, se=FALSE, na.rm=TRUE)+
  labs(title="Mass vs. Litter Size",
       x="Mass (g)",
       y="Litter Size")+
  theme(plot.title=element_text(size=rel(2), hjust=.5))
## `geom_smooth()` using formula = 'y ~ x'
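The same element_text() arguments apply to other text in the plot. The sketch below is one possible variation, using a small made-up tibble (`toy` is hypothetical) so it runs on its own. It sets the face of the title and also adjusts the axis titles and tick labels; the specific sizes and faces are arbitrary choices for illustration.

```r
library(ggplot2)
library(dplyr)

# Hypothetical values, only to give the plot something to draw
toy <- tibble(mass = c(10, 100, 1000, 10000),
              litter_size = c(6, 4, 2, 1))

p <- ggplot(toy, aes(x = mass, y = litter_size)) +
  geom_point() +
  scale_x_log10() +
  labs(title = "Mass vs. Litter Size",
       x = "Mass (g)",
       y = "Litter Size") +
  theme(
    # Title: twice the base size, bold, centered
    plot.title = element_text(size = rel(2), face = "bold", hjust = 0.5),
    # Axis titles: slightly larger and italic
    axis.title = element_text(size = rel(1.25), face = "italic"),
    # Tick labels: left at the base size
    axis.text = element_text(size = rel(1))
  )
```

Valid values for face are "plain", "italic", "bold", and "bold.italic".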