Answer the following questions and/or complete the exercises in
RMarkdown. Please embed all of your code and push the final work to your
repository. Your report should be organized, clean, and run free from
errors. Remember, you must remove the # for any included
code chunks to run. Any plots must have appropriate titles and
aesthetics.
library("tidyverse")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.0 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
options(scipen = 999) # disable scientific notation
For this homework, we will be using the msleep dataset
from the ggplot2 package. This dataset contains information
about the sleep habits of various mammals. Reference:
V. M. Savage and G. B. West. A quantitative, theoretical framework for
understanding mammalian sleep. Proceedings of the National Academy of
Sciences, 104 (3):1051-1056, 2007.
1. Use a summary function of your choice to get an idea of the structure of the msleep data.
glimpse(msleep)
## Rows: 83
## Columns: 11
## $ name <chr> "Cheetah", "Owl monkey", "Mountain beaver", "Greater shor…
## $ genus <chr> "Acinonyx", "Aotus", "Aplodontia", "Blarina", "Bos", "Bra…
## $ vore <chr> "carni", "omni", "herbi", "omni", "herbi", "herbi", "carn…
## $ order <chr> "Carnivora", "Primates", "Rodentia", "Soricomorpha", "Art…
## $ conservation <chr> "lc", NA, "nt", "lc", "domesticated", NA, "vu", NA, "dome…
## $ sleep_total <dbl> 12.1, 17.0, 14.4, 14.9, 4.0, 14.4, 8.7, 7.0, 10.1, 3.0, 5…
## $ sleep_rem <dbl> NA, 1.8, 2.4, 2.3, 0.7, 2.2, 1.4, NA, 2.9, NA, 0.6, 0.8, …
## $ sleep_cycle <dbl> NA, NA, NA, 0.1333333, 0.6666667, 0.7666667, 0.3833333, N…
## $ awake <dbl> 11.9, 7.0, 9.6, 9.1, 20.0, 9.6, 15.3, 17.0, 13.9, 21.0, 1…
## $ brainwt <dbl> NA, 0.01550, NA, 0.00029, 0.42300, NA, NA, NA, 0.07000, 0…
## $ bodywt <dbl> 50.000, 0.480, 1.350, 0.019, 600.000, 3.850, 20.490, 0.04…
2. What are the names of the variables? To get a better idea
of what the variables mean, you can use
?msleep.
names(msleep)
## [1] "name" "genus" "vore" "order" "conservation"
## [6] "sleep_total" "sleep_rem" "sleep_cycle" "awake" "brainwt"
## [11] "bodywt"
?msleep
3. Make a plot that shows the number of mammals in each vore type. Which vore type is most represented in the data?
ggplot(msleep, aes(x = vore)) +
  geom_bar(mapping = aes(fill = vore)) +
  labs(title = "Number of Mammals by Vore Type",
       x = "Vore Type",
       y = "Number of Mammals") +
  # drop the NA category from the x-axis
  scale_x_discrete(na.translate = FALSE)
## Warning: Removed 7 rows containing non-finite outside the scale range
## (`stat_count()`).
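The bar heights can also be read off as a table, which makes the most represented vore type explicit. This is a minimal sketch, assuming dplyr is installed; the name `vore_counts` is illustrative and not part of the original assignment.

```r
library(dplyr)

# Count mammals per vore type, dropping rows where vore is missing,
# and sort so the most represented type comes first.
vore_counts <- ggplot2::msleep %>%
  filter(!is.na(vore)) %>%
  count(vore, sort = TRUE)

vore_counts
```

The first row of `vore_counts` names the most represented vore type. The counts exclude the 7 rows with a missing `vore`, which are the same rows the plot warning reports as removed.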
4. What is the average length of time that each mammal sleeps?
mean_sleep <- mean(msleep$sleep_total, na.rm = TRUE)
mean_sleep
## [1] 10.43373
5. Use summary() to get a quick idea of the
range of values for the variable bodywt.
summary(msleep$bodywt)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.005 0.174 1.670 166.136 41.750 6654.000
6. Build a plot that shows the range of bodywt
by vore type. Because the range of bodywt is so variable,
you will need to scale the y-axis. You do this by adding
scale_y_log10() as another layer.
ggplot(msleep, aes(x = vore, y = bodywt)) +
  geom_boxplot(mapping = aes(fill = vore)) +
  scale_y_log10() +
  labs(title = "Body Weight by Vore Type",
       x = "Vore Type",
       y = "Body Weight (kg, log scale)")
7. Is there a relationship between body weight and total sleep time? Add appropriate labels and a title to your plot.
ggplot(msleep, aes(x = bodywt, y = sleep_total)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  scale_x_log10() +
  labs(title = "Relationship between Body Weight and Total Sleep Time",
       x = "Body Weight (kg, log scale)",
       y = "Total Sleep Time (hours)")
## `geom_smooth()` using formula = 'y ~ x'
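To put a number on the trend in the scatterplot, one option is a Pearson correlation on the same log10 scale used for the x-axis. This is a rough sketch rather than a formal analysis; the name `sleep_cor` is illustrative.

```r
# Pearson correlation between log10 body weight and total sleep time.
# use = "complete.obs" skips any rows with a missing value in either variable.
sleep_cor <- cor(
  log10(ggplot2::msleep$bodywt),
  ggplot2::msleep$sleep_total,
  use = "complete.obs"
)

sleep_cor
```

The sign of `sleep_cor` gives the direction of the relationship, and a value closer to -1 or 1 indicates a stronger linear association on the log scale.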
8. Which vore type tends to sleep the most? Build a plot to support your answer.
ggplot(msleep, aes(x = vore, y = sleep_total)) +
  geom_boxplot(mapping = aes(fill = vore)) +
  labs(title = "Total Sleep Time by Vore Type",
       x = "Vore Type",
       y = "Total Sleep Time (hours)")
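A grouped summary can back up what the boxplot shows. The sketch below assumes dplyr is available; `sleep_by_vore` and its column names are illustrative choices, not part of the original assignment.

```r
library(dplyr)

# Median and mean total sleep per vore type, sorted from most to least sleep.
# Rows with a missing vore are dropped so they do not form their own group.
sleep_by_vore <- ggplot2::msleep %>%
  filter(!is.na(vore)) %>%
  group_by(vore) %>%
  summarise(
    n = n(),
    median_sleep = median(sleep_total, na.rm = TRUE),
    mean_sleep = mean(sleep_total, na.rm = TRUE)
  ) %>%
  arrange(desc(median_sleep))

sleep_by_vore
```

The median is used for sorting because it matches the center line of each box; the `n` column is worth checking, since a group with few animals gives a less reliable summary.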
9. Is there a relationship between brain weight and body weight? Build a plot to support your answer.
ggplot(msleep, aes(x = bodywt, y = brainwt)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  scale_x_log10() +
  scale_y_log10() +
  labs(title = "Relationship between Body Weight and Brain Weight",
       x = "Body Weight (kg, log scale)",
       y = "Brain Weight (kg, log scale)")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 27 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 27 rows containing missing values or values outside the scale range
## (`geom_point()`).
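The straight line drawn on the log-log plot corresponds to a linear model on the log10 of both variables. A minimal sketch of that fit is below; `brain_body` and `fit` are illustrative names, and rows with a missing `brainwt` are dropped explicitly instead of being left to the plot warning.

```r
# Keep only the mammals with a recorded brain weight.
brain_body <- subset(ggplot2::msleep, !is.na(brainwt))

# Fit log10(brain weight) as a linear function of log10(body weight).
fit <- lm(log10(brainwt) ~ log10(bodywt), data = brain_body)

coef(fit)
```

The slope from `coef(fit)` describes how brain weight scales with body weight: a slope below 1 would mean brain weight grows more slowly than body weight.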
10. Build one plot of your choice that provides additional insight into the msleep dataset. Be sure to include appropriate labels and a title.
ggplot(msleep, aes(x = sleep_rem, y = sleep_total)) +
  geom_point(mapping = aes(color = vore)) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title = "Relationship between REM Sleep and Total Sleep Time",
       x = "REM Sleep (hours)",
       y = "Total Sleep Time (hours)")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 22 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 22 rows containing missing values or values outside the scale range
## (`geom_point()`).
Please knit your work as an .html file and upload to Canvas. Homework is due before the start of the next lab. No late work is accepted. Make sure to use the formatting conventions of RMarkdown to make your report neat and clean!