Instructions

Answer the following questions and/or complete the exercises in RMarkdown. Please embed all of your code and push the final work to your repository. Your report should be organized, clean, and run free from errors. Remember, you must remove the # for any included code chunks to run. Any plots must have appropriate titles and aesthetics.

Load the tidyverse

library("tidyverse")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.0     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.2     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.1     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
options(scipen = 999) # disable scientific notation

For this homework, we will be using the msleep dataset from the ggplot2 package. This dataset contains information about the sleep habits of various mammals. Reference: V. M. Savage and G. B. West. A quantitative, theoretical framework for understanding mammalian sleep. Proceedings of the National Academy of Sciences, 104 (3):1051-1056, 2007.

1. Use a summary function of your choice to get an idea of the structure of the msleep data.

glimpse(msleep)
## Rows: 83
## Columns: 11
## $ name         <chr> "Cheetah", "Owl monkey", "Mountain beaver", "Greater shor…
## $ genus        <chr> "Acinonyx", "Aotus", "Aplodontia", "Blarina", "Bos", "Bra…
## $ vore         <chr> "carni", "omni", "herbi", "omni", "herbi", "herbi", "carn…
## $ order        <chr> "Carnivora", "Primates", "Rodentia", "Soricomorpha", "Art…
## $ conservation <chr> "lc", NA, "nt", "lc", "domesticated", NA, "vu", NA, "dome…
## $ sleep_total  <dbl> 12.1, 17.0, 14.4, 14.9, 4.0, 14.4, 8.7, 7.0, 10.1, 3.0, 5…
## $ sleep_rem    <dbl> NA, 1.8, 2.4, 2.3, 0.7, 2.2, 1.4, NA, 2.9, NA, 0.6, 0.8, …
## $ sleep_cycle  <dbl> NA, NA, NA, 0.1333333, 0.6666667, 0.7666667, 0.3833333, N…
## $ awake        <dbl> 11.9, 7.0, 9.6, 9.1, 20.0, 9.6, 15.3, 17.0, 13.9, 21.0, 1…
## $ brainwt      <dbl> NA, 0.01550, NA, 0.00029, 0.42300, NA, NA, NA, 0.07000, 0…
## $ bodywt       <dbl> 50.000, 0.480, 1.350, 0.019, 600.000, 3.850, 20.490, 0.04…

2. What are the names of the variables? To get a better idea of what the variables mean, you can use ?msleep.

names(msleep)
##  [1] "name"         "genus"        "vore"         "order"        "conservation"
##  [6] "sleep_total"  "sleep_rem"    "sleep_cycle"  "awake"        "brainwt"     
## [11] "bodywt"
?msleep

3. Make a plot that shows the number of mammals in each vore type. Which vore type is most represented in the data?

ggplot(msleep, aes(x = vore)) +
  geom_bar(mapping = aes(fill = vore)) +
  labs(title = "Number of Mammals by Vore Type",
       x = "Vore Type",
       y = "Number of Mammals") +
  # remove NA from the plot
  scale_x_discrete(na.translate = FALSE)
## Warning: Removed 7 rows containing non-finite outside the scale range
## (`stat_count()`).
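
The bar chart answers the question visually; a count table makes the answer explicit. The following is a minimal sketch that assumes only the `msleep` data from ggplot2 and dplyr's `count()`; the name `vore_counts` is introduced here for illustration. In this data, herbivores should come out as the largest group.

```r
library(dplyr)
library(ggplot2)  # provides the msleep dataset

# Count mammals per vore type, largest group first.
# Rows with a missing vore are kept as their own NA row.
vore_counts <- msleep %>%
  count(vore, sort = TRUE)

vore_counts
```

Keeping the `NA` row in the table shows how many mammals were dropped from the bar chart.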

4. What is the average length of time that each mammal sleeps?

mean_sleep <- mean(msleep$sleep_total, na.rm = TRUE)
mean_sleep
## [1] 10.43373

5. Use summary() to get a quick idea of the range of values for the variable bodywt.

summary(msleep$bodywt)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##    0.005    0.174    1.670  166.136   41.750 6654.000

6. Build a plot that shows the range of bodywt by vore type. Because the range of bodywt is so variable, you will need to scale the y-axis. You do this by adding scale_y_log10() as another layer.

ggplot(msleep, aes(x = vore, y = bodywt)) +
  geom_boxplot(mapping = aes(fill = vore)) +
  scale_y_log10() +
  labs(title = "Body Weight by Vore Type",
       x = "Vore Type",
       y = "Body Weight (kg, log scale)")

7. Is there a relationship between body weight and total sleep time? Add appropriate labels and a title to your plot.

ggplot(msleep, aes(x = bodywt, y = sleep_total)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  scale_x_log10() +
  labs(title = "Relationship between Body Weight and Total Sleep Time",
       x = "Body Weight (kg)",
       y = "Total Sleep Time (hours)")
## `geom_smooth()` using formula = 'y ~ x'
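
The fitted line appears to slope downward, which suggests that heavier mammals tend to sleep less. One way to back this up numerically is a correlation computed on the same log scale as the plot. This is only a sketch and not part of the original assignment; `sleep_body_cor` is a hypothetical name.

```r
library(ggplot2)  # provides the msleep dataset

# Pearson correlation between log10 body weight and total sleep time.
# The log transform mirrors scale_x_log10() in the plot.
sleep_body_cor <- cor(log10(msleep$bodywt), msleep$sleep_total,
                      use = "complete.obs")

sleep_body_cor
```

A negative value would agree with the downward trend in the scatterplot.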

8. Which vore type tends to sleep the most? Build a plot to support your answer.

ggplot(msleep, aes(x = vore, y = sleep_total)) +
  geom_boxplot(mapping = aes(fill = vore)) +
  labs(title = "Total Sleep Time by Vore Type",
       x = "Vore Type",
       y = "Total Sleep Time (hours)")
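
To read the answer off more directly than from the boxplot, the groups can be summarized numerically. The sketch below assumes the same `msleep` data; `sleep_by_vore` and its column names are illustrative. Insectivores should rank first, though that group contains only five species, so the result is best treated with caution.

```r
library(dplyr)
library(ggplot2)  # provides the msleep dataset

# Median and mean total sleep per vore type, highest median first.
# Mammals with a missing vore are excluded from this summary.
sleep_by_vore <- msleep %>%
  filter(!is.na(vore)) %>%
  group_by(vore) %>%
  summarize(
    n = n(),
    median_sleep = median(sleep_total, na.rm = TRUE),
    mean_sleep = mean(sleep_total, na.rm = TRUE)
  ) %>%
  arrange(desc(median_sleep))

sleep_by_vore
```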

9. Is there a relationship between brain weight and body weight? Build a plot to support your answer.

ggplot(msleep, aes(x = bodywt, y = brainwt)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE) +
  scale_x_log10() +
  scale_y_log10() +
  labs(title = "Relationship between Body Weight and Brain Weight",
       x = "Body Weight (kg)",
       y = "Brain Weight (kg)")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 27 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 27 rows containing missing values or values outside the scale range
## (`geom_point()`).
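
The points fall close to a straight line on the log-log axes, which suggests a strong positive relationship. A linear model on the log10 scale is one way to quantify it. This is a hedged sketch rather than a required part of the homework; `brain_body_fit` is a name made up for this example.

```r
library(ggplot2)  # provides the msleep dataset

# Fit log10 brain weight against log10 body weight.
# lm() drops rows with missing values by default, matching the
# rows removed in the plot warnings.
brain_body_fit <- lm(log10(brainwt) ~ log10(bodywt), data = msleep)

coef(brain_body_fit)               # intercept and slope on the log scale
summary(brain_body_fit)$r.squared  # share of variance explained
```

On a log-log scale the slope can be read as a scaling exponent: it says how quickly brain weight grows relative to body weight.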

10. Build one plot of your choice that provides additional insight into the msleep dataset. Be sure to include appropriate labels and a title.

ggplot(msleep, aes(x = sleep_rem, y = sleep_total)) +
  geom_point(mapping = aes(color = vore)) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(title = "Relationship between REM Sleep and Total Sleep Time",
       x = "REM Sleep (hours)",
       y = "Total Sleep Time (hours)")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 22 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 22 rows containing missing values or values outside the scale range
## (`geom_point()`).

Knit and Upload

Please knit your work as an .html file and upload to Canvas. Homework is due before the start of the next lab. No late work is accepted. Make sure to use the formatting conventions of RMarkdown to make your report neat and clean!