Answer the following questions and complete the exercises in
RMarkdown. Please embed all of your code and push your final work to
your repository. Your final lab report should be organized, clean, and
run free from errors. Remember, you must remove the # for
the included code chunks to run. Be sure to add your name to the author
header above.
Make sure to use the formatting conventions of RMarkdown to make your report neat and clean!
library(tidyverse)
library(janitor)
For this homework, we will use a data set compiled by the Office of
Environment and Heritage in New South Whales, Australia. It contains the
enterococci counts in water samples obtained from Sydney beaches as part
of the Beachwatch Water Quality Program. Enterococci are
bacteria common in the intestines of mammals; they are rarely present in
clean water. So, Enterococci values are a measurement of
pollution. cfu stands for colony forming units and measures
the number of viable bacteria in a sample cfu.
This homework loosely follows the tutorial of R Ladies Sydney. If you get stuck, check it out!
Start by loading the data sydneybeaches. Do some
exploratory analysis to get an idea of the data structure.
Are these data “tidy” per the definitions of the tidyverse? How do you know? Are they in wide or long format?
We are only interested in the variables site, date, and
enterococci_cfu_100ml. Make a new object focused on these variables
only. Name the object sydneybeaches_long
Pivot the data such that the dates are column names and each
beach only appears once (wide format). Name the object
sydneybeaches_wide
Pivot the data back so that the dates are data and not column names.
We haven’t dealt much with dates yet, but separate the date into
columns day, month, and year. Do this on the
sydneybeaches_long data.
What is the average enterococci_cfu_100ml by year
for each beach. Think about which data you will use- long or
wide.
Make the output from question 7 easier to read by pivoting it to wide format.
What was the most polluted beach in 2013?
Explore the data! Do one analysis of your choice that includes a minimum of three lines of code.
Please knit your work as an .html file and upload to Canvas. Homework is due before the start of the next lab. No late work is accepted. Make sure to use the formatting conventions of RMarkdown to make your report neat and clean!