Answer the following questions and/or complete the exercises in
RMarkdown. Please embed all of your code and push the final work to your
repository. Your report should be organized, clean, and run free from
errors. Remember, you must remove the # for any included code chunks to run.
library("tidyverse")
library("janitor")
Let’s have a little fun with this one! We are going to explore data on superheroes. These are data taken from comic books and assembled by devoted fans. They include a good mix of categorical and continuous data. Data taken from: https://www.kaggle.com/claudiodavi/superhero-set
Load the heroes_information.csv and super_hero_powers.csv data. Make sure the columns are cleanly named.
superhero_info <- read_csv("data/heroes_information.csv", na = c("", "-99", "-")) %>% clean_names()
## Rows: 734 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (8): name, Gender, Eye color, Race, Hair color, Publisher, Skin color, A...
## dbl (2): Height, Weight
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
superhero_powers <- read_csv("data/super_hero_powers.csv", na = c("", "-99", "-")) %>% clean_names()
## Rows: 667 Columns: 168
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): hero_names
## lgl (167): Agility, Accelerated Healing, Lantern Power Ring, Dimensional Awa...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
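The two import calls above combine the `na` argument of `read_csv()` with `clean_names()`. As a minimal sketch of what each piece does, the example below parses a small made-up CSV (the rows and the `toy` name are hypothetical, not taken from the Kaggle files) passed as literal text with `I()`, which assumes readr 2.0 or later.

```r
library(readr)
library(janitor)

# Made-up CSV text; "-" and "-99" stand in for missing values, as in the real files
csv_text <- I("name,Eye color,Height\nHero A,blue,188\nHero B,-,-99\n")

# Treat "", "-99", and "-" as NA, then convert headers to snake_case
toy <- clean_names(
  read_csv(csv_text, na = c("", "-99", "-"), show_col_types = FALSE)
)

names(toy)   # headers are now lower-case with underscores
toy          # the second row has NA for eye_color and height
```

Declaring the placeholder codes at import time means later summaries do not need to special-case `-99` or `"-"`.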
table(superhero_info$alignment)
##
## bad good neutral
## 207 496 24
superhero_info %>%
count(alignment)
## # A tibble: 4 × 2
## alignment n
## <chr> <int>
## 1 bad 207
## 2 good 496
## 3 neutral 24
## 4 <NA> 7
tabyl(superhero_info, alignment)
## alignment n percent valid_percent
## bad 207 0.282016349 0.28473177
## good 496 0.675749319 0.68225585
## neutral 24 0.032697548 0.03301238
## <NA> 7 0.009536785 NA
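The three summaries above differ in how they report missing values: `table()` omits the 7 `NA` rows, while `count()` and `tabyl()` list them. A small sketch with a made-up vector (the `toy` tibble is hypothetical) shows the same behavior and how `useNA` changes it.

```r
library(dplyr)

# Made-up alignment values, including one missing entry
toy <- tibble(alignment = c("good", "bad", "good", NA))

no_na   <- table(toy$alignment)                   # NA is dropped by default
with_na <- table(toy$alignment, useNA = "ifany")  # NA gets its own cell
counted <- count(toy, alignment)                  # NA appears as its own row

no_na
with_na
counted
```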
bad <- superhero_info %>%
filter(alignment=="bad")
bad$name
## [1] "Abomination" "Abraxas" "Absorbing Man"
## [4] "Air-Walker" "Ajax" "Alex Mercer"
## [7] "Alien" "Amazo" "Ammo"
## [10] "Angela" "Annihilus" "Anti-Monitor"
## [13] "Anti-Spawn" "Apocalypse" "Arclight"
## [16] "Atlas" "Azazel" "Bane"
## [19] "Beetle" "Big Barda" "Big Man"
## [22] "Billy Kincaid" "Bird-Man" "Bird-Man II"
## [25] "Black Abbott" "Black Adam" "Black Mamba"
## [28] "Black Manta" "Blackout" "Blackwing"
## [31] "Blizzard" "Blizzard" "Blizzard II"
## [34] "Blob" "Bloodaxe" "Bloodwraith"
## [37] "Boba Fett" "Bomb Queen" "Brainiac"
## [40] "Bullseye" "Callisto" "Carnage"
## [43] "Chameleon" "Changeling" "Cheetah"
## [46] "Cheetah II" "Cheetah III" "Chromos"
## [49] "Clock King" "Cogliostro" "Cottonmouth"
## [52] "Curse" "Cy-Gor" "Cyborg Superman"
## [55] "Darkseid" "Darkside" "Darth Maul"
## [58] "Darth Vader" "Deadshot" "Demogoblin"
## [61] "Destroyer" "Diamondback" "Doctor Doom"
## [64] "Doctor Doom II" "Doctor Octopus" "Doomsday"
## [67] "Doppelganger" "Dormammu" "Ego"
## [70] "Electro" "Elle Bishop" "Evil Deadpool"
## [73] "Evilhawk" "Exodus" "Fabian Cortez"
## [76] "Fallen One II" "Faora" "Fixer"
## [79] "Frenzy" "General Zod" "Giganta"
## [82] "Goblin Queen" "Godzilla" "Gog"
## [85] "Gorilla Grodd" "Granny Goodness" "Greedo"
## [88] "Green Goblin" "Green Goblin II" "Harley Quinn"
## [91] "Heat Wave" "Hela" "Hobgoblin"
## [94] "Hydro-Man" "Iron Monger" "Jigsaw"
## [97] "Joker" "Junkpile" "Kang"
## [100] "Killer Croc" "Killer Frost" "King Shark"
## [103] "Kingpin" "Klaw" "Kraven II"
## [106] "Kraven the Hunter" "Kylo Ren" "Lady Bullseye"
## [109] "Lady Deathstrike" "Leader" "Lex Luthor"
## [112] "Lightning Lord" "Living Brain" "Lizard"
## [115] "Loki" "Luke Campbell" "Mach-IV"
## [118] "Magneto" "Magus" "Mandarin"
## [121] "Match" "Maxima" "Mephisto"
## [124] "Metallo" "Mister Freeze" "Mister Knife"
## [127] "Mister Mxyzptlk" "Mister Sinister" "Mister Zsasz"
## [130] "MODOK" "Moloch" "Molten Man"
## [133] "Moonstone" "Morlun" "Moses Magnum"
## [136] "Mysterio" "Mystique" "Nebula"
## [139] "Omega Red" "Onslaught" "Overtkill"
## [142] "Ozymandias" "Parademon" "Penguin"
## [145] "Plantman" "Plastique" "Poison Ivy"
## [148] "Predator" "Professor Zoom" "Proto-Goblin"
## [151] "Purple Man" "Pyro" "Ra's Al Ghul"
## [154] "Razor-Fist II" "Red Mist" "Red Skull"
## [157] "Redeemer II" "Redeemer III" "Rhino"
## [160] "Rick Flag" "Riddler" "Sabretooth"
## [163] "Sauron" "Scarecrow" "Scarlet Witch"
## [166] "Scorpia" "Scorpion" "Sebastian Shaw"
## [169] "Shocker" "Siren" "Siren II"
## [172] "Siryn" "Snake-Eyes" "Solomon Grundy"
## [175] "Spider-Carnage" "Spider-Woman IV" "Steppenwolf"
## [178] "Stormtrooper" "Superboy-Prime" "Swamp Thing"
## [181] "Swarm" "Sylar" "T-1000"
## [184] "T-800" "T-850" "T-X"
## [187] "Taskmaster" "Thanos" "Tiger Shark"
## [190] "Tinkerer" "Trigon" "Two-Face"
## [193] "Ultron" "Utgard-Loki" "Vanisher"
## [196] "Vegeta" "Venom" "Venom II"
## [199] "Venom III" "Violator" "Vulture"
## [202] "Walrus" "Warp" "Weapon XI"
## [205] "White Canary" "Yellow Claw" "Zoom"
superhero_info
superhero_info %>%
select(race) %>%
n_distinct()
## [1] 62
superhero_info %>%
group_by(race) %>%
summarize(n=n()) %>%
arrange(-n)
## # A tibble: 62 × 2
## race n
## <chr> <int>
## 1 <NA> 304
## 2 Human 208
## 3 Mutant 63
## 4 God / Eternal 14
## 5 Cyborg 11
## 6 Human / Radiation 11
## 7 Android 9
## 8 Symbiote 9
## 9 Alien 7
## 10 Kryptonian 7
## # ℹ 52 more rows
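Note that the count of 62 includes `NA` as one of the distinct values, which is why the grouped table also has 62 rows with `<NA>` at the top. A minimal sketch with a made-up vector (`races` is hypothetical) shows how `na.rm` changes the result of `n_distinct()`.

```r
library(dplyr)

# Made-up race values with one missing entry
races <- c("Human", "Mutant", NA, "Human")

with_na    <- n_distinct(races)                # NA is counted as a distinct value
without_na <- n_distinct(races, na.rm = TRUE)  # NA is ignored

with_na
without_na
```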
good_guys <-
superhero_info %>%
filter(alignment=="good")
bad_guys <-
superhero_info %>%
filter(alignment=="bad")
good_guys %>% filter(race=="Vampire")
## # A tibble: 2 × 10
## name gender eye_color race hair_color height publisher skin_color alignment
## <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 Angel Male <NA> Vampi… <NA> NA Dark Hor… <NA> good
## 2 Blade Male brown Vampi… Black 188 Marvel C… <NA> good
## # ℹ 1 more variable: weight <dbl>
bad_guys %>%
select(name, height) %>%
mutate(height_meters=height/100) %>% # height is recorded in centimeters
arrange(-height_meters)
## # A tibble: 207 × 3
## name height height_meters
## <chr> <dbl> <dbl>
## 1 MODOK 366 3.66
## 2 Onslaught 305 3.05
## 3 Sauron 279 2.79
## 4 Solomon Grundy 279 2.79
## 5 Darkseid 267 2.67
## 6 Amazo 257 2.57
## 7 Alien 244 2.44
## 8 Doomsday 244 2.44
## 9 Killer Croc 244 2.44
## 10 Venom III 229 2.29
## # ℹ 197 more rows
good_guys %>%
select(name, height) %>%
mutate(height_meters=height/100) %>% # height is recorded in centimeters
arrange(-height_meters)
## # A tibble: 496 × 3
## name height height_meters
## <chr> <dbl> <dbl>
## 1 Fin Fang Foom 975 9.75
## 2 Groot 701 7.01
## 3 Wolfsbane 366 3.66
## 4 Sasquatch 305 3.05
## 5 Ymir 305. 3.05
## 6 Rey 297 2.97
## 7 Hellboy 259 2.59
## 8 Hulk 244 2.44
## 9 Kilowog 234 2.34
## 10 Cloak 226 2.26
## # ℹ 486 more rows
superhero_powers
Have a quick look at the superhero_powers data frame.
superhero_powers %>%
filter(agility & stealth & super_strength & stamina) %>%
select(hero_names, agility, stealth, super_strength, stamina)
## # A tibble: 40 × 5
## hero_names agility stealth super_strength stamina
## <chr> <lgl> <lgl> <lgl> <lgl>
## 1 Alex Mercer TRUE TRUE TRUE TRUE
## 2 Angel TRUE TRUE TRUE TRUE
## 3 Ant-Man II TRUE TRUE TRUE TRUE
## 4 Aquaman TRUE TRUE TRUE TRUE
## 5 Batman TRUE TRUE TRUE TRUE
## 6 Black Flash TRUE TRUE TRUE TRUE
## 7 Black Manta TRUE TRUE TRUE TRUE
## 8 Brundlefly TRUE TRUE TRUE TRUE
## 9 Buffy TRUE TRUE TRUE TRUE
## 10 Cable TRUE TRUE TRUE TRUE
## # ℹ 30 more rows
superhero_powers %>%
select(hero_names, agility, stealth, super_strength, stamina) %>%
filter(agility==TRUE & stealth==TRUE & super_strength==TRUE & stamina==TRUE)
## # A tibble: 40 × 5
## hero_names agility stealth super_strength stamina
## <chr> <lgl> <lgl> <lgl> <lgl>
## 1 Alex Mercer TRUE TRUE TRUE TRUE
## 2 Angel TRUE TRUE TRUE TRUE
## 3 Ant-Man II TRUE TRUE TRUE TRUE
## 4 Aquaman TRUE TRUE TRUE TRUE
## 5 Batman TRUE TRUE TRUE TRUE
## 6 Black Flash TRUE TRUE TRUE TRUE
## 7 Black Manta TRUE TRUE TRUE TRUE
## 8 Brundlefly TRUE TRUE TRUE TRUE
## 9 Buffy TRUE TRUE TRUE TRUE
## 10 Cable TRUE TRUE TRUE TRUE
## # ℹ 30 more rows
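The two chunks above return the same 40 heroes because a logical column can be used directly as a filter condition, so comparing it to `TRUE` is redundant. The sketch below checks this on a made-up tibble (`toy_powers` and its heroes are hypothetical) and also shows `if_all()`, which may be convenient when the list of powers grows.

```r
library(dplyr)

# Made-up powers table
toy_powers <- tibble(
  hero_names = c("Hero A", "Hero B", "Hero C"),
  agility    = c(TRUE, TRUE, FALSE),
  stealth    = c(TRUE, FALSE, TRUE)
)

# Logical columns used directly as conditions
with_and <- toy_powers %>% filter(agility & stealth)

# Explicit comparison to TRUE gives the same rows
with_eq <- toy_powers %>% filter(agility == TRUE & stealth == TRUE)

# if_all() keeps rows where every listed column is TRUE
with_if_all <- toy_powers %>% filter(if_all(c(agility, stealth), ~ .x))

with_and
```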
superhero_powers %>%
mutate(across(-1, ~ ifelse(. == TRUE, 1, 0))) %>%
mutate(total_powers = rowSums(across(-1))) %>%
select(hero_names, total_powers) %>%
arrange(-total_powers)
## # A tibble: 667 × 2
## hero_names total_powers
## <chr> <dbl>
## 1 Spectre 49
## 2 Amazo 44
## 3 Living Tribunal 35
## 4 Martian Manhunter 35
## 5 Man of Miracles 34
## 6 Captain Marvel 33
## 7 T-X 33
## 8 Galactus 32
## 9 T-1000 32
## 10 Mister Mxyzptlk 31
## # ℹ 657 more rows
superhero_powers %>%
# Recode every column except the first (`hero_names`): `TRUE` becomes 1 and `FALSE` becomes 0.
mutate(across(-1, ~ ifelse(. == TRUE, 1, 0))) %>%
# Create a new column, `total_powers`, that sums the recoded values row-wise across all columns except the first.
mutate(total_powers = rowSums(across(-1))) %>%
# Keep only the `hero_names` column and the newly created `total_powers` column.
select(hero_names, total_powers) %>%
# Sort the data frame in descending order of `total_powers` (from highest to lowest).
arrange(-total_powers)
## # A tibble: 667 × 2
## hero_names total_powers
## <chr> <dbl>
## 1 Spectre 49
## 2 Amazo 44
## 3 Living Tribunal 35
## 4 Martian Manhunter 35
## 5 Man of Miracles 34
## 6 Captain Marvel 33
## 7 T-X 33
## 8 Galactus 32
## 9 T-1000 32
## 10 Mister Mxyzptlk 31
## # ℹ 657 more rows
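Because `TRUE` and `FALSE` already behave as 1 and 0 in arithmetic, the power count can likely be computed without the `ifelse()` recoding step. The sketch below tries this on a made-up tibble (`toy_powers` and its values are hypothetical), selecting the power columns with `where(is.logical)` instead of by position.

```r
library(dplyr)

# Made-up powers table
toy_powers <- tibble(
  hero_names = c("Hero A", "Hero B"),
  flight     = c(TRUE, FALSE),
  stealth    = c(TRUE, TRUE)
)

totals <- toy_powers %>%
  # rowSums() treats TRUE as 1 and FALSE as 0
  mutate(total_powers = rowSums(across(where(is.logical)))) %>%
  select(hero_names, total_powers) %>%
  arrange(desc(total_powers))

totals
```

Selecting by type rather than by `-1` keeps the sum correct even if the name column is not first.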
superhero_powers %>%
filter(hero_names == "Darth Vader") %>%
select(where(~ is.logical(.x) && isTRUE(all(.x)))) # keep only the logical columns that are TRUE for this hero
## # A tibble: 1 × 26
## agility accelerated_healing durability stealth danger_sense marksmanship
## <lgl> <lgl> <lgl> <lgl> <lgl> <lgl>
## 1 TRUE TRUE TRUE TRUE TRUE TRUE
## # ℹ 20 more variables: weapons_master <lgl>, intelligence <lgl>,
## # telepathy <lgl>, energy_blasts <lgl>, super_speed <lgl>,
## # electrokinesis <lgl>, enhanced_senses <lgl>, telekinesis <lgl>, jump <lgl>,
## # astral_projection <lgl>, reflexes <lgl>, force_fields <lgl>,
## # psionic_powers <lgl>, precognition <lgl>, enhanced_hearing <lgl>,
## # hypnokinesis <lgl>, light_control <lgl>, illusions <lgl>, cloaking <lgl>,
## # the_force <lgl>
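One way to list a single hero's powers is to keep only the logical columns that are `TRUE` for that row, using tidyselect's `where()`. The sketch below uses a made-up one-row tibble (`toy_powers` is hypothetical); checking `is.logical()` first means the character name column is never passed to `all()`.

```r
library(dplyr)

# Made-up single-hero powers table
toy_powers <- tibble(
  hero_names = "Hero A",
  flight     = TRUE,
  stealth    = FALSE,
  agility    = TRUE
)

held <- toy_powers %>%
  filter(hero_names == "Hero A") %>%
  # Keep a column only if it is logical and TRUE in every remaining row
  select(where(~ is.logical(.x) && isTRUE(all(.x))))

names(held)
```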
superhero_info %>%
filter(name=="Darth Vader")
## # A tibble: 1 × 10
## name gender eye_color race hair_color height publisher skin_color alignment
## <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 Darth… Male yellow Cybo… No Hair 198 George L… <NA> bad
## # ℹ 1 more variable: weight <dbl>
Please knit your work as a .pdf or .html file and upload to Canvas. Homework is due before the start of the next lab. No late work is accepted. Make sure to use the formatting conventions of RMarkdown to make your report neat and clean!