To make a box plot, we use the geom_boxplot() function! In the following example, we will look at the distribution of final race positions for both McLaren drivers across the 2023 season. To set up this box plot, I construct a new variable called final_position and save the result as a new object called McLarenStandings_2023.
#| include: false
library(tidyverse)
library(RandomData)

McLarenStandings_2023 <- race_stats |>
  select(circuit, year, constructor, surname) |>
  # remove duplicates
  unique() |>
  filter(constructor == "McLaren" & year == 2023) |>
  mutate(
    final_position = case_when(
      # PIASTRI
      circuit == "Bahrain International Circuit" & surname == "Piastri" ~ "DNF",
      circuit == "Jeddah Corniche Circuit" & surname == "Piastri" ~ "15",
      circuit == "Albert Park Grand Prix Circuit" & surname == "Piastri" ~ "8",
      circuit == "Baku City Circuit" & surname == "Piastri" ~ "11",
      circuit == "Miami International Autodrome" & surname == "Piastri" ~ "19",
      circuit == "Circuit de Monaco" & surname == "Piastri" ~ "10",
      circuit == "Circuit de Barcelona-Catalunya" & surname == "Piastri" ~ "13",
      circuit == "Circuit Gilles Villeneuve" & surname == "Piastri" ~ "11",
      circuit == "Red Bull Ring" & surname == "Piastri" ~ "16",
      circuit == "Silverstone Circuit" & surname == "Piastri" ~ "4",
      circuit == "Hungaroring" & surname == "Piastri" ~ "5",
      circuit == "Circuit de Spa-Francorchamps" & surname == "Piastri" ~ "DNF",
      circuit == "Circuit Park Zandvoort" & surname == "Piastri" ~ "9",
      circuit == "Autodromo Nazionale di Monza" & surname == "Piastri" ~ "12",
      circuit == "Marina Bay Street Circuit" & surname == "Piastri" ~ "7",
      circuit == "Suzuka Circuit" & surname == "Piastri" ~ "3",
      circuit == "Losail International Circuit" & surname == "Piastri" ~ "2",
      circuit == "Circuit of the Americas" & surname == "Piastri" ~ "DNF",
      circuit == "Autódromo Hermanos Rodríguez" & surname == "Piastri" ~ "8",
      circuit == "Autódromo José Carlos Pace" & surname == "Piastri" ~ "14",
      circuit == "Las Vegas Strip Street Circuit" & surname == "Piastri" ~ "10",
      circuit == "Yas Marina Circuit" & surname == "Piastri" ~ "6",
      # NORRIS
      circuit == "Bahrain International Circuit" & surname == "Norris" ~ "17",
      circuit == "Jeddah Corniche Circuit" & surname == "Norris" ~ "17",
      circuit == "Albert Park Grand Prix Circuit" & surname == "Norris" ~ "6",
      circuit == "Baku City Circuit" & surname == "Norris" ~ "9",
      circuit == "Miami International Autodrome" & surname == "Norris" ~ "17",
      circuit == "Circuit de Monaco" & surname == "Norris" ~ "9",
      circuit == "Circuit de Barcelona-Catalunya" & surname == "Norris" ~ "17",
      circuit == "Circuit Gilles Villeneuve" & surname == "Norris" ~ "13",
      circuit == "Red Bull Ring" & surname == "Norris" ~ "4",
      circuit == "Silverstone Circuit" & surname == "Norris" ~ "2",
      circuit == "Hungaroring" & surname == "Norris" ~ "2",
      circuit == "Circuit de Spa-Francorchamps" & surname == "Norris" ~ "7",
      circuit == "Circuit Park Zandvoort" & surname == "Norris" ~ "9",
      circuit == "Autodromo Nazionale di Monza" & surname == "Norris" ~ "8",
      circuit == "Marina Bay Street Circuit" & surname == "Norris" ~ "2",
      circuit == "Suzuka Circuit" & surname == "Norris" ~ "2",
      circuit == "Losail International Circuit" & surname == "Norris" ~ "3",
      circuit == "Circuit of the Americas" & surname == "Norris" ~ "3",
      circuit == "Autódromo Hermanos Rodríguez" & surname == "Norris" ~ "5",
      circuit == "Autódromo José Carlos Pace" & surname == "Norris" ~ "2",
      circuit == "Las Vegas Strip Street Circuit" & surname == "Norris" ~ "DNF",
      circuit == "Yas Marina Circuit" & surname == "Norris" ~ "5"
    )
  ) |>
  # "DNF" cannot be parsed as a number, so it becomes NA (with a coercion warning)
  mutate(final_position_numeric = as.numeric(final_position))
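The final step of the data preparation converts final_position to numbers with as.numeric(). The short base-R sketch below uses a few made-up values (not the real race data) to illustrate what that conversion likely does to "DNF" entries: they become NA, which is why races a driver did not finish are left out of the box plot.

```r
# Hypothetical finishing positions stored as text, mimicking final_position
positions <- c("4", "DNF", "2", "17")

# as.numeric() cannot parse "DNF", so that entry becomes NA and R warns;
# suppressWarnings() silences the expected coercion warning
positions_numeric <- suppressWarnings(as.numeric(positions))

# Show the converted values
print(positions_numeric)

# Count how many entries would be dropped from a plot of these values
n_missing <- sum(is.na(positions_numeric))
print(n_missing)
```

ggplot2 removes the NA rows when it draws the box plot and reports how many rows it dropped in a warning.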
ggplot(McLarenStandings_2023, aes(x = surname, y = final_position_numeric, fill = surname)) +
  geom_boxplot() +
  scale_fill_manual(values = c("orange", "grey")) + # Assign McLaren colors
  labs(title = "McLaren Race Results", x = "Driver", y = "Final Position") +
  theme_minimal()
aes(x = surname, y = final_position_numeric, fill = surname):
x = surname: Drivers on the x-axis.
y = final_position_numeric: Final race position on the y-axis.
fill = surname: Colors the boxes based on the driver.
geom_boxplot(): Creates a boxplot to show the distribution of race positions.
In a box-and-whisker plot, the box spans the middle 50% of race finishes for each driver (the interquartile range), and the horizontal line inside the box marks the median race position. The whiskers extend to the most extreme finishes that lie within 1.5 times the interquartile range of the box, and any dots beyond the whiskers are outliers.
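To make the parts of the box plot concrete, here is a minimal base-R sketch that computes them by hand for one driver. It assumes the values are exactly the finishes listed for Norris in the code above (with the DNF as NA) and uses the 1.5 × IQR whisker rule that geom_boxplot() applies by default. The variable names are illustrative only.

```r
# Norris's 2023 finishes as listed in the NORRIS block above; the DNF is NA
norris <- c(17, 17, 6, 9, 17, 9, 17, 13, 4, 2, 2, 7, 9, 8, 2, 2, 3, 3, 5, 2, NA, 5)

# Quartiles: the box runs from q[1] (25%) to q[3] (75%); q[2] is the median line
q <- unname(quantile(norris, probs = c(0.25, 0.5, 0.75), na.rm = TRUE))
iqr <- q[3] - q[1]

# Limits beyond which a finish counts as an outlier
finished <- norris[!is.na(norris)]
lower_limit <- q[1] - 1.5 * iqr
upper_limit <- q[3] + 1.5 * iqr

# Whiskers stop at the most extreme finishes that are still inside the limits
whisker_low <- min(finished[finished >= lower_limit])
whisker_high <- max(finished[finished <= upper_limit])

# Finishes outside the limits would be drawn as individual dots
outliers <- finished[finished < lower_limit | finished > upper_limit]

print(q)
print(c(whisker_low, whisker_high))
print(outliers)
```

For these values the box runs from 3 to 9 with the median at 6, the whiskers reach 2 and 17, and there are no outliers.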