Now that we know some basic ways to manipulate the data frame, let's look at different ways to do basic descriptive statistics! In this section we will be using the function summarize(). This function is similar to mutate(), except that instead of adding a variable to the existing data frame, it builds a new data frame from existing variables, collapsing each group of rows into a single summary row. You will also see the function group_by(). This function lets us organize the data by one or more variables. Essentially, it splits the rows into groups, so that functions like summarize() work on each group separately.
For this example we are going to find the total points for each team in the 2023 season!
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# A tibble: 10 × 2
constructor total_points
<chr> <dbl>
1 Alfa Romeo 16
2 AlphaTauri 22
3 Alpine F1 Team 110
4 Aston Martin 266
5 Ferrari 363
6 Haas F1 Team 9
7 McLaren 266
8 Mercedes 374
9 Red Bull 790
10 Williams 26
# A tibble: 10 × 2
constructor total_points
<chr> <dbl>
1 Red Bull 790
2 Mercedes 374
3 Ferrari 363
4 Aston Martin 266
5 McLaren 266
6 Alpine F1 Team 110
7 Williams 26
8 AlphaTauri 22
9 Alfa Romeo 16
10 Haas F1 Team 9
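Tables like the two above can be produced by a group_by() and summarize() pipeline, followed by arrange() for the sorted version. The sketch below is a minimal illustration under stated assumptions: toy_stats is a hypothetical stand-in for race_stats with made-up teams and points, and the column names simply mirror the ones used in this section.

```r
library(dplyr)

# Hypothetical toy data standing in for race_stats (values are made up)
toy_stats <- data.frame(
  year        = c(2023, 2023, 2023, 2023, 2022),
  constructor = c("Team A", "Team A", "Team B", "Team B", "Team A"),
  surname     = c("Driver1", "Driver2", "Driver3", "Driver4", "Driver1"),
  points      = c(10, NA, 25, 18, 12)
)

# One row per constructor: keep the 2023 rows, then add up the points
team_totals <- toy_stats |>
  filter(year == 2023) |>
  group_by(constructor) |>
  summarize(total_points = sum(points, na.rm = TRUE))

# Same table, sorted from most to fewest points
team_totals_sorted <- team_totals |>
  arrange(desc(total_points))

print(team_totals)
print(team_totals_sorted)
```

Here na.rm = TRUE makes sum() skip missing values, so a team with an NA entry still gets a total instead of NA.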
What if we wanted to know the percentage of points each driver contributed to their team's total?
TeamStandings_2023 <- race_stats |>
  select(circuit, year, constructor, surname, points) |>
  # remove duplicates
  unique() |>
  filter(year == 2023) |>
  group_by(constructor) |>
  mutate(total_points = sum(points, na.rm = TRUE)) |>
  ungroup() |> # Ungroup to avoid issues with the next group_by
  group_by(surname) |>
  summarize(perc_points = sum(points, na.rm = TRUE) / unique(total_points) * 100) |>
  arrange(desc(perc_points))

print(TeamStandings_2023)
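The pipeline above groups by surname alone, which works as long as each surname belongs to exactly one team. As a hedged alternative, the sketch below keeps constructor in the grouping so that the team stays visible in the result and each team's shares can be checked to add up to 100. The data frame toy_stats and its values are hypothetical, and driver_points and driver_share are names chosen only for this illustration.

```r
library(dplyr)

# Hypothetical toy data (made-up values, one row per driver per race)
toy_stats <- data.frame(
  constructor = c("Team A", "Team A", "Team A", "Team B", "Team B"),
  surname     = c("Driver1", "Driver1", "Driver2", "Driver3", "Driver4"),
  points      = c(25, 5, 10, 18, NA)
)

driver_share <- toy_stats |>
  # First collapse to one row per driver within each team
  group_by(constructor, surname) |>
  summarize(driver_points = sum(points, na.rm = TRUE), .groups = "drop") |>
  # Then compare each driver's points with the team total
  group_by(constructor) |>
  mutate(perc_points = driver_points / sum(driver_points) * 100) |>
  ungroup() |>
  arrange(constructor, desc(perc_points))

print(driver_share)
```

With this toy data, Driver1 holds 75% and Driver2 25% of Team A's points, while Driver3 holds 100% of Team B's because Driver4's only entry is missing.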