Statcast Data Manipulation in R

Set Up Your Workspace

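# Install dplyr once; comment this line out on later runs
install.packages("dplyr")
library(dplyr)
# Point R at the folder that holds the CSV file (replace the placeholder path)
setwd("~/PATH TO FOLDER WITH CSV FILE")
statcast_data <- read.csv("mlb_2020_statcast_pitcher.csv")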
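
If you would rather not change the working directory, one option is to read the file by its full path and then confirm that the columns used later are present. The sketch below is only an illustration: it writes a tiny made-up CSV to a temporary folder so it can run anywhere, and the names toy, csv_path, toy_data, needed, and missing_cols are hypothetical, not part of the Statcast file.

```r
# Hypothetical stand-in for the Statcast export: three made-up pitches
toy <- data.frame(
  player_name = c("Pitcher A", "Pitcher A", "Pitcher B"),
  pitch_type = c("FF", "CH", "FF"),
  release_speed = c(96.1, 84.3, 94.8),
  release_spin_rate = c(2450, 2800, 2300)
)

# Build a full path instead of calling setwd()
csv_path <- file.path(tempdir(), "toy_statcast.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the file back in by its full path
toy_data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Check that the columns used in the rest of the walkthrough exist
needed <- c("player_name", "pitch_type", "release_speed", "release_spin_rate")
missing_cols <- setdiff(needed, names(toy_data))

# Inspect column names and types
str(toy_data)
```

With the real file, the same check would presumably be run on statcast_data; an empty missing_cols means every column the later steps rely on is there.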

Filtering, Selecting, and Arranging Data

# Keep only the rows for one pitcher
NL_ROY <- statcast_data %>%
  filter(player_name == "Devin Williams")

# Keep the rows for any pitcher in a list of names
NL_CY <- statcast_data %>%
  filter(player_name %in% c("Trevor Bauer", "Jacob deGrom", "Yu Darvish"))

# General pattern for every dplyr step (template, not runnable code):
# new_data_frame <- old_data_frame %>%
#   dplyr_function()

# Select columns by name
ROY_pitch_info <- NL_ROY %>%
  select(player_name, pitch_type, release_speed, release_spin_rate)

# Select a run of adjacent columns with the colon operator
ROY_columns <- NL_ROY %>%
  select(player_name, pitch_type:release_speed)

# Sort from fastest to slowest pitch
ROY_release_speed <- NL_ROY %>%
  select(player_name, pitch_type, release_speed, release_spin_rate) %>%
  arrange(desc(release_speed))

# Chain select, filter, and arrange: four-seam fastballs only, fastest first
ROY_FF <- NL_ROY %>%
  select(player_name, pitch_type, release_speed, release_spin_rate) %>%
  filter(pitch_type == "FF") %>%
  arrange(desc(release_speed))
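
To see the same filter, select, and arrange chain without the Statcast file, here is a minimal sketch on a made-up data frame. The pitchers, speeds, and spin rates in toy_pitches are invented for illustration only.

```r
library(dplyr)

# Made-up pitch data (hypothetical values, not real Statcast rows)
toy_pitches <- data.frame(
  player_name = c("Pitcher A", "Pitcher A", "Pitcher A", "Pitcher B"),
  pitch_type = c("FF", "CH", "FF", "FF"),
  release_speed = c(95.2, 84.0, 97.1, 93.5),
  release_spin_rate = c(2400, 2850, 2450, 2250)
)

# Keep Pitcher A's four-seam fastballs, keep three columns, sort fastest first
toy_FF <- toy_pitches %>%
  filter(player_name == "Pitcher A", pitch_type == "FF") %>%
  select(player_name, pitch_type, release_speed) %>%
  arrange(desc(release_speed))

print(toy_FF)
```

Passing two conditions to filter() separated by a comma keeps only the rows where both are true, which is equivalent to two filter() calls in a row.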

Grouping, Summarizing, and Mutating Data

# Count the pitches thrown of each type
ROY_pitch_count <- NL_ROY %>%
  group_by(pitch_type) %>%
  summarize(pitch_count = n())

# General pattern for summarize (template, not runnable code):
# summarize(column_name = FUNCTION())

# Pitch count and average velocity by pitch type; na.rm = TRUE skips missing readings
ROY_avg_velos <- NL_ROY %>%
  group_by(pitch_type) %>%
  summarize(pitch_count = n(),
            average_velocity = mean(release_speed, na.rm = TRUE))

# Pitch count plus slowest and fastest velocity by pitch type
ROY_range_velos <- NL_ROY %>%
  group_by(pitch_type) %>%
  summarize(pitch_count = n(),
            min_velocity = min(release_speed, na.rm = TRUE),
            max_velocity = max(release_speed, na.rm = TRUE))

# Average four-seam velocity and spin per pitcher, then
# Bauer Units = average spin / average velocity, rounded to one decimal
CY_FF_ranks <- NL_CY %>%
  filter(pitch_type == "FF") %>%
  group_by(player_name, pitch_type) %>%
  summarize(average_velocity = mean(release_speed, na.rm = TRUE),
            average_spin = mean(release_spin_rate, na.rm = TRUE)) %>%
  mutate(bauer_units = round(average_spin / average_velocity, 1))
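
The grouping steps can also be tried on a small invented data set. The sketch below assumes dplyr 1.0.0 or later for the .groups argument, and every value in toy_cy is made up. It includes one pitch with missing readings to show what na.rm = TRUE does, and it adds an arrange() step that the code above does not have, so the result comes out ranked.

```r
library(dplyr)

# Made-up four-seam data (hypothetical values); one pitch has missing readings
toy_cy <- data.frame(
  player_name = c("Pitcher A", "Pitcher A", "Pitcher A", "Pitcher B", "Pitcher B"),
  pitch_type = "FF",
  release_speed = c(95, 97, NA, 93, 95),
  release_spin_rate = c(2400, 2496, NA, 2256, 2256)
)

toy_ranks <- toy_cy %>%
  group_by(player_name, pitch_type) %>%
  summarize(
    pitch_count = n(),  # counts every row, including the one with NA readings
    average_velocity = mean(release_speed, na.rm = TRUE),
    average_spin = mean(release_spin_rate, na.rm = TRUE),
    .groups = "drop"    # return an ungrouped data frame
  ) %>%
  # Bauer Units: average spin divided by average velocity
  mutate(bauer_units = round(average_spin / average_velocity, 1)) %>%
  arrange(desc(bauer_units))

print(toy_ranks)
```

Note that n() still counts the row with missing readings, while mean() with na.rm = TRUE ignores it, so pitch_count and the averages can be based on different numbers of pitches.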

Wrapping It Up


Driveline Baseball Operations Analyst

Sam Bornstein
