I need your help!

I want your feedback to make the book better for you and other readers. If you find typos, errors, or places where the text may be improved, please let me know. The best ways to provide feedback are by GitHub or hypothes.is annotations.

You can leave a comment at the bottom of the page/chapter, or open an issue or submit a pull request on GitHub: https://github.com/isaactpetersen/Fantasy-Football-Analytics-Textbook

Alternatively, you can leave an annotation using hypothes.is. To add an annotation, select some text and then click the symbol on the pop-up menu. To see the annotations of others, click the symbol in the upper right-hand corner of the page.

24 Simulation: Bootstrapping and the Monte Carlo Method

This chapter provides an overview of various approaches to simulation, including bootstrapping and the Monte Carlo method.

24.1 Getting Started

24.1.1 Load Packages

Code

library("ffanalytics")
library("data.table")
library("future")
library("future.apply")
library("progressr")
library("SimDesign")
library("fitdistrplus")
library("sn")
library("tidyverse")

24.1.2 Load Data

Code

load(file = "./data/players_projectedPoints_seasonal.RData")
load(file = "./data/nfl_playerIDs.RData")

24.2 Overview

A simulation is an “imitative representation” of a phenomenon that could exist the real world. In statistics, simulations are computer-driven investigations to better understand a phenomenon by studying its behavior under different conditions. For instance, we might want to determine the likely range of outcomes for a player, in terms of the range of fantasy points that a player might score over the course of a season. Simulations can be conducted in various ways. Two common ways of conducting simulations are via bootstrapping and via Monte Carlo simulation.

24.2.1 Bootstrapping

Bootstrapping involves repeated resampling (with replacement) from observed data. For instance, if we have 100 sources provide projections for a player, we could estimate the most likely range of fantasy points for the player by repeatedly sampling from the 100 projections.

24.2.2 Monte Carlo Simulation

Monte Carlo simulation involves repeated random sampling from a known distribution (Sigal & Chalmers, 2016). For instance, if we know the population distribution for the likely outcomes for a player—e.g., a normal distribution with a known mean (e.g., 150 points) and standard deviation (e.g., 20 points)—we can repeatedly sample randomly from this distribution. The distribution could be, as examples, a normal distribution, a log-normal distribution, a binomial distribution, a chi-square distribution, etc. (Sigal & Chalmers, 2016). The distribution provides a probability density function, which indicates the probability that any particular value would be observed if the data arose from that distribution.

24.3 Simulation of Projected Statistics and Points

Below, we perform bootstrapping and Monte Carlo simulations of projected statistics and points. However, it is worth noting that—as for any simulation—the quality of the results depend on the quality of the inputs. In this case, the quality of the simulation depends on the quality of the projections. If the projections are no good, the simulation results will not be trustworthy. Garbage in, garbage out. As we evaluated in Section 17.12, projections tend to show moderate accuracy for fantasy performance, but they are not highly accurate. Thus, we should treat simulation results arising from fantasy projections with a good dose of skepticism.

24.3.1 Bootstrapping

24.3.1.1 Prepare Data

Code

all_proj <- dplyr::bind_rows(players_projectedPoints_seasonal)

Code

vars_by_pos <- list(
  QB = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "fumbles_lost", "fumbles_total", "two_pts",
    "sacks",
    "pass_09_tds", "pass_1019_tds", "pass_2029_tds", "pass_3039_tds", "pass_4049_tds", "pass_50_tds",
    "pass_40_yds", "pass_250_yds", "pass_300_yds", "pass_350_yds", "pass_400_yds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds"
    ),
  RB = c(
    "games",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  WR = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  TE = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  K = c(
    "fg_0019", "fg_2029", "fg_3039", "fg_4049", "fg_50", "fg_50_att",
    "fg_39", "fg_att_39", "fg_49", "fg_49_att",
    "fg", "fg_att", "fg_miss", "xp", "xp_att"
  ),
  D = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DL = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  LB = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DB = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DST = c(
    "dst_fum_recvr", "dst_fum_rec", "dst_int", "dst_safety", "dst_sacks", "dst_td", "dst_blk",
    "dst_fumbles", "dst_tackles", "dst_yds_against", "dst_pts_against", "dst_pts_allowed", "dst_ret_yds"
  )
)

24.3.1.2 Bootstrapping Function

For performing the bootstrapping, we leverage the data.table (Barrett et al., 2025), future (Bengtsson, 2025a), and future.apply (Bengtsson, 2025b) packages for speed (by using parallel processing) and memory efficiency. We use the progressr (Bengtsson, 2024) package to create a progress bar.

Code

bootstrapSimulation <- function(
    projectedStats,
    vars_by_pos,
    n_iter = 10000,
    seed = NULL,
    progress = TRUE) {
  
  dt <- data.table::as.data.table(projectedStats) # use data.table for speed
  all_ids <- unique(dt$id)
  
  if (!is.null(seed)) set.seed(seed)
  
  future::plan(future::multisession) # parallelize tasks across multiple background R sessions using multiple cores to speed up simulation
  
  if (progress) progressr::handlers("txtprogressbar") # specify progress-bar style
  
  results <- progressr::with_progress({ # wrap in with_progress for progress bar
    p <- if (progress) progressr::progressor(along = all_ids) else NULL # create progressor for progress bar
    
    future.apply::future_lapply(
      all_ids, # apply the function below to each player using a parallelized loop
      function(player_id) {
        if (!is.null(p)) p() # advance progress bar
        
        player_data <- dt[id == player_id]
        player_pos  <- unique(player_data$pos)
        
        if (length(player_pos) != 1 || !player_pos %in% names(vars_by_pos)) return(NULL)
        
        stat_vars <- vars_by_pos[[player_pos]] # pull the relevant stat variables to simulate for this player's position
        out <- data.table(iteration = seq_len(n_iter), id = player_id, pos = player_pos)
        
        for (var in stat_vars) { # loop over each stat variable that should be simulated for the player's position
          if (var %in% names(player_data)) {
            non_na_values <- player_data[[var]][!is.na(player_data[[var]])] # pull non-missing values of the stat for the player (from all projection sources)
            
            if (length(non_na_values) > 0) {
              out[[var]] <- sample(non_na_values, n_iter, replace = TRUE) # if there are valid values, sample with replacement to simulate n_iter values
            } else {
              out[[var]] <- NA_real_ # specify a numeric missing value (if all values were missing)
            }
          } else {
            out[[var]] <- NA_real_ # specify a numeric missing value (if the stat variable doesn't exist)
          }
        }
        
        return(out)
      },
      future.seed = TRUE # ensures that each parallel process gets a reproducible random seed
    )
  })
  
  data.table::rbindlist(results, use.names = TRUE, fill = TRUE) # combines all the individual player results into one large data table, aligning columns by name; fill = TRUE ensures that missing columns are filled with NA where necessary
}

24.3.1.3 Run the Bootstrapping Simulation

Code

bootstappedStats <- bootstrapSimulation(
  projectedStats = all_proj,
  vars_by_pos = vars_by_pos,
  n_iter = 5000,
  seed = 52242)

24.3.1.4 Score Fantasy Points from the Simulation

Code

data.table::setnames(bootstappedStats, "iteration", "data_src") # data.table equivalent to: bootstappedStats$data_src <- bootstappedStats$iteration

bootstappedStatsByPosition <- split(
  bootstappedStats,
  by = "pos",
  keep.by = TRUE)

Code

bootstappedStatsByPosition <- lapply(
  bootstappedStatsByPosition,
  setDF)

attr(bootstappedStatsByPosition, "season") <- 2024
attr(bootstappedStatsByPosition, "week") <- 0

Code

bootstrappedFantasyPoints <- ffanalytics:::source_points(
  data_result = bootstappedStatsByPosition,
  scoring_rules = ffanalytics::scoring)

Code

bootstrappedFantasyPoints$iteration <- bootstrappedFantasyPoints$data_src
bootstrappedFantasyPoints$data_src <- NULL

bootstrappedFantasyPoints <- bootstrappedFantasyPoints %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","team")],
    by = c("id" = "mfl_id")
  )

bootstrappedFantasyPoints <- bootstrappedFantasyPoints %>% 
  rename(projectedPoints = raw_points)

24.3.1.5 Summarize Players’ Distribution of Projected Fantasy Points

Code

bootstrappedFantasyPoints_summary <- bootstrappedFantasyPoints %>% 
  group_by(id) %>% 
  summarise(
    mean = mean(projectedPoints, na.rm = TRUE),
    SD = sd(projectedPoints, na.rm = TRUE),
    min = min(projectedPoints, na.rm = TRUE),
    max = max(projectedPoints, na.rm = TRUE),
    q10 = quantile(projectedPoints, .10, na.rm = TRUE), # 10th quantile
    q90 = quantile(projectedPoints, .90, na.rm = TRUE), # 90th quantile
    range = max(projectedPoints, na.rm = TRUE) - min(projectedPoints, na.rm = TRUE),
    IQR = IQR(projectedPoints, na.rm = TRUE),
    MAD = mad(projectedPoints, na.rm = TRUE),
    CV = SD/mean,
    median = median(projectedPoints, na.rm = TRUE),
    pseudomedian = DescTools::HodgesLehmann(projectedPoints, na.rm = TRUE),
    mode = petersenlab::Mode(projectedPoints, multipleModes = "mean"),
    skewness = psych::skew(projectedPoints, na.rm = TRUE),
    kurtosis = psych::kurtosi(projectedPoints, na.rm = TRUE)
  )

24.3.1.6 View Players’ Distribution of Projected Fantasy Points

Code

bootstrappedFantasyPoints_summary <- bootstrappedFantasyPoints_summary %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","team","position")],
    by = c("id" = "mfl_id")
  ) %>% 
  select(name, team, position, mean:kurtosis, everything()) %>% 
  arrange(-mean)

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position == "QB") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position == "RB") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position == "WR") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position == "TE") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position == c("K","PK")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("DL","DT","DE")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("LB","MLB","OLB")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

Code

bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("DB","S","CB")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

An example distribution of projected fantasy points is in Figure 24.1.

Code

ggplot2::ggplot(
  data = bootstrappedFantasyPoints %>%
    filter(pos == "QB" & name == "Patrick Mahomes"),
  mapping = aes(
    x = projectedPoints)
) +
  geom_histogram(
    aes(y = after_stat(density)),
    color = "#000000",
    fill = "#0099F8"
  ) +
  geom_density(
    color = "#000000",
    fill = "#F85700",
    alpha = 0.6 # add transparency
  ) +
  geom_rug() +
  #coord_cartesian(
  #  xlim = c(0,400)) +
  labs(
    x = "Fantasy Points",
    y = "Density",
    title = "Distribution of Projected Fantasy Points for Patrick Mahomes"
  ) +
  theme_classic() +
  theme(axis.title.y = element_text(angle = 0, vjust = 0.5)) # horizontal y-axis title

Figure 24.1: Distribution of Projected Fantasy Points for Patrick Mahomes from Bootstrapping.

Projections of two players—one with relatively narrow uncertainty and one with relatively wide uncertainty—are depicted in Figure 24.2.

Code

ggplot2::ggplot(
  data = bootstrappedFantasyPoints %>%
    filter(pos == "QB" & (name %in% c("Dak Prescott", "Drake Maye"))),
  mapping = aes(
    x = projectedPoints,
    group = name,
    #color = name,
    fill = name)
) +
  geom_histogram(
    aes(y = after_stat(density))
  ) +
  geom_density(
    alpha = 0.6 # add transparency
  ) +
  coord_cartesian(
    xlim = c(0,NA),
    expand = FALSE) +
  #geom_rug() +
  labs(
    x = "Fantasy Points",
    y = "Density",
    fill = "",
    color = "",
    title = "Distribution of Projected Fantasy Points"
  ) +
  theme_classic() +
  theme(axis.title.y = element_text(angle = 0, vjust = 0.5)) # horizontal y-axis title

Figure 24.2: Distribution of Projected Fantasy Points for Two Players from Bootstrapping. There is relatively narrow uncertainty around projected fantasy points for Dak Prescott, whereas there is relatively wide uncertainty around the projected fantasy points for Drake Maye.

24.3.2 Monte Carlo Simulation

24.3.2.1 `SimDesign` Package

You can generate a template for Monte Carlo simulations in the SimDesign (Chalmers, 2025; Chalmers & Adkins, 2020) package using the following code:

Code

SimDesign::SimFunctions()

#-------------------------------------------------------------------

library(SimDesign)

Design <- createDesign(factor1 = NA,
                       factor2 = NA)

#-------------------------------------------------------------------

Generate <- function(condition, fixed_objects) {
    dat <- data.frame()
    dat
}

Analyse <- function(condition, dat, fixed_objects) {
    ret <- nc(stat1 = NaN, stat2 = NaN)
    ret
}

Summarise <- function(condition, results, fixed_objects) {
    ret <- c(bias = NaN, RMSE = NaN)
    ret
}

#-------------------------------------------------------------------

res <- runSimulation(design=Design, replications=2, generate=Generate, 
                     analyse=Analyse, summarise=Summarise)
res

24.3.2.2 Prepare Data

Code

all_proj <- all_proj %>% 
  rename(projectedPoints = raw_points)

24.3.2.3 Optimal Distribution for Each Player

For each player, we identify the optimal distribution as either a normal distribution , or as a skew-normal distribution. The normal distribution was fit using the fitdist() function of the fitdistrplus package (Delignette-Muller & Dutang, 2015; Delignette-Muller et al., 2025). The skew-normal distribution was fit using the selm() function of the sn package (A. Azzalini, 2023; A. A. Azzalini, 2023).

Code

# Function to identify the "best" distribution (Normal vs Skew‑Normal) for every player (tries both families and picks by AIC; uses empirical distribution if fewer than 2 unique scores)
fit_best <- function(x) {
  # Basic checks
  if (length(unique(x)) < 2 || all(is.na(x))) { # Use empirical distribution if there are fewer than 2 unique scores
    return(list(type = "empirical", empirical = x))
  }

  # Try Normal Distribution
  fit_norm <- tryCatch(
    fitdistrplus::fitdist(x, distr = "norm"),
    error = function(e) NULL
  )

  # Try Skew-Normal Distribution
  fit_skew <- tryCatch(
    sn::selm(x ~ 1),
    error = function(e) NULL
  )

  # Handle bad fits: sd = NA, etc.
  if (!is.null(fit_norm) && any(is.na(fit_norm$estimate))) {
    fit_norm <- NULL
  }

  if (!is.null(fit_skew)) {
    pars <- tryCatch(sn::coef(fit_skew, param.type = "dp"), error = function(e) NULL)
    if (is.null(pars) || any(is.na(pars))) {
      fit_skew <- NULL
    }
  }

  # Choose best available
  if (!is.null(fit_norm) && !is.null(fit_skew)) {
    aic_norm <- AIC(fit_norm)
    aic_skew <- AIC(fit_skew)
    if (aic_skew + 2 < aic_norm) { # skew-normal is more complex (has more parameters) than normal distribution, so only select a skew-normal distribution if it fits substantially better than a normal distribution
      pars <- sn::coef(fit_skew, param.type = "dp")
      return(list(
        type  = "skewnorm",
        xi    = pars["dp.location"],
        omega = pars["dp.scale"],
        alpha = pars["dp.shape"]))
    } else {
      return(list(
        type = "norm",
        mean = fit_norm$estimate["mean"],
        sd   = fit_norm$estimate["sd"]))
    }
  } else if (!is.null(fit_norm)) {
    return(list(
      type = "norm",
      mean = fit_norm$estimate["mean"],
      sd   = fit_norm$estimate["sd"]))
  } else {
    return(list(type = "empirical", empirical = x))
  }
}

Code

proj_dists_tbl <- all_proj %>%
  filter(!is.na(id) & id != "") %>%
  group_by(id) %>%
  summarise(
    dist_info = list(fit_best(projectedPoints)),
    n_proj = n(), # record of how many sources they have
    .groups = "drop"
  )

proj_dists <- proj_dists_tbl %>%
  filter(!is.na(id) & id != "") %>%
  distinct(id, .keep_all = TRUE) %>%
  (\(x) setNames(x$dist_info, x$id))()

Code

proj_dists_tbl %>%
  dplyr::mutate(
    dist_type = purrr::map_chr(dist_info, ~ .x$type)
  ) %>%
  dplyr::count(dist_type)

24.3.2.4 `SimDesign` Step 1: Design Grid

Now we build the SimDesign design grid based on the number of projections that each player had.

Code

Design <- proj_dists_tbl %>%
  dplyr::mutate(
    id,
    n_sources = n_proj,
    .keep = "none"
  )

Code

missing_ids <- setdiff(Design$id, names(proj_dists))
length(missing_ids) # should be 0

any(is.na(proj_dists_tbl$id)) # should be FALSE
any(is.na(Design$id))         # should be FALSE
any(is.na(names(proj_dists))) # should be FALSE

24.3.2.5 `SimDesign` Step 2: Generate

Code

Generate <- function(condition, fixed_objects = NULL) {

  dist_info <- fixed_objects$proj_dists[[as.character(condition$id)]]
  n_sources <- condition$n_sources

  sim_points <- switch(
    dist_info$type,
    empirical = sample(
      dist_info$empirical,
      n_sources,
      replace = TRUE),
    
    norm = rnorm(
      n_sources,
      mean = dist_info$mean,
      sd = dist_info$sd),
    
    skewnorm = sn::rsn(
      n_sources,
      xi = dist_info$xi,
      omega = dist_info$omega,
      alpha = dist_info$alpha),
    
    stop("Unknown distribution type: ", dist_info$type)
  )

  data.frame(
    id = condition$id,
    sim_points = sim_points)
}

24.3.2.6 `SimDesign` Step 3: Analyze

Code

Analyse <- function(condition, dat, fixed_objects = NULL) {
  tibble::tibble(
    id        = condition$id,
    mean_pts  = mean(dat$sim_points, na.rm = TRUE),
    sd_pts    = sd(dat$sim_points, na.rm = TRUE),
    q10       = quantile(dat$sim_points, 0.10, na.rm = TRUE),
    q90       = quantile(dat$sim_points, 0.90, na.rm = TRUE),
    p100      = mean(dat$sim_points >= 100, na.rm = TRUE),
    p150      = mean(dat$sim_points >= 150, na.rm = TRUE),
    p200      = mean(dat$sim_points >= 200, na.rm = TRUE),
    p250      = mean(dat$sim_points >= 250, na.rm = TRUE),
    p300      = mean(dat$sim_points >= 300, na.rm = TRUE),
    p350      = mean(dat$sim_points >= 350, na.rm = TRUE)
  )
}

24.3.2.7 `SimDesign` Step 4: Summarize

Code

Summarise <- function(condition, results, fixed_objects = NULL) {
  dplyr::summarise(
    results,
    across(
      where(is.numeric),
      list(
        mean = ~mean(.x, na.rm = TRUE),
        sd   = ~sd(.x,  na.rm = TRUE)),
      .names = "{.col}_{.fn}"
    ),
    .groups = "drop"
  )
}

24.3.2.8 `SimDesign` Step 5: Run the Simulation

Now, we can run the model.

Note 24.1: Monte Carlo Simulation

Note: the following code that runs the simulation takes a while. If you just want to save time and load the results object instead of running the simulation, you can load the results object of the simulation (which has already been run) using this code:

Code

load(url("https://osf.io/download/ues7n/"))

Code

monteCarloSim_results <- SimDesign::runSimulation(
  design = Design,
  replications = 1000,
  generate = Generate,
  analyse = Analyse,
  summarise = Summarise,
  fixed_objects = list(proj_dists = proj_dists),
  seed = genSeeds(Design, iseed = 52242), # for reproducibility
  parallel = TRUE # for faster (parallel) processing
)

24.3.2.9 Simulation Results

Code

monteCarloSim_results <- monteCarloSim_results %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","position","team")],
    by = c("id" = "mfl_id")
  ) %>% 
  select(name, team, position, everything()) %>% 
  arrange(-mean_pts_mean)

The pX variable represent the probability that a player scoring more than X number of points. For example, the p300 variable represents the probability that each player scores more than 300 points. However, it is important to note that this is based on the distribution of projected points.

Code

monteCarloSim_results

Code

monteCarloSim_results %>% 
  filter(position == "QB")

Code

monteCarloSim_results %>% 
  filter(position == "RB")

Code

monteCarloSim_results %>% 
  filter(position == "WR")

Code

monteCarloSim_results %>% 
  filter(position == "TE")

Code

monteCarloSim_results %>% 
  filter(position %in% c("K","PK"))

Code

monteCarloSim_results %>% 
  filter(position %in% c("DL","DT","DE"))

Code

monteCarloSim_results %>% 
  filter(position %in% c("LB","MLB","OLB"))

Code

monteCarloSim_results %>% 
  filter(position %in% c("DB","S","CB"))

24.4 Conclusion

A simulation is an “imitative representation” of a phenomenon that could exist the real world. In statistics, simulations are computer-driven investigations to better understand a phenomenon by studying its behavior under different conditions. Statistical simulations can be conducted in various ways. Two common types of simulations are bootstrapping and Monte Carlo simulation. Bootstrapping involves repeated resampling (with replacement) from observed data. Monte Carlo simulation involves repeated random sampling from a known distribution. We demonstrated bootstrapping and Monte Carlo approaches to simulating the most likely range of outcomes for a player in terms of fantasy points.

24.5 Session Info

Code

sessionInfo()

R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] lubridate_1.9.4        forcats_1.0.0          stringr_1.5.1         
 [4] dplyr_1.1.4            purrr_1.1.0            readr_2.1.5           
 [7] tidyr_1.3.1            tibble_3.3.0           ggplot2_3.5.2         
[10] tidyverse_2.0.0        sn_2.1.1               fitdistrplus_1.2-4    
[13] survival_3.8-3         MASS_7.3-65            SimDesign_2.20.0      
[16] progressr_0.15.1       future.apply_1.20.0    future_1.58.0         
[19] data.table_1.17.8      ffanalytics_3.1.6.0000

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3  rstudioapi_0.17.1   audio_0.1-11       
  [4] jsonlite_2.0.0      magrittr_2.0.3      farver_2.1.2       
  [7] nloptr_2.2.1        rmarkdown_2.29      fs_1.6.6           
 [10] vctrs_0.6.5         minqa_1.2.8         base64enc_0.1-3    
 [13] htmltools_0.5.8.1   haven_2.5.5         cellranger_1.1.0   
 [16] Formula_1.2-5       parallelly_1.45.0   htmlwidgets_1.6.4  
 [19] plyr_1.8.9          testthat_3.2.3      httr2_1.2.1        
 [22] rootSolve_1.8.2.4   lifecycle_1.0.4     pkgconfig_2.0.3    
 [25] Matrix_1.7-3        R6_2.6.1            fastmap_1.2.0      
 [28] rbibutils_2.3       digest_0.6.37       Exact_3.3          
 [31] numDeriv_2016.8-1.1 colorspace_2.1-1    ps_1.9.1           
 [34] Hmisc_5.2-3         labeling_0.4.3      timechange_0.3.0   
 [37] httr_1.4.7          compiler_4.5.1      proxy_0.4-27       
 [40] withr_3.0.2         htmlTable_2.4.3     backports_1.5.0    
 [43] DBI_1.2.3           psych_2.5.6         R.utils_2.13.0     
 [46] rappdirs_0.3.3      sessioninfo_1.2.3   petersenlab_1.1.7  
 [49] gld_2.6.7           tools_4.5.1         chromote_0.5.1     
 [52] pbivnorm_0.6.0      foreign_0.8-90      nnet_7.3-20        
 [55] R.oo_1.27.1         glue_1.8.0          quadprog_1.5-8     
 [58] nlme_3.1-168        promises_1.3.3      grid_4.5.1         
 [61] checkmate_2.3.2     reshape2_1.4.4      cluster_2.1.8.1    
 [64] generics_0.1.4      gtable_0.3.6        tzdb_0.5.0         
 [67] R.methodsS3_1.8.2   class_7.3-23        websocket_1.4.4    
 [70] lmom_3.2            hms_1.1.3           xml2_1.3.8         
 [73] pillar_1.11.0       later_1.4.2         mitools_2.4        
 [76] splines_4.5.1       lattice_0.22-7      tidyselect_1.2.1   
 [79] pbapply_1.7-4       mix_1.0-13          knitr_1.50         
 [82] reformulas_0.4.1    gridExtra_2.3       xfun_0.52          
 [85] expm_1.0-0          brio_1.1.5          stringi_1.8.7      
 [88] yaml_2.3.10         boot_1.3-31         evaluate_1.0.4     
 [91] codetools_0.2-20    beepr_2.0           cli_3.6.5          
 [94] rpart_4.1.24        xtable_1.8-4        DescTools_0.99.60  
 [97] Rdpack_2.6.4        processx_3.8.6      lavaan_0.6-19      
[100] Rcpp_1.1.0          readxl_1.4.5        globals_0.18.0     
[103] parallel_4.5.1      lme4_1.1-37         listenv_0.9.1      
[106] viridisLite_0.4.2   mvtnorm_1.3-3       scales_1.4.0       
[109] e1071_1.7-16        rrapply_1.2.7       rlang_1.1.6        
[112] rvest_1.0.4         mnormt_2.1.1

Azzalini, A. (2023). sn: The skew-normal and related distributions such as the Skew-t and the SUN. https://doi.org/10.32614/CRAN.package.sn

Azzalini, A. A. (2023). The R package sn: The skew-normal and related distributions such as the skew-$t$ and the SUN (version 2.1.1). https://cran.r-project.org/package=sn

Barrett, T., Dowle, M., Srinivasan, A., Gorecki, J., Chirico, M., Hocking, T., Schwendinger, B., & Krylov, I. (2025). data.table: Extension of ‘data.frame‘. https://doi.org/10.32614/CRAN.package.data.table

Bengtsson, H. (2024). progressr: An inclusive, unifying API for progress updates. https://doi.org/10.32614/CRAN.package.progressr

Bengtsson, H. (2025a). ‘future‘: Unified parallel and distributed processing in ‘R‘ for everyone. https://doi.org/10.32614/CRAN.package.future

Bengtsson, H. (2025b). ‘future.apply‘: Apply function to elements in parallel using futures. https://doi.org/10.32614/CRAN.package.future.apply

Chalmers, R. P. (2025). SimDesign: Structure for organizing Monte Carlo simulation designs. https://doi.org/10.32614/CRAN.package.SimDesign

Chalmers, R. P., & Adkins, M. C. (2020). Writing effective and reliable Monte Carlo simulations with the SimDesign package. The Quantitative Methods for Psychology, 16(4), 248–280. https://doi.org/10.20982/tqmp.16.4.p248

Corston, R., & Colman, A. M. (2000). A crash course in SPSS for Windows. Wiley-Blackwell.

Delignette-Muller, M. L., & Dutang, C. (2015). fitdistrplus: An R package for fitting distributions. Journal of Statistical Software, 64(4), 1–34. https://doi.org/10.18637/jss.v064.i04

Delignette-Muller, M.-L., Dutang, C., & Siberchicot, A. (2025). fitdistrplus: Help to fit of a parametric distribution to non-censored or censored data. https://doi.org/10.32614/CRAN.package.fitdistrplus

Sigal, M. J., & Chalmers, R. P. (2016). Play it again: Teaching statistics with Monte Carlo simulation. Journal of Statistics Education, 24(3), 136–156. https://doi.org/10.1080/10691898.2016.1246953

24 Simulation: Bootstrapping and the Monte Carlo Method

24.1 Getting Started

24.1.1 Load Packages

24.1.2 Load Data

24.2 Overview

24.2.1 Bootstrapping

24.2.2 Monte Carlo Simulation

24.3 Simulation of Projected Statistics and Points

24.3.1 Bootstrapping

24.3.1.1 Prepare Data

24.3.1.2 Bootstrapping Function

24.3.1.3 Run the Bootstrapping Simulation

24.3.1.4 Score Fantasy Points from the Simulation

24.3.1.5 Summarize Players’ Distribution of Projected Fantasy Points

24.3.1.6 View Players’ Distribution of Projected Fantasy Points

24.3.2 Monte Carlo Simulation

24.3.2.1 `SimDesign` Package

24.3.2.2 Prepare Data

24.3.2.3 Optimal Distribution for Each Player

24.3.2.4 `SimDesign` Step 1: Design Grid

24.3.2.5 `SimDesign` Step 2: Generate

24.3.2.6 `SimDesign` Step 3: Analyze

24.3.2.7 `SimDesign` Step 4: Summarize

24.3.2.8 `SimDesign` Step 5: Run the Simulation

24.3.2.9 Simulation Results

24.4 Conclusion

24.5 Session Info

Feedback

Email Notification

24.1 Getting Started

24.1.1 Load Packages

24.1.2 Load Data

24.2 Overview

24.2.1 Bootstrapping

24.2.2 Monte Carlo Simulation

24.3 Simulation of Projected Statistics and Points

24.3.1 Bootstrapping

24.3.1.1 Prepare Data

24.3.1.2 Bootstrapping Function

24.3.1.3 Run the Bootstrapping Simulation

24.3.1.4 Score Fantasy Points from the Simulation

24.3.1.5 Summarize Players’ Distribution of Projected Fantasy Points

24.3.1.6 View Players’ Distribution of Projected Fantasy Points

24.3.2 Monte Carlo Simulation

24.3.2.1 SimDesign Package

24.3.2.2 Prepare Data

24.3.2.3 Optimal Distribution for Each Player

24.3.2.4 SimDesign Step 1: Design Grid

24.3.2.5 SimDesign Step 2: Generate

24.3.2.6 SimDesign Step 3: Analyze

24.3.2.7 SimDesign Step 4: Summarize

24.3.2.8 SimDesign Step 5: Run the Simulation

24.3.2.9 Simulation Results

24.4 Conclusion

24.5 Session Info

Feedback

Email Notification

24.3.2.1 `SimDesign` Package

24.3.2.4 `SimDesign` Step 1: Design Grid

24.3.2.5 `SimDesign` Step 2: Generate

24.3.2.6 `SimDesign` Step 3: Analyze

24.3.2.7 `SimDesign` Step 4: Summarize

24.3.2.8 `SimDesign` Step 5: Run the Simulation