
24  Simulation: Bootstrapping and the Monte Carlo Method

This chapter provides an overview of various approaches to simulation, including bootstrapping and the Monte Carlo method.

24.1 Getting Started

24.1.1 Load Packages

Code
library("ffanalytics")
library("data.table")
library("future")
library("future.apply")
library("progressr")
library("SimDesign")
library("fitdistrplus")
library("sn")
library("tidyverse")

24.1.2 Load Data

Code
load(file = "./data/players_projectedPoints_seasonal.RData")
load(file = "./data/nfl_playerIDs.RData")

24.2 Overview

A simulation is an “imitative representation” of a phenomenon that could exist in the real world. In statistics, simulations are computer-driven investigations to better understand a phenomenon by studying its behavior under different conditions. For instance, we might want to determine the likely range of fantasy points that a player might score over the course of a season. Simulations can be conducted in various ways. Two common ways of conducting simulations are via bootstrapping and via Monte Carlo simulation.

24.2.1 Bootstrapping

Bootstrapping involves repeated resampling (with replacement) from observed data. For instance, if we have 100 sources provide projections for a player, we could estimate the most likely range of fantasy points for the player by repeatedly sampling from the 100 projections.
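As a minimal sketch of the idea, the following base R code bootstraps a set of projections for one player. The 100 projections are hypothetical values generated for illustration (they are not from the chapter's data), and the object names are our own.

```r
# Hypothetical data: 100 projections (in fantasy points) for one player
set.seed(52242)
projections <- round(rnorm(100, mean = 250, sd = 30), 1)

n_iter <- 5000

# Resampling individual projections (with replacement) approximates
# the spread of the projections themselves
boot_draws <- sample(projections, size = n_iter, replace = TRUE)
boot_range <- quantile(boot_draws, probs = c(0.10, 0.90))

# Resampling the whole set and recording its mean quantifies
# the uncertainty in the average projection
boot_means <- replicate(
  n_iter,
  mean(sample(projections, size = length(projections), replace = TRUE)))
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_range
boot_ci
```

The first approach (resampling individual values) is closest to what we do later in the chapter; the second is the more traditional use of the bootstrap for estimating the uncertainty of a statistic.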

24.2.2 Monte Carlo Simulation

Monte Carlo simulation involves repeated random sampling from a known distribution (Sigal & Chalmers, 2016). For instance, if we know the population distribution for the likely outcomes for a player (e.g., a normal distribution with a known mean of 150 points and a known standard deviation of 20 points), we can repeatedly sample randomly from this distribution. The distribution could be, as examples, a normal distribution, a log-normal distribution, a binomial distribution, a chi-square distribution, etc. (Sigal & Chalmers, 2016). The distribution provides a probability density function, which indicates the relative likelihood of each possible value if the data arose from that distribution. For a continuous distribution, probabilities correspond to areas under the density curve, such as the probability that a player scores at least a given number of points.
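A minimal sketch of this example in base R follows. It assumes the normal distribution described above (mean of 150 points, standard deviation of 20 points); the 180-point threshold and the object names are chosen only for illustration.

```r
set.seed(52242)
n_sim <- 10000

# Repeatedly sample from the assumed population distribution
sim_points <- rnorm(n_sim, mean = 150, sd = 20)

# Summarize the simulated outcomes
mc_summary <- c(
  mean = mean(sim_points),
  sd = sd(sim_points),
  quantile(sim_points, probs = c(0.10, 0.90)))

# Probability of scoring at least 180 points:
# simulated proportion vs. exact area under the normal curve
p180_sim <- mean(sim_points >= 180)
p180_exact <- pnorm(180, mean = 150, sd = 20, lower.tail = FALSE)

# Height of the probability density function at 150 points
# (a density, not a probability)
dens_150 <- dnorm(150, mean = 150, sd = 20)
```

With enough draws, the simulated proportion should be close to the exact probability; the simulation approach becomes more useful when no closed-form answer is available.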

24.3 Simulation of Projected Statistics and Points

Below, we perform bootstrapping and Monte Carlo simulations of projected statistics and points. However, it is worth noting that, as for any simulation, the quality of the results depends on the quality of the inputs. In this case, the quality of the simulation depends on the quality of the projections. If the projections are no good, the simulation results will not be trustworthy. Garbage in, garbage out. As we evaluated in Section 17.12, projections tend to show moderate accuracy for fantasy performance, but they are not highly accurate. Thus, we should treat simulation results arising from fantasy projections with a good dose of skepticism.

24.3.1 Bootstrapping

24.3.1.1 Prepare Data

Code
all_proj <- dplyr::bind_rows(players_projectedPoints_seasonal)
Code
vars_by_pos <- list(
  QB = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "fumbles_lost", "fumbles_total", "two_pts",
    "sacks",
    "pass_09_tds", "pass_1019_tds", "pass_2029_tds", "pass_3039_tds", "pass_4049_tds", "pass_50_tds",
    "pass_40_yds", "pass_250_yds", "pass_300_yds", "pass_350_yds", "pass_400_yds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds"
    ),
  RB = c(
    "games",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  WR = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  TE = c(
    "games",
    "pass_att", "pass_comp", "pass_inc", "pass_yds", "pass_tds", "pass_int",
    "rush_att", "rush_yds", "rush_tds",
    "rec_tgt", "rec", "rec_yds", "rec_tds", "rec_rz_tgt",
    "fumbles_lost", "fumbles_total", "two_pts",
    "return_yds", "return_tds",
    "rush_09_tds", "rush_1019_tds", "rush_2029_tds", "rush_3039_tds", "rush_4049_tds", "rush_50_tds",
    "rush_40_yds", "rush_50_yds", "rush_100_yds", "rush_150_yds", "rush_200_yds",
    "rec_40_yds", "rec_50_yds", "rec_100_yds", "rec_150_yds", "rec_200_yds"
  ),
  K = c(
    "fg_0019", "fg_2029", "fg_3039", "fg_4049", "fg_50", "fg_50_att",
    "fg_39", "fg_att_39", "fg_49", "fg_49_att",
    "fg", "fg_att", "fg_miss", "xp", "xp_att"
  ),
  D = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DL = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  LB = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DB = c(
    "idp_solo", "idp_asst", "idp_sack", "idp_int", "idp_fum_force", "idp_fum_rec", "idp_pd", "idp_td", "idp_safety"
  ),
  DST = c(
    "dst_fum_recvr", "dst_fum_rec", "dst_int", "dst_safety", "dst_sacks", "dst_td", "dst_blk",
    "dst_fumbles", "dst_tackles", "dst_yds_against", "dst_pts_against", "dst_pts_allowed", "dst_ret_yds"
  )
)

24.3.1.2 Bootstrapping Function

For performing the bootstrapping, we leverage the data.table (Barrett et al., 2025) (data.table::as.data.table(); data.table::data.table(); data.table::rbindlist()), future (Bengtsson, 2025a) (future::plan(); future::multisession()), and future.apply (Bengtsson, 2025b) (future.apply::future_lapply()) packages for speed (by using parallel processing) and memory efficiency. We use the progressr (Bengtsson, 2024) package (progressr::handlers(); progressr::with_progress(); progressr::progressor()) to create a progress bar.

Code
bootstrapSimulation <- function(
    projectedStats,
    vars_by_pos,
    n_iter = 10000,
    seed = NULL,
    progress = TRUE) {
  
  dt <- data.table::as.data.table(projectedStats) # use data.table for speed
  all_ids <- unique(dt$id)
  
  if (!is.null(seed)) set.seed(seed)
  
  future::plan(future::multisession) # parallelize tasks across multiple background R sessions using multiple cores to speed up simulation
  
  if (progress) progressr::handlers("txtprogressbar") # specify progress-bar style
  
  results <- progressr::with_progress({ # wrap in with_progress for progress bar
    p <- if (progress) progressr::progressor(along = all_ids) else NULL # create progressor for progress bar
    
    future.apply::future_lapply(
      all_ids, # apply the function below to each player using a parallelized loop
      function(player_id) {
        if (!is.null(p)) p() # advance progress bar
        
        player_data <- dt[id == player_id]
        player_pos  <- unique(player_data$pos)
        
        if (length(player_pos) != 1 || !player_pos %in% names(vars_by_pos)) return(NULL)
        
        stat_vars <- vars_by_pos[[player_pos]] # pull the relevant stat variables to simulate for this player's position
        out <- data.table(iteration = seq_len(n_iter), id = player_id, pos = player_pos)
        
        for (var in stat_vars) { # loop over each stat variable that should be simulated for the player's position
          if (var %in% names(player_data)) {
            non_na_values <- player_data[[var]][!is.na(player_data[[var]])] # pull non-missing values of the stat for the player (from all projection sources)
            
            if (length(non_na_values) > 0) {
              out[[var]] <- non_na_values[sample.int(length(non_na_values), n_iter, replace = TRUE)] # if there are valid values, sample with replacement to simulate n_iter values; resampling by index avoids sample() treating a single number x as 1:x
            } else {
              out[[var]] <- NA_real_ # specify a numeric missing value (if all values were missing)
            }
          } else {
            out[[var]] <- NA_real_ # specify a numeric missing value (if the stat variable doesn't exist)
          }
        }
        
        return(out)
      },
      future.seed = TRUE # ensures that each parallel process gets a reproducible random seed
    )
  })
  
  data.table::rbindlist(results, use.names = TRUE, fill = TRUE) # combines all the individual player results into one large data table, aligning columns by name; fill = TRUE ensures that missing columns are filled with NA where necessary
}
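To make the core of the function easier to follow, here is a serial, base R sketch of the resampling step for a single player, without the parallel processing or progress bar. The player, the projection values, and the list of statistics are hypothetical.

```r
# Hypothetical projections for one quarterback from five sources
player_data <- data.frame(
  id = "0001",
  pos = "QB",
  pass_yds = c(4100, 4250, NA, 3980, 4400),
  pass_tds = c(28, 31, 30, NA, 33),
  rush_yds = NA_real_) # no source projected this statistic

# rec_yds is deliberately absent from the data
stat_vars <- c("pass_yds", "pass_tds", "rush_yds", "rec_yds")
n_iter <- 1000

set.seed(52242)
out <- data.frame(iteration = seq_len(n_iter), id = "0001", pos = "QB")

for (var in stat_vars) {
  # Pull the non-missing projections of this statistic (if the column exists)
  values <- if (var %in% names(player_data)) player_data[[var]] else NA_real_
  values <- values[!is.na(values)]

  out[[var]] <- if (length(values) > 0) {
    # Resample by index, with replacement; this also behaves correctly
    # when only one projection is available
    values[sample.int(length(values), n_iter, replace = TRUE)]
  } else {
    NA_real_ # nothing to resample from
  }
}

head(out)
```

Note that each statistic is resampled independently of the others, so a simulated row can combine passing yards from one source with passing touchdowns from another. That is a property of this approach rather than of bootstrapping in general.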

24.3.1.3 Run the Bootstrapping Simulation

Code
bootstappedStats <- bootstrapSimulation(
  projectedStats = all_proj,
  vars_by_pos = vars_by_pos,
  n_iter = 5000,
  seed = 52242)

24.3.1.4 Score Fantasy Points from the Simulation

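We rename the iteration column to data_src using data.table::setnames(), so that each iteration is treated like a separate projection source, and then split the simulated statistics by position.

Code
data.table::setnames(bootstappedStats, "iteration", "data_src") # rename the iteration column to data_src by reference (without copying the data)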

bootstappedStatsByPosition <- split(
  bootstappedStats,
  by = "pos",
  keep.by = TRUE)

We convert each position's data.table to a data frame using base::lapply().

Code
bootstappedStatsByPosition <- lapply(
  bootstappedStatsByPosition,
  setDF)

attr(bootstappedStatsByPosition, "season") <- 2024
attr(bootstappedStatsByPosition, "week") <- 0
Code
bootstrappedFantasyPoints <- ffanalytics:::source_points(
  data_result = bootstappedStatsByPosition,
  scoring_rules = ffanalytics::scoring)
Code
bootstrappedFantasyPoints$iteration <- bootstrappedFantasyPoints$data_src
bootstrappedFantasyPoints$data_src <- NULL

bootstrappedFantasyPoints <- bootstrappedFantasyPoints %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","team")],
    by = c("id" = "mfl_id")
  )

bootstrappedFantasyPoints <- bootstrappedFantasyPoints %>% 
  rename(projectedPoints = raw_points)

24.3.1.5 Summarize Players’ Distribution of Projected Fantasy Points

Code
bootstrappedFantasyPoints_summary <- bootstrappedFantasyPoints %>% 
  group_by(id) %>% 
  summarise(
    mean = mean(projectedPoints, na.rm = TRUE),
    SD = sd(projectedPoints, na.rm = TRUE),
    min = min(projectedPoints, na.rm = TRUE),
    max = max(projectedPoints, na.rm = TRUE),
    q10 = quantile(projectedPoints, .10, na.rm = TRUE), # 10th percentile
    q90 = quantile(projectedPoints, .90, na.rm = TRUE), # 90th percentile
    range = max(projectedPoints, na.rm = TRUE) - min(projectedPoints, na.rm = TRUE),
    IQR = IQR(projectedPoints, na.rm = TRUE),
    MAD = mad(projectedPoints, na.rm = TRUE),
    CV = SD/mean,
    median = median(projectedPoints, na.rm = TRUE),
    pseudomedian = DescTools::HodgesLehmann(projectedPoints, na.rm = TRUE),
    mode = petersenlab::Mode(projectedPoints, multipleModes = "mean"),
    skewness = psych::skew(projectedPoints, na.rm = TRUE),
    kurtosis = psych::kurtosi(projectedPoints, na.rm = TRUE)
  )

24.3.1.6 View Players’ Distribution of Projected Fantasy Points

Code
bootstrappedFantasyPoints_summary <- bootstrappedFantasyPoints_summary %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","team","position")],
    by = c("id" = "mfl_id")
  ) %>% 
  select(name, team, position, mean:kurtosis, everything()) %>% 
  arrange(-mean)
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position == "QB") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position == "RB") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position == "WR") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position == "TE") %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("K","PK")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("DL","DT","DE")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("LB","MLB","OLB")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))
Code
bootstrappedFantasyPoints_summary %>% 
  filter(position %in% c("DB","S","CB")) %>% 
  mutate(
    across(
      where(is.numeric),
      \(x) round(x, digits = 2)))

An example distribution of projected fantasy points is in Figure 24.1.

Code
ggplot2::ggplot(
  data = bootstrappedFantasyPoints %>%
    filter(pos == "QB" & name == "Patrick Mahomes"),
  mapping = aes(
    x = projectedPoints)
) +
  geom_histogram(
    aes(y = after_stat(density)),
    color = "#000000",
    fill = "#0099F8"
  ) +
  geom_density(
    color = "#000000",
    fill = "#F85700",
    alpha = 0.6 # add transparency
  ) +
  geom_rug() +
  #coord_cartesian(
  #  xlim = c(0,400)) +
  labs(
    x = "Fantasy Points",
    y = "Density",
    title = "Distribution of Projected Fantasy Points for Patrick Mahomes"
  ) +
  theme_classic() +
  theme(axis.title.y = element_text(angle = 0, vjust = 0.5)) # horizontal y-axis title
Figure 24.1: Distribution of Projected Fantasy Points for Patrick Mahomes from Bootstrapping.

Projections of two players—one with relatively narrow uncertainty and one with relatively wide uncertainty—are depicted in Figure 24.2.

Code
ggplot2::ggplot(
  data = bootstrappedFantasyPoints %>%
    filter(pos == "QB" & (name %in% c("Dak Prescott", "Drake Maye"))),
  mapping = aes(
    x = projectedPoints,
    group = name,
    #color = name,
    fill = name)
) +
  geom_histogram(
    aes(y = after_stat(density))
  ) +
  geom_density(
    alpha = 0.6 # add transparency
  ) +
  coord_cartesian(
    xlim = c(0,NA),
    expand = FALSE) +
  #geom_rug() +
  labs(
    x = "Fantasy Points",
    y = "Density",
    fill = "",
    color = "",
    title = "Distribution of Projected Fantasy Points"
  ) +
  theme_classic() +
  theme(axis.title.y = element_text(angle = 0, vjust = 0.5)) # horizontal y-axis title
Figure 24.2: Distribution of Projected Fantasy Points for Two Players from Bootstrapping. There is relatively narrow uncertainty around projected fantasy points for Dak Prescott, whereas there is relatively wide uncertainty around the projected fantasy points for Drake Maye.

24.3.2 Monte Carlo Simulation

24.3.2.1 SimDesign Package

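You can generate a template for Monte Carlo simulations in the SimDesign (Chalmers, 2025; Chalmers & Adkins, 2020) package using the SimDesign::SimFunctions() function. The template follows the steps used below: a design grid, followed by Generate, Analyse, and Summarise functions.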

24.3.2.2 Prepare Data

Code
all_proj <- all_proj %>% 
  rename(projectedPoints = raw_points)

24.3.2.3 Optimal Distribution for Each Player

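For each player, we identify the optimal distribution as either a normal distribution or a skew-normal distribution. We fit the normal distribution using the fitdistrplus::fitdist() function of the fitdistrplus package (Delignette-Muller & Dutang, 2015; Delignette-Muller et al., 2025). We fit the skew-normal distribution using the sn::selm() function of the sn package (A. Azzalini, 2023; A. A. Azzalini, 2023). We compare the two fits using the Akaike information criterion (AIC), and we select the skew-normal distribution only if its AIC is more than 2 points lower than the AIC of the normal distribution. If a player has fewer than two unique projected scores, or if neither distribution can be fit, we fall back to the empirical distribution of the player's projections.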

Code
# Function to identify the "best" distribution (Normal vs Skew‑Normal) for every player (tries both families and picks by AIC; uses empirical distribution if fewer than 2 unique scores)
fit_best <- function(x) {
  # Basic checks
  if (length(unique(x)) < 2 || all(is.na(x))) { # Use empirical distribution if there are fewer than 2 unique scores
    return(list(type = "empirical", empirical = x))
  }

  # Try Normal Distribution
  fit_norm <- tryCatch(
    fitdistrplus::fitdist(x, distr = "norm"),
    error = function(e) NULL
  )

  # Try Skew-Normal Distribution
  fit_skew <- tryCatch(
    sn::selm(x ~ 1),
    error = function(e) NULL
  )

  # Handle bad fits: sd = NA, etc.
  if (!is.null(fit_norm) && any(is.na(fit_norm$estimate))) {
    fit_norm <- NULL
  }

  if (!is.null(fit_skew)) {
    pars <- tryCatch(sn::coef(fit_skew, param.type = "dp"), error = function(e) NULL)
    if (is.null(pars) || any(is.na(pars))) {
      fit_skew <- NULL
    }
  }

  # Choose best available
  if (!is.null(fit_norm) && !is.null(fit_skew)) {
    aic_norm <- AIC(fit_norm)
    aic_skew <- AIC(fit_skew)
    if (aic_skew + 2 < aic_norm) { # skew-normal is more complex (has more parameters) than normal distribution, so only select a skew-normal distribution if it fits substantially better than a normal distribution
      pars <- sn::coef(fit_skew, param.type = "dp")
      return(list(
        type  = "skewnorm",
        xi    = pars["dp.location"],
        omega = pars["dp.scale"],
        alpha = pars["dp.shape"]))
    } else {
      return(list(
        type = "norm",
        mean = fit_norm$estimate["mean"],
        sd   = fit_norm$estimate["sd"]))
    }
  } else if (!is.null(fit_norm)) {
    return(list(
      type = "norm",
      mean = fit_norm$estimate["mean"],
      sd   = fit_norm$estimate["sd"]))
  } else {
    return(list(type = "empirical", empirical = x))
  }
}
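The selection rule can be illustrated with a self-contained sketch in base R. This is not how sn::selm() estimates the model: here, as a simplifying assumption, we write the skew-normal log-likelihood by hand and maximize it with optim(). The function names, the starting values, and the generated data are all hypothetical.

```r
# Log-density of the skew-normal distribution with
# location xi, scale omega, and shape alpha
dsn_log <- function(x, xi, omega, alpha) {
  z <- (x - xi) / omega
  log(2) - log(omega) + dnorm(z, log = TRUE) + pnorm(alpha * z, log.p = TRUE)
}

# AIC = 2 * (number of parameters) - 2 * (maximized log-likelihood)
aic_normal <- function(x) {
  mu <- mean(x)
  sigma <- sqrt(mean((x - mu)^2)) # maximum likelihood estimate of the SD
  2 * 2 - 2 * sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
}

aic_skewnormal <- function(x) {
  m <- mean(x)
  s <- sd(x)
  # Start the shape parameter on the side suggested by the sample skewness
  alpha_start <- if (mean((x - m)^3) >= 0) 1 else -1
  # Parameters are expressed on a standardized scale to help the optimizer
  negloglik <- function(par) {
    -sum(dsn_log(x, xi = m + s * par[1], omega = s * exp(par[2]), alpha = par[3]))
  }
  fit <- optim(c(0, 0, alpha_start), negloglik, method = "BFGS")
  2 * 3 + 2 * fit$value # fit$value is the minimized negative log-likelihood
}

# Prefer the simpler normal distribution unless the skew-normal
# distribution improves the AIC by more than the margin
choose_distribution <- function(x, margin = 2) {
  if (aic_skewnormal(x) + margin < aic_normal(x)) "skewnorm" else "norm"
}

# Hypothetical right-skewed projections, generated from a
# skew-normal distribution with shape alpha = 5
set.seed(52242)
delta <- 5 / sqrt(1 + 5^2)
skewed <- 200 + 40 * (delta * abs(rnorm(500)) + sqrt(1 - delta^2) * rnorm(500))

best <- choose_distribution(skewed)
best
```

The margin of 2 reflects the extra shape parameter: a more complex distribution is chosen only when it fits clearly better, not when it is merely tied.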
Code
proj_dists_tbl <- all_proj %>%
  filter(!is.na(id) & id != "") %>%
  group_by(id) %>%
  summarise(
    dist_info = list(fit_best(projectedPoints)),
    n_proj = n(), # record of how many sources they have
    .groups = "drop"
  )

proj_dists <- proj_dists_tbl %>%
  filter(!is.na(id) & id != "") %>%
  distinct(id, .keep_all = TRUE) %>%
  (\(x) setNames(x$dist_info, x$id))()
Code
proj_dists_tbl %>%
  dplyr::mutate(
    dist_type = purrr::map_chr(dist_info, ~ .x$type)
  ) %>%
  dplyr::count(dist_type)

24.3.2.4 SimDesign Step 1: Design Grid

Now we build the SimDesign design grid based on the number of projections that each player had.

Code
Design <- proj_dists_tbl %>%
  dplyr::mutate(
    id,
    n_sources = n_proj,
    .keep = "none"
  )
Code
missing_ids <- setdiff(Design$id, names(proj_dists))
length(missing_ids) # should be 0

any(is.na(proj_dists_tbl$id)) # should be FALSE
any(is.na(Design$id))         # should be FALSE
any(is.na(names(proj_dists))) # should be FALSE

24.3.2.5 SimDesign Step 2: Generate

Code
Generate <- function(condition, fixed_objects = NULL) {

  dist_info <- fixed_objects$proj_dists[[as.character(condition$id)]]
  n_sources <- condition$n_sources

  sim_points <- switch(
    dist_info$type,
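    empirical = dist_info$empirical[sample.int( # resample by index so that a single value x is not treated as 1:x by sample()
      length(dist_info$empirical),
      n_sources,
      replace = TRUE)],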
    
    norm = rnorm(
      n_sources,
      mean = dist_info$mean,
      sd = dist_info$sd),
    
    skewnorm = sn::rsn(
      n_sources,
      xi = dist_info$xi,
      omega = dist_info$omega,
      alpha = dist_info$alpha),
    
    stop("Unknown distribution type: ", dist_info$type)
  )

  data.frame(
    id = condition$id,
    sim_points = sim_points)
}
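The generate step can also be tried outside of SimDesign. The sketch below mirrors its logic in base R for three hypothetical players. The rskewnorm() helper is our own stand-in for sn::rsn(), based on a standard stochastic representation of the skew-normal distribution; the function names and parameter values are assumptions made for illustration.

```r
# Draw n values from a skew-normal distribution via its stochastic
# representation (a base R stand-in for sn::rsn())
rskewnorm <- function(n, xi, omega, alpha) {
  delta <- alpha / sqrt(1 + alpha^2)
  z <- delta * abs(rnorm(n)) + sqrt(1 - delta^2) * rnorm(n)
  xi + omega * z
}

# Dispatch on the type of distribution that was selected for the player
simulate_points <- function(dist_info, n) {
  switch(
    dist_info$type,
    empirical = dist_info$empirical[
      sample.int(length(dist_info$empirical), n, replace = TRUE)],
    norm = rnorm(n, mean = dist_info$mean, sd = dist_info$sd),
    skewnorm = rskewnorm(n, dist_info$xi, dist_info$omega, dist_info$alpha),
    stop("Unknown distribution type: ", dist_info$type))
}

# Hypothetical fitted distributions for three players
dists <- list(
  p1 = list(type = "norm", mean = 250, sd = 25),
  p2 = list(type = "skewnorm", xi = 120, omega = 40, alpha = 4),
  p3 = list(type = "empirical", empirical = c(95))) # a single projection

set.seed(52242)
sims <- lapply(dists, simulate_points, n = 1000)

sapply(sims, mean)
```

For the player with a single projection, every simulated value equals that projection, so the simulation conveys no uncertainty for that player.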

24.3.2.6 SimDesign Step 3: Analyze

Code
Analyse <- function(condition, dat, fixed_objects = NULL) {
  tibble::tibble(
    id        = condition$id,
    mean_pts  = mean(dat$sim_points, na.rm = TRUE),
    sd_pts    = sd(dat$sim_points, na.rm = TRUE),
    q10       = quantile(dat$sim_points, 0.10, na.rm = TRUE),
    q90       = quantile(dat$sim_points, 0.90, na.rm = TRUE),
    p100      = mean(dat$sim_points >= 100, na.rm = TRUE),
    p150      = mean(dat$sim_points >= 150, na.rm = TRUE),
    p200      = mean(dat$sim_points >= 200, na.rm = TRUE),
    p250      = mean(dat$sim_points >= 250, na.rm = TRUE),
    p300      = mean(dat$sim_points >= 300, na.rm = TRUE),
    p350      = mean(dat$sim_points >= 350, na.rm = TRUE)
  )
}
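The threshold variables in the analyse step rely on the fact that the mean of a logical vector is the proportion of TRUE values. A small base R sketch, using hypothetical simulated points drawn from an assumed normal distribution, shows the idea and compares the simulated proportions with the exact probabilities.

```r
set.seed(52242)

# Hypothetical simulated season totals for one player
sim_points <- rnorm(5000, mean = 260, sd = 35)

thresholds <- c(100, 150, 200, 250, 300, 350)

# Proportion of simulated values at or above each threshold
p_exceed <- sapply(thresholds, function(cutoff) mean(sim_points >= cutoff))
names(p_exceed) <- paste0("p", thresholds)

# Exact probabilities under the assumed normal distribution, for comparison
p_exact <- pnorm(thresholds, mean = 260, sd = 35, lower.tail = FALSE)

round(rbind(simulated = p_exceed, exact = p_exact), 3)
```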

24.3.2.7 SimDesign Step 4: Summarize

Code
Summarise <- function(condition, results, fixed_objects = NULL) {
  dplyr::summarise(
    results,
    across(
      where(is.numeric),
      list(
        mean = ~mean(.x, na.rm = TRUE),
        sd   = ~sd(.x,  na.rm = TRUE)),
      .names = "{.col}_{.fn}"
    ),
    .groups = "drop"
  )
}
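To show what the summarise step does with the results of many replications, here is a base R sketch for one hypothetical player. Each replication generates a small set of simulated points and analyses it; the summary then averages each statistic across replications and records how much it varies from one replication to the next. The distribution, the number of sources, and the object names are assumptions made for illustration.

```r
set.seed(52242)
n_replications <- 1000
n_sources <- 8 # hypothetical number of projection sources for the player

# One replication: generate n_sources simulated values, then analyse them
one_replication <- function() {
  sim_points <- rnorm(n_sources, mean = 260, sd = 35)
  c(
    mean_pts = mean(sim_points),
    sd_pts = sd(sim_points),
    p300 = mean(sim_points >= 300))
}

# Rows are replications; columns are the statistics from each replication
results <- t(replicate(n_replications, one_replication()))

# Summarise: the mean and the standard deviation of each statistic
# across replications
summary_tbl <- rbind(
  mean = colMeans(results),
  sd = apply(results, 2, sd))

round(summary_tbl, 3)
```

The standard deviation across replications indicates how stable each statistic is when it is computed from only a handful of simulated values per replication.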

24.3.2.8 SimDesign Step 5: Run the Simulation

Now, we can run the simulation using the SimDesign::runSimulation() function.

Note 24.1: Monte Carlo Simulation

Note: the code below that runs the simulation takes a while. To save time, you can instead load the results object from a simulation that has already been run, using this code:

Code
load(url("https://osf.io/download/ues7n/"))
Code
monteCarloSim_results <- SimDesign::runSimulation(
  design = Design,
  replications = 1000,
  generate = Generate,
  analyse = Analyse,
  summarise = Summarise,
  fixed_objects = list(proj_dists = proj_dists),
  seed = SimDesign::genSeeds(Design, iseed = 52242), # for reproducibility
  parallel = TRUE # for faster (parallel) processing
)

24.3.2.9 Simulation Results

Code
monteCarloSim_results <- monteCarloSim_results %>% 
  left_join(
    nfl_playerIDs[,c("mfl_id","name","merge_name","position","team")],
    by = c("id" = "mfl_id")
  ) %>% 
  select(name, team, position, everything()) %>% 
  arrange(-mean_pts_mean)

The pX variables represent the probability that a player scores at least X points. For example, the p300 variable represents the probability that a player scores at least 300 points. However, it is important to note that these probabilities are based on the distribution of projected points, so they are only as trustworthy as the projections themselves.

Code
monteCarloSim_results
Code
monteCarloSim_results %>% 
  filter(position == "QB")
Code
monteCarloSim_results %>% 
  filter(position == "RB")
Code
monteCarloSim_results %>% 
  filter(position == "WR")
Code
monteCarloSim_results %>% 
  filter(position == "TE")
Code
monteCarloSim_results %>% 
  filter(position %in% c("K","PK"))
Code
monteCarloSim_results %>% 
  filter(position %in% c("DL","DT","DE"))
Code
monteCarloSim_results %>% 
  filter(position %in% c("LB","MLB","OLB"))
Code
monteCarloSim_results %>% 
  filter(position %in% c("DB","S","CB"))

24.4 Conclusion

A simulation is an “imitative representation” of a phenomenon that could exist in the real world. In statistics, simulations are computer-driven investigations to better understand a phenomenon by studying its behavior under different conditions. Statistical simulations can be conducted in various ways. Two common types of simulations are bootstrapping and Monte Carlo simulation. Bootstrapping involves repeated resampling (with replacement) from observed data. Monte Carlo simulation involves repeated random sampling from a known distribution. We demonstrated bootstrapping and Monte Carlo approaches to simulating the most likely range of outcomes for a player in terms of fantasy points.

24.5 Session Info

Code
sessionInfo()
R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] lubridate_1.9.4         forcats_1.0.1           stringr_1.5.2          
 [4] dplyr_1.1.4             purrr_1.1.0             readr_2.1.5            
 [7] tidyr_1.3.1             tibble_3.3.0            ggplot2_4.0.0          
[10] tidyverse_2.0.0         sn_2.1.1                fitdistrplus_1.2-4     
[13] survival_3.8-3          MASS_7.3-65             SimDesign_2.21         
[16] progressr_0.17.0        future.apply_1.20.0     future_1.67.0          
[19] data.table_1.17.8       ffanalytics_3.1.13.0000

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3  rstudioapi_0.17.1   audio_0.1-11       
  [4] jsonlite_2.0.0      magrittr_2.0.4      farver_2.1.2       
  [7] nloptr_2.2.1        rmarkdown_2.30      fs_1.6.6           
 [10] vctrs_0.6.5         minqa_1.2.8         base64enc_0.1-3    
 [13] htmltools_0.5.8.1   haven_2.5.5         cellranger_1.1.0   
 [16] Formula_1.2-5       parallelly_1.45.1   htmlwidgets_1.6.4  
 [19] plyr_1.8.9          testthat_3.2.3      httr2_1.2.1        
 [22] rootSolve_1.8.2.4   lifecycle_1.0.4     pkgconfig_2.0.3    
 [25] Matrix_1.7-3        R6_2.6.1            fastmap_1.2.0      
 [28] rbibutils_2.3       digest_0.6.37       Exact_3.3          
 [31] numDeriv_2016.8-1.1 colorspace_2.1-2    ps_1.9.1           
 [34] Hmisc_5.2-4         labeling_0.4.3      timechange_0.3.0   
 [37] httr_1.4.7          compiler_4.5.1      proxy_0.4-27       
 [40] withr_3.0.2         htmlTable_2.4.3     S7_0.2.0           
 [43] backports_1.5.0     DBI_1.2.3           psych_2.5.6        
 [46] R.utils_2.13.0      rappdirs_0.3.3      sessioninfo_1.2.3  
 [49] petersenlab_1.2.0   gld_2.6.8           tools_4.5.1        
 [52] chromote_0.5.1      pbivnorm_0.6.0      foreign_0.8-90     
 [55] otel_0.2.0          clipr_0.8.0         nnet_7.3-20        
 [58] R.oo_1.27.1         glue_1.8.0          quadprog_1.5-8     
 [61] nlme_3.1-168        promises_1.4.0      grid_4.5.1         
 [64] checkmate_2.3.3     reshape2_1.4.4      cluster_2.1.8.1    
 [67] generics_0.1.4      gtable_0.3.6        tzdb_0.5.0         
 [70] R.methodsS3_1.8.2   class_7.3-23        websocket_1.4.4    
 [73] lmom_3.2            hms_1.1.4           xml2_1.4.0         
 [76] pillar_1.11.1       later_1.4.4         mitools_2.4        
 [79] splines_4.5.1       lattice_0.22-7      tidyselect_1.2.1   
 [82] pbapply_1.7-4       mix_1.0-13          knitr_1.50         
 [85] reformulas_0.4.1    gridExtra_2.3       xfun_0.53          
 [88] expm_1.0-0          brio_1.1.5          stringi_1.8.7      
 [91] yaml_2.3.10         boot_1.3-31         evaluate_1.0.5     
 [94] codetools_0.2-20    beepr_2.0           cli_3.6.5          
 [97] rpart_4.1.24        xtable_1.8-4        DescTools_0.99.60  
[100] Rdpack_2.6.4        processx_3.8.6      lavaan_0.6-20      
[103] Rcpp_1.1.0          readxl_1.4.5        globals_0.18.0     
[106] parallel_4.5.1      lme4_1.1-37         listenv_0.9.1      
[109] viridisLite_0.4.2   mvtnorm_1.3-3       scales_1.4.0       
[112] e1071_1.7-16        rrapply_1.2.7       rlang_1.1.6        
[115] rvest_1.0.5         mnormt_2.1.1       
