I want your feedback to make the book better for you and other readers. If you find typos, errors, or places where the text may be improved, please let me know. The best ways to provide feedback are by GitHub or hypothes.is annotations.
- Opening an issue or submitting a pull request on GitHub: https://github.com/isaactpetersen/Fantasy-Football-Analytics-Textbook
- Adding an annotation using hypothes.is. To add an annotation, select some text and then click the symbol on the pop-up menu. To see the annotations of others, click the symbol in the upper right-hand corner of the page.
1 Introduction
1.1 About this Book
How can we use information to make predictions about uncertain events? This book is about empiricism (basing theories on observed data) and judgment, prediction, and decision making in the context of uncertainty. The book provides an introduction to modern analytical techniques used to make informed predictions, test theories, and draw conclusions from a given dataset. The book uses the software R to provide applied data analysis examples.
This book was originally written for an undergraduate-level course entitled “Fantasy Football: Predictive Analytics and Empiricism”. The chapters provide an overview of topics that could each have their own class and textbook, such as causal inference, factor analysis, cluster analysis, principal component analysis, machine learning, cognitive biases, modern portfolio theory, data visualization, and simulation. The book gives readers an overview of the breadth of approaches to prediction and empiricism. As a consequence, the book does not cover any one technique or approach in great depth.
1.2 What is Fantasy Football?
Fantasy football is an online game where participants assemble (i.e., “draft”) imaginary teams composed of real-life National Football League (NFL) players. In this game, participants compete against their opponents (e.g., friends/coworkers/classmates), accumulating points based on players’ actual statistical performances in games. The goal is to outscore one’s opponent each week to win matches and ultimately claim victory in the league.
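As a rough illustration of how points accumulate from a player's statistics, here is a minimal sketch in R. The scoring rules and the stat line below are assumptions made up for this example; actual point values differ across leagues and platforms.

```r
# Hypothetical scoring rules (points per unit of each statistic).
# These values are assumptions; actual rules vary by league.
scoring_rules <- c(
  rushing_yards = 0.1,
  receiving_yards = 0.1,
  touchdowns = 6
)

# Made-up stat line for one player in one game (illustration only).
player_stats <- c(
  rushing_yards = 80,
  receiving_yards = 25,
  touchdowns = 1
)

# Fantasy points are the sum of each statistic times its point value.
# Indexing by name keeps the two vectors aligned.
fantasy_points <- sum(player_stats * scoring_rules[names(player_stats)])
fantasy_points
```

Under these assumed rules, the player earns 8 points for rushing yards, 2.5 for receiving yards, and 6 for the touchdown, for 16.5 points in total.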
1.3 Why Focus on Fantasy Football?
I was fortunate to have an excellent instructor who taught me the value of learning statistics to answer interesting and important questions. That is, I do not find statistics intrinsically interesting; rather, I find it interesting because of what it allows me to do. Many students find statistics intimidating, in part because of how it is typically taught: with examples like dice rolls and coin flips that are (seemingly irrelevant and) boring to students.
My contention is that applied examples are a more effective lens for teaching many concepts in psychology and data analysis. It can be more engaging and relatable to learn statistics in the applied context of sports, a domain that is more intuitive to many. Many people play fantasy sports. This book involves applying statistics to a particular domain (football). People actually want to learn statistical principles and methods when they can apply them to interesting questions (e.g., sports). In my opinion [and supported by evidence; Motz (2013)], this is a much more effective way of engaging people and teaching statistics than abstract coin flips and dice rolls.
Fantasy football relies heavily on prediction: trying to predict which players will perform best and selecting them accordingly. In this way, fantasy football provides a plethora of decision-making opportunities in the face of uncertainty and a wealth of data for analyzing these decisions. However, unlike many other applied domains in psychology, fantasy football (1) allows a person to see the accuracy of their predictions on a timely basis and (2) provides a safe environment for friendly competition. Thus, it provides a unique domain in which to evaluate, and improve, the accuracy of various prediction models.
1.4 Why R?
The book provides data analysis examples using the statistical analysis software R. Why R?
- R is free! Anyone can use it.
- R is open source; it is not a black box. You can see what is going on “under the hood” and can examine the code for any function or computation you perform. You can even modify and improve these functions by changing the code, and you can create your own functions.
- R is open platform: you can use it on multiple platforms, including Windows, macOS, and Linux.
- R has advanced statistics capabilities. It was designed for statistical analysis and has strong capabilities for data wrangling.
- R has capabilities for state-of-the-art graphics. It has advanced capabilities for creating statistical graphics.
- R is widely used: there is a large community of people who use R for data analysis that you can draw upon for help.
- R analyses are based on code (rather than a graphical user interface), which allows reproducibility: with the same data, code, and setup (platform, R version, package versions, etc.), you should get the same answer every time. There are strong resources available for ensuring your analyses in R are reproducible by others (Gandrud, 2020).
- Anyone (including you) can contribute R packages to the community to improve its functionality. Statistical experts from all over the world have contributed open source packages to R for specialized tasks. If there is no R package that does what you need, you can write a function to perform the task and contribute it as a package to the community for others to use and improve. The number of R packages contributed to the community is growing at a rapid rate. As of this writing, over 20,000 packages have been contributed to the Comprehensive R Archive Network (CRAN), and many more are stored on publicly available version control repositories like GitHub and GitLab. Chances are, if there is an analysis you need to do, an R package exists to do it.
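Several of the points above can be tried directly in an R session. The following is a minimal sketch, not a prescribed workflow: `coef_var` is a hypothetical helper written for this illustration (it is not from any package), the point totals are made up, and the seed value is arbitrary.

```r
# R is open source: typing a function's name without parentheses
# (or printing it) shows its code.
print(sd)

# You can write your own functions. `coef_var` is a hypothetical helper
# that computes the coefficient of variation (SD divided by the mean).
coef_var <- function(x) {
  sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)
}

# Made-up weekly point totals, for illustration only.
weekly_points <- c(12.4, 18.1, 9.7, 22.3, 15.0)
coef_var(weekly_points)

# Fixing the random seed makes random draws repeatable across runs.
set.seed(52242)
draw_1 <- sample(weekly_points, size = 3)
set.seed(52242)
draw_2 <- sample(weekly_points, size = 3)
identical(draw_1, draw_2)

# Record the platform, R version, and attached package versions
# so that others can reproduce the setup.
sessionInfo()
```

Reporting the output of `sessionInfo()` alongside an analysis is one simple way to document the setup that the reproducibility point refers to.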
1.5 Educational Value
Skills in statistics, statistical programming, and data analysis are highly valuable. This book includes practical and conceptual tools that build a foundation for critical thinking. The book aims to help readers evaluate theory in the light of evidence (and vice versa) and to refine decision making in the context of uncertainty. Readers will learn about the ways that psychological science (and related disciplines) poses questions, formulates hypotheses, designs studies to test those questions, and interprets the findings, all with the aim of answering questions, improving decision making, and solving problems.
Of course, this is not a traditional psychology textbook. However, the book incorporates important psychological concepts, such as cognitive biases in judgment and prediction. In the modern world of big data, research and society need people who know how to make sense of the information around us. Psychology is in a prime position to teach applied statistics to a wide variety of students, most of whom will not have careers as psychologists. Psychology can teach the importance of statistics given humans’ cognitive biases. It can also teach how these biases can influence the way people interpret statistics. This book will teach readers the applications of statistics (prediction) and research methods (empiricism) to answer questions they find interesting, while applying scientific and psychological rigor.
1.6 Learning Objectives
This book aims to help readers accomplish the following learning objectives:
- Apply empirical inference and appreciate the value it provides over speculative supposition.
- Ask educated questions when confronted with decisions in the face of uncertainty.
- Understand human decision making, including common heuristics and cognitive biases and how to mitigate them analytically.
- Engage in critical thinking about causality, including devising plausible alternative explanations for observed effects.
- Understand causal inference including confounding, causal pathways, and counterfactuals.
- Think empirically about human behavior and performance.
- Describe the strengths and weaknesses of humans versus computers in prediction scenarios.
- Apply basic skills in statistical programming using R to manipulate and summarize datasets and to conduct data analysis.
- Critically evaluate the strengths and limitations of different statistical models and methodologies used in predicting uncertain events, enhancing their understanding of statistical inference and model selection.
- Use various analytical techniques for predicting the outcome of uncertain events, and for uncovering latent causes of patterns in observed data.
- Interpret findings from various statistical approaches and evaluate the accuracy of predictions.
- Engage in iterative problem-solving processes, refining analytical approaches based on feedback and outcomes, and adapting strategies accordingly.
- Communicate statistical findings and analyses in both written and oral formats, demonstrating proficiency in presenting complex information to diverse audiences.
- Make sense of big data.
- Use practical analytical skills that can be applied in future research and job settings.
1.7 Disclosures
I am the Owner of Fantasy Football Analytics, LLC, which operates https://fantasyfootballanalytics.net.
1.8 Disclaimer
“This material probably won’t win you fantasy football championships. You could take what we learn and apply it to fantasy football and you might become 5 percent more likely to win. Or… Consider the broader relevance of this. You could learn data analysis and figure out ways to apply it to other systems. And you could be making a six-figure salary within the next five years.” – Benjamin Motz, Ph.D.
Here is a video of Professor Benjamin Motz describing the value of teaching statistics through the lens of fantasy football: