21 Factor Analysis
21.1 Getting Started
21.1.1 Load Packages
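The examples in this chapter can be run with the psych and lavaan packages; this is an assumption about the workflow, so substitute whichever packages you prefer for exploratory and confirmatory factor analysis:

```r
# Assumed packages for this chapter's examples: 'psych' for exploratory
# factor analysis and 'lavaan' for confirmatory factor analysis
library("psych")
library("lavaan")
```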
21.2 Overview of Factor Analysis
Factor analysis involves the estimation of latent variables. Latent variables are ways of studying and operationalizing theoretical constructs that cannot be directly observed or quantified. Factor analysis is a class of latent variable models that is designed to identify the structure of a measure or set of measures, and ideally, a construct or set of constructs. It aims to identify the optimal latent structure for a group of variables. The goal of factor analysis is to identify simple, parsimonious factors that underlie the “junk” (i.e., scores filled with measurement error) that we observe.
Factor analysis encompasses two general types: confirmatory factor analysis and exploratory factor analysis. Exploratory factor analysis (EFA) is a latent variable modeling approach that is used when the researcher has no a priori hypotheses about how a set of variables is structured. EFA seeks to identify the empirically optimal-fitting model in ways that balance accuracy (i.e., variance accounted for) and parsimony (i.e., simplicity). Confirmatory factor analysis (CFA) is a latent variable modeling approach that is used when a researcher wants to evaluate how well a hypothesized model fits, and the model can be examined in comparison to alternative models. Using a CFA approach, the researcher can pit models representing two theoretical frameworks against each other to see which better accounts for the observed data.
Factor analysis involves observed (manifest) variables and unobserved (latent) factors. Factor analysis assumes that the latent factor influences the manifest variables, and the latent factor therefore reflects the common variance among the variables. A factor model potentially includes factor loadings, residuals (errors or disturbances), intercepts/means, covariances, and regression paths. When depicting a factor analysis model, rectangles represent variables we observe (i.e., manifest variables), and circles represent latent (i.e., unobserved) variables. A regression path indicates a hypothesis that one variable (or factor) influences another, and it is depicted using a single-headed arrow. The standardized regression coefficient (i.e., beta or \(\beta\)) represents the strength of association between the variables or factors. A factor loading is a regression path from a latent factor to an observed (manifest) variable. The standardized factor loading represents the strength of association between the variable and the latent factor. A residual is variance in a variable (or factor) that is unexplained by other variables or factors. A variable’s intercept is the expected value of the variable when the factor(s) (onto which it loads) is equal to zero. A covariance is the unstandardized index of the strength of association between variables (or factors), and it is depicted with a double-headed arrow. Because a covariance is unstandardized, its scale depends on the scale of the variables. A covariance path between two variables represents omitted shared cause(s) of the variables. For instance, if you depict a covariance path between two variables, it means that there is a shared cause of the two variables that is omitted from the model (for instance, if the common cause is not known or was not assessed).
In factor analysis, the relation between an indicator (\(\text{X}\)) and its underlying latent factor(s) (\(\text{F}\)) can be represented with a regression formula as in Equation 21.1:
\[ \text{X} = \lambda \cdot \text{F} + \text{Item Intercept} + \text{Error Term} \tag{21.1}\]
where:
- \(\text{X}\) is the observed value of the indicator
- \(\lambda\) is the factor loading, indicating the strength of the association between the indicator and the latent factor(s)
- \(\text{F}\) is the person’s value on the latent factor(s)
- \(\text{Item Intercept}\) represents the constant term that accounts for the expected value of the indicator when the latent factor(s) are zero
- \(\text{Error Term}\) is the residual, indicating the extent of variance in the indicator that is not explained by the latent factor(s)
When the latent factors are uncorrelated, the (standardized) error term for an indicator is calculated as 1 minus the sum of the squared standardized factor loadings for that item (including cross-loadings). A cross-loading occurs when a variable loads onto more than one latent factor.
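As a small illustration with hypothetical loading values (not from any real data), the standardized error variance could be computed as follows:

```r
# Hypothetical standardized loadings of one item on two uncorrelated factors:
# a primary loading of .70 and a cross-loading of .20
loadings <- c(.70, .20)

# Standardized error (unique) variance = 1 - sum of squared loadings
errorVariance <- 1 - sum(loadings^2)
errorVariance  # 1 - (.49 + .04) = .47
```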
Factor analysis is a powerful technique to help identify the factor structure that underlies a measure or construct. However, given the extensive method variance that influences scores on measures, factor analysis (and principal component analysis) tends to extract method factors. Method factors are factors that reflect the assessment method rather than the construct of interest. To better estimate construct factors, it is sometimes necessary to estimate both construct and method factors.
21.3 Exploratory Factor Analysis
Exploratory factor analysis (EFA) is used if you have no a priori hypotheses about the factor structure of the model, but you would like to understand the latent variables represented by your items.
EFA is partly induced from the data. You feed in the data and let the program build the factor model. You can set some parameters going in, including how to extract or rotate the factors. The factors are extracted from the data without specifying the number and pattern of loadings between the items and the latent factors (Bollen, 2002). All cross-loadings are freely estimated.
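As a rough sketch of what this looks like in R, the fa() function in the psych package extracts a specified number of factors from a data frame of indicators; the data frame name (mydata) and the number of factors below are hypothetical placeholders:

```r
# Exploratory factor analysis with the psych package; 'mydata' is a
# hypothetical data frame containing only the numeric indicator variables
library("psych")

efaModel <- fa(
  mydata,
  nfactors = 2,     # number of factors to extract
  fm = "ml",        # extraction method: maximum likelihood
  rotate = "none")  # unrotated solution; rotation is discussed below

efaModel$loadings  # factor loadings, including freely estimated cross-loadings
```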
21.3.1 Factor Rotation
When using EFA or principal component analysis, an important step is often to rotate the factors to make them simpler and more interpretable. To interpret the results of a factor analysis, we examine the factor matrix. The columns refer to the different factors; the rows refer to the different observed variables. The cells in the table are the factor loadings, which are essentially the correlations between the variables and the factors. Our goal is to achieve a model with simple structure because it is easily interpretable. Simple structure means that every variable loads perfectly on one and only one factor, as operationalized by a matrix of factor loadings with values of one and zero and nothing else. An example of a factor matrix that follows simple structure is depicted in Figure 21.1.
An example of a factor analysis model that follows simple structure is depicted in Figure 21.2. Each variable loads onto one and only one factor, which makes it easy to interpret the meaning of each factor, because a given factor represents the common variance among the items that load onto it.
However, pure simple structure only occurs in simulations, not in real-life data. In reality, our unrotated factor analysis model might look like the model in Figure 21.3. In this example, the factor analysis model does not show simple structure because the items have cross-loadings—that is, the items load onto more than one factor. The cross-loadings make it difficult to interpret the factors, because all of the items load onto all of the factors, so the factors are not very distinct from each other, which makes it difficult to interpret what the factors mean.
As a result of the challenges of interpretability caused by cross-loadings, factor rotations are often performed. An example of an unrotated factor matrix is in Figure 21.4.
The example unrotated factor matrix in Figure 21.4 is not very helpful—it tells us very little because it does not distinguish between the two factors; the variables have similar loadings on Factor 1 and Factor 2. An example of an unrotated factor solution is in Figure 21.5. In the figure, all of the variables are in the midst of the quadrants—they are not on the factors’ axes. Thus, the factors are not very informative.
As a result, to improve the interpretability of the factor analysis, we can perform what is called rotation. Rotation leverages the idea that there are infinite solutions to the factor analysis model that fit equally well. Rotation involves changing the orientation of the factors by changing the axes so that variables end up with very high (close to one or negative one) or very low (close to zero) loadings, so that it is clear which factors include which variables. That is, rotation rescales the factors and tries to identify the ideal solution (factor) for each variable. It searches for simple structure and keeps searching until it finds a minimum. After rotation, if the rotation was successful in imposing simple structure, each factor will have loadings close to one (or negative one) for some variables and close to zero for other variables. The goal of factor rotation is to achieve simple structure, which makes it easier to interpret the meaning of the factors.
To perform factor rotation, orthogonal rotations are often used. Orthogonal rotations make the rotated factors uncorrelated. An example of a commonly used orthogonal rotation is varimax rotation. Varimax rotation maximizes the sum of the variance of the squared loadings (i.e., so that items have either a very high or very low loading on a factor) and yields axes with a 90-degree angle.
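For example, a varimax rotation could be requested via the rotate argument of psych::fa() (a sketch, again assuming a hypothetical data frame named mydata):

```r
# Orthogonal (varimax) rotation with the psych package; 'mydata' is hypothetical
library("psych")

efaVarimax <- fa(mydata, nfactors = 2, fm = "ml", rotate = "varimax")
print(efaVarimax$loadings, cutoff = .30)  # hide small loadings for readability
```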
An example of a factor matrix following an orthogonal rotation is depicted in Figure 21.6. An example of a factor solution following an orthogonal rotation is depicted in Figure 21.7.
An example of a factor matrix from SPSS following an orthogonal rotation is depicted in Figure 21.8.
An example of a factor structure from an orthogonal rotation is in Figure 21.9.
Sometimes, however, the two factors and their constituent variables may be correlated. Examples of two correlated factors are depression and anxiety. When the two factors are correlated in reality, forcing them to be uncorrelated would result in an inaccurate model. Oblique rotation allows factors to be correlated and yields axes with an angle of less than 90 degrees. However, if the factors have a low correlation (e.g., .2 or less), you can likely continue with orthogonal rotation. Nevertheless, just because an oblique rotation allows for correlated factors does not mean that the factors will be correlated, so oblique rotation provides greater flexibility than orthogonal rotation. An example of a factor structure from an oblique rotation is in Figure 21.10. Results from an oblique rotation are more complicated than those from an orthogonal rotation—they provide lots of output and are more complicated to interpret. In addition, oblique rotation might not yield a stable solution if you have a relatively small sample size.
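A corresponding oblique (oblimin) rotation might look like the sketch below; the Phi element of the output contains the estimated factor correlations:

```r
# Oblique (oblimin) rotation with the psych package; oblique rotations rely on
# the GPArotation package under the hood
library("psych")

efaOblimin <- fa(mydata, nfactors = 2, fm = "ml", rotate = "oblimin")
efaOblimin$loadings  # pattern matrix (each loading controls for the other factor)
efaOblimin$Phi       # correlations among the rotated factors
```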
As an example of rotation based on interpretability, consider the Five-Factor Model of Personality (the Big Five), which goes by the acronym OCEAN: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Although the five factors of personality are somewhat correlated, we can use rotation to ensure they are maximally independent. Upon rotation, extraversion and neuroticism are essentially uncorrelated, as depicted in Figure 21.11. The other pole of extraversion is introversion, and the other pole of neuroticism might be emotional stability or calmness.
Simple structure is achieved when each variable loads highly onto as few factors as possible (i.e., each item has only one significant or primary loading). Oftentimes this is not the case, so we choose our rotation method to decide whether the factors can be correlated (an oblique rotation) or whether the factors will be uncorrelated (an orthogonal rotation). If the factors are not correlated with each other, use an orthogonal rotation. The correlation between an item and a factor is its factor loading, which indexes how strongly the variable is associated with the underlying factor. However, its interpretation is more complicated if there are correlated factors!
An orthogonal rotation (e.g., varimax) can help with simplicity of interpretation because it seeks to yield simple structure without cross-loadings. Cross-loadings are instances where a variable loads onto multiple factors. My recommendation would always be to use an orthogonal rotation if you have reason to believe that finding simple structure in your data is possible; otherwise, the factors are extremely difficult to interpret—what exactly does a cross-loading even mean? However, you should always try an oblique rotation, too, to see how strongly the factors are correlated. Examples of oblique rotations include oblimin and promax.
21.4 Confirmatory Factor Analysis
Confirmatory factor analysis (CFA) is used to (dis)confirm a priori hypotheses about the factor structure of the model. CFA is a test of the hypothesis. In CFA, you specify the model and ask how well this model represents the data. The researcher specifies the number, meaning, associations, and pattern of free parameters in the factor loading matrix (Bollen, 2002). A key advantage of CFA is the ability to directly compare alternative models (i.e., factor structures), which is valuable for theory testing (Strauss & Smith, 2009). For instance, you could use CFA to test whether the variance in several measures’ scores is best explained with one factor or two factors. In CFA, cross-loadings are not estimated unless the researcher specifies them.
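As a minimal sketch of such a comparison in lavaan, the indicator names (x1 to x6) and the data frame (mydata) below are hypothetical placeholders:

```r
# Confirmatory factor analysis with lavaan, comparing a one-factor model to a
# two-factor model for six hypothetical indicators (x1 to x6) in 'mydata'
library("lavaan")

oneFactorModel <- '
  generalFactor =~ x1 + x2 + x3 + x4 + x5 + x6
'

twoFactorModel <- '
  factor1 =~ x1 + x2 + x3
  factor2 =~ x4 + x5 + x6
'

oneFactorFit <- cfa(oneFactorModel, data = mydata)
twoFactorFit <- cfa(twoFactorModel, data = mydata)

summary(twoFactorFit, fit.measures = TRUE, standardized = TRUE)
anova(oneFactorFit, twoFactorFit)  # chi-square difference test for nested models
```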
21.5 Determining the Number of Factors to Retain
A goal of factor analysis and principal component analysis is simplification or parsimony, while still explaining as much variance as possible. The hope is that you can have fewer factors that explain the associations between the variables than the number of observed variables. But how do you decide on the number of factors?
There are a number of criteria that one can use to help determine how many factors/components to keep:
- Kaiser-Guttman criterion: factors with eigenvalues greater than zero
- or, for principal component analysis, components with eigenvalues greater than 1
- Cattell’s scree test: the number of factors at the “elbow” in a scree plot minus one; sometimes operationalized with optimal coordinates (OC) or the acceleration factor (AF)
- Parallel analysis: factors that explain more variance than randomly simulated data
- Very simple structure (VSS) criterion: larger is better
- Velicer’s minimum average partial (MAP) test: smaller is better
- Akaike information criterion (AIC): smaller is better
- Bayesian information criterion (BIC): smaller is better
- Sample size-adjusted BIC (SABIC): smaller is better
- Root mean square error of approximation (RMSEA): smaller is better
- Chi-square difference test: smaller is better; a significant test indicates that the more complex model is significantly better fitting than the less complex model
- Standardized root mean square residual (SRMR): smaller is better
- Comparative Fit Index (CFI): larger is better
- Tucker Lewis Index (TLI): larger is better
There is not necessarily a “correct” criterion to use in determining how many factors to keep, so it is generally recommended that researchers use multiple criteria in combination with theory and interpretability.
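Several of these indices can be extracted from a fitted lavaan model with fitMeasures(); the sketch below assumes the hypothetical twoFactorFit model from the earlier CFA sketch:

```r
# Extract several fit indices from a fitted lavaan model
library("lavaan")

fitMeasures(
  twoFactorFit,
  c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "srmr", "aic", "bic"))
```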
A scree plot provides lots of information. A scree plot has the factor number on the x-axis and the eigenvalue on the y-axis. The eigenvalue is the variance accounted for by a factor; when using a varimax (orthogonal) rotation, an eigenvalue (or factor variance) is calculated as the sum of squared standardized factor (or component) loadings on that factor. An example of a scree plot is in Figure 21.12.
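A scree plot can be generated with the psych package (a sketch, again assuming the hypothetical mydata):

```r
# Scree plot of eigenvalues for both factors and principal components
library("psych")

scree(mydata)
```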
The total variance is equal to the number of variables you have, so one eigenvalue is approximately one variable’s worth of variance. The first factor accounts for the most variance, the second factor accounts for the second-most variance, and so on. The more factors you add, the less variance is explained by the additional factor.
One criterion for how many factors to keep is the Kaiser-Guttman criterion. According to the Kaiser-Guttman criterion, you should keep any factors whose eigenvalue is greater than 1. That is, for the sake of simplicity, parsimony, and data reduction, you should take any factors that explain more than a single variable would explain. According to the Kaiser-Guttman criterion, we would keep the three factors from Figure 21.12 that have eigenvalues greater than 1. The default in SPSS is to retain factors with eigenvalues greater than 1. However, keeping factors whose eigenvalue is greater than 1 is not an optimal rule. If you let SPSS do this, you may get many factors with eigenvalues around 1 (e.g., a factor with an eigenvalue of ~1.0001) that do not add enough to be worth the added complexity. The Kaiser-Guttman criterion usually results in keeping too many factors. Factors with small eigenvalues around 1 could reflect error shared across variables. For instance, factors with small eigenvalues could reflect method variance (i.e., a method factor), such as a self-report factor that turns up as a factor in the factor analysis but that may be useless to you as a conceptual factor of the construct of interest.
Another criterion is Cattell’s scree test, which involves selecting the number of factors by looking at the scree plot. “Scree” refers to the rubble of stones at the bottom of a mountain. According to Cattell’s scree test, you should keep the factors before the last steep drop in eigenvalues—i.e., the factors before the rubble, where the slope approaches zero. The beginning of the scree (or rubble), where the slope approaches zero, is called the “elbow” of the scree plot. Using Cattell’s scree test, you retain the factors that explain the most variance prior to the drop-off in explained variance, because, ultimately, you want to include additional factors only if they add substantially to the variance explained. That is, you would keep the number of factors at the elbow of the scree plot minus one. For example, if the last steep drop occurs from Factor 4 to Factor 5 and the elbow is at Factor 5, you would keep four factors. In Figure 21.12, the last steep drop in eigenvalues occurs from Factor 3 to Factor 4; the elbow of the scree plot occurs at Factor 4. We would keep the number of factors at the elbow minus one. Thus, using Cattell’s scree test, we would keep three factors based on Figure 21.12.
There are more sophisticated ways of using a scree plot, but they usually end up at a similar decision. Examples of more sophisticated tests include parallel analysis and very simple structure (VSS) plots. In a parallel analysis, you examine where the eigenvalues from observed data and random data converge, so you do not retain a factor that explains less variance than would be expected by random chance.
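Both approaches are implemented in the psych package; the sketch below assumes the hypothetical mydata:

```r
# Parallel analysis: compare eigenvalues from the observed data to eigenvalues
# from random data
library("psych")

fa.parallel(mydata, fm = "ml", fa = "both")

# Very simple structure (VSS) criterion and Velicer's MAP test
VSS(mydata, n = 5, fm = "ml", rotate = "varimax")
```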
In general, my recommendation is to use Cattell’s scree test and then test the factor solutions with plus or minus one factor. You should never accept factors with eigenvalues less than zero (or components from principal component analysis with eigenvalues less than one), because they are likely to be largely composed of error. If you are using maximum likelihood factor analysis, you can compare the fit of various models with model fit criteria to see which model fits best for its parsimony. A model will always fit better when you add additional parameters or factors, so you examine whether there is significant improvement in model fit when adding the additional factor—that is, we keep adding complexity until additional complexity does not buy us much. Always try a factor solution that is one less and one more than suggested by Cattell’s scree test to bracket your final solution, because the purpose of factor analysis is to explain things and to have interpretability. Even if all rules or indicators suggest keeping X factors, maybe \(\pm\) one factor helps clarify things. Even though factor analysis is empirical, theory and interpretability should also inform decisions.
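As a sketch of such a comparison with maximum likelihood factor analysis in the psych package (hypothetical mydata again), information criteria can be compared across candidate solutions:

```r
# Compare maximum likelihood factor solutions that differ by one factor
library("psych")

fa2 <- fa(mydata, nfactors = 2, fm = "ml")
fa3 <- fa(mydata, nfactors = 3, fm = "ml")
fa4 <- fa(mydata, nfactors = 4, fm = "ml")

# Smaller BIC indicates better fit relative to the model's complexity
c(fa2 = fa2$BIC, fa3 = fa3$BIC, fa4 = fa4$BIC)
```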
21.6 Interpreting and Using Latent Factors
The next step is interpreting the model and latent factors. One data matrix can lead to many different (correct) models—you must choose one based on the factor structure and theory. Use theory to interpret the model and label the factors.