Chapter 11 General Issues in Clinical Assessment
11.1 Historical Perspectives on Clinical Assessment
Historically, assessment has played a key role in clinical psychology. A history of clinical psychology is provided by Benjamin (2005). Many would consider assessment—especially the assessment of intelligence—the beginning of clinical psychology, and the role that distinguished clinical psychology from other fields. Intelligence testing has been one of the real successes of clinical psychology and assessment. For example, several early measures were developed to assess intelligence, including James Cattell’s measure and the Binet-Simon Intelligence scale. Lewis Terman at Stanford extended the Binet-Simon Intelligence scale to study gifted people, which became known as the Stanford-Binet Intelligence Scales.
During World War I, instruments were used to select people for various military occupations (e.g., pilot versus submariner). Intellectual assessment, including the Army Alpha and Beta Tests, was used to decide who was psychologically unfit to serve. Then came the development of personality assessments, such as assessment of “shell shock”, which would eventually become known as post-traumatic stress disorder (PTSD). The military conducted personality assessments to identify people who would be susceptible to shell shock, which would be labeled today as a measure of “neuroticism”.
Intellectual assessment and personality assessment became the two principal tools for clinical psychologists. Initially, psychologists were primarily psychometricians. They developed assessment devices and helped score and interpret them. Early on, clinical psychologists did not play a large role in making diagnoses or in treatment; instead, psychiatrists did. However, that changed after World War II. Between World War I and World War II, assessment was the dominant role of practicing clinical psychologists, especially projective personality testing (such as word-association tests, the Rorschach Inkblot Test, and the Thematic Apperception Test) and more objective personality testing. The Rorschach (1921) gave clinical psychologists a “voice at the table” with psychiatrists, even though doubts arose about its accuracy. Projective testing dominated personality assessment through the 1940s. The Minnesota Multiphasic Personality Inventory (MMPI), which became a popular objective personality test, was published in 1943.
During World War II, many of the modern psychology departments, clinical psychology programs, and VA-funded clinical psychology internships were created, not by the psychology community, but by the federal government to address wartime and post-wartime needs. Likewise, many measures were developed to meet wartime needs. The psychometric movement in psychology was spurred in part by historical pressures to create tests to classify people during wartime, such as the Army Alpha and Beta Tests.
11.2 Contemporary Trends
There has been a general decline in the frequency with which clinical assessment (and psychotherapy) is conducted by clinical psychologists. Assessment and measurement are crucial to research but are declining in treatment. This is in great part due to managed care, in which a primary goal is cost containment of mental health services. Managed care has resulted in greatly reduced patient access to mental health services, a dramatic reduction in insurance funds for reimbursement of services, more clinician time spent on paperwork and less time seeing clients, less time for assessment because fewer sessions are reimbursed, a tendency to administer less frequent or less comprehensive assessments, and increased provision of psychological services by master’s degree-level providers because they are less expensive than doctoral providers.
Nevertheless, there are contexts in which clinical assessment is more common. There is still considerable assessment in the Department of Veterans Affairs (VA) system, including testing and integrated reports. In addition to the VA, there is still considerable neuropsychological and aptitude testing. What is the future of assessment for treatment? My hope is that it will be evidence-based assessment (along with evidence-based treatment).
11.3 Terminology
Several terms are relevant for clinical assessment. Prevalence is the proportion of the population that has the condition at a given point in time, whereas incidence is the rate of occurrence of new cases. Incidence indicates the risk of developing or contracting the condition, whereas prevalence indicates how widespread the condition is. Point prevalence is the proportion of the population that has the condition at a single point in time. Lifetime prevalence is the proportion of the population that will ever have a condition at any point in their lifetime.
The positivity rate is the proportion of tests that are positive (the marginal probability or base rate of positive tests; i.e., the selection ratio). The case fatality rate is the proportion of people with a condition who die as a result of that condition.
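As a minimal sketch, the quantities defined above can be computed as simple proportions. All counts below are hypothetical and chosen only for illustration. The one-year incidence shown is a cumulative incidence among people who were initially free of the condition, which is an assumption about how to operationalize the rate of new cases.

```python
def proportion(count, total):
    """Return count / total, guarding against an empty denominator."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Hypothetical counts (not from any real data set)
population = 10_000         # people in the population today
current_cases = 500         # people who have the condition today
at_risk = 9_500             # people free of the condition at the start of a year
new_cases = 95              # of those at risk, people who developed it that year
tests_given = 2_000         # tests administered
positive_tests = 300        # tests that came back positive
deaths_from_condition = 10  # deaths attributed to the condition among the cases

point_prevalence = proportion(current_cases, population)
one_year_incidence = proportion(new_cases, at_risk)
positivity_rate = proportion(positive_tests, tests_given)
case_fatality_rate = proportion(deaths_from_condition, current_cases)

print(f"Point prevalence:   {point_prevalence:.2f}")    # 0.05
print(f"One-year incidence: {one_year_incidence:.2f}")  # 0.01
print(f"Positivity rate:    {positivity_rate:.2f}")     # 0.15
print(f"Case fatality rate: {case_fatality_rate:.2f}")  # 0.02
```

Note that each quantity has a different denominator (the whole population, the people at risk, the tests given, and the people with the condition), which is why the four values should not be compared directly.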
11.3.1 The Three S’s: Signs, Symptoms, and Syndromes
The Three S’s of assessment in clinical psychology are signs, symptoms, and syndromes. Signs are observable features (or manifestations) of a disorder (Lilienfeld et al., 2015). Therefore, signs can be perceived by a clinician. By contrast, symptoms are unobservable manifestations of a disorder that can only be perceived by the client (Lilienfeld et al., 2015). The term “sign” has a connotation of being somewhat more objective than symptom because a symptom is based on the client’s perceptions, but at least in clinical psychology (and also in many medical fields), both signs and symptoms have considerable elements of subjectivity. A syndrome is a collection of signs and symptoms that co-occur and may reflect a particular disorder.
11.4 Errors of Pseudo-Prediction
There are many errors of pseudo-prediction that are relevant to clinical assessment. One error of pseudo-prediction is the confusion of inverse probabilities, as described in Section 9.1.1.2. Another error of pseudo-prediction is capitalizing only on chance occurrences when predicting a low base-rate phenomenon.
Another form of pseudo-prediction is constructing predictions that seem compelling but that have no basis, for instance using the date and place of birth to predict one’s personality. Several techniques are used by people to make their predictions seem more compelling: make it true, make it positive, make it ambiguous, and make it mysterious or intimidating. Making it “true” means surrounding predictions with true statements and using Barnum statements. Barnum statements are statements that are true for most people, such as: “You have a tendency to be critical of yourself.” Including, in your prediction, outcomes that are a common occurrence for 80% of respondents (blanket statements) may make it more likely that people believe the prediction. Most of the statements are correct, so people tend to think the whole thing is correct. Making it “positive” means making the statement positive about the person. People tend to believe positive things about the self. However, if the statement is about a third person, people tend to believe it more if the statement is partially negative. Making it “ambiguous” means using vague language such as “it has not been unusual,” “sometimes,” “at times,” “you tend to be,” etc. Making it mysterious or intimidating could involve attributing the prediction to some organization or process with authority. An example of making a statement mysterious is the PaTE report (“Psychodynamic and Therapeutic Evaluation”) (Kriegman & Kriegman, 1965).
Unsupported fad treatments are another form of pseudo-prediction. There is unfortunately a large gap between science and practice. Many of the treatments known to work are not commonly implemented in practice; instead, many treatments implemented in practice have been shown not to be helpful (compared to placebo), or have been shown to be harmful (Lilienfeld, 2007), or have insufficient evidence supporting their use. Selecting a given treatment is a prediction by the therapist that the treatment will lead to more positive outcomes for the client than the alternatives. There are likely a multitude of reasons for the disparity between practice and research. For instance, some practitioners feel that research is useless. Other practitioners are unaware that there are more effective treatments. Others may not have the time or money to learn new treatments on top of their already-busy schedules. Yet other therapists are greedy. In other cases, it is difficult to determine which treatments work due to a variety of effects.
For instance, some ineffective treatments may appear to “work” because of placebo effects, whereby merely receiving a treatment and expecting to improve yields positive benefits—not because of the specific treatment per se. In addition to client expectancy effects, there can also be therapist expectancy effects. There can also be demand effects, where participants form an interpretation of the study’s purpose and subconsciously change their behavior to fit that interpretation. In addition, ineffective treatments can appear to work due to maturational effects, especially with children and long treatments. Ineffective treatments can appear to work due to differential dropout or attrition, where the people who are improving are the most likely to stay in treatment.
In addition, there are regression effects that can make an ineffective treatment appear to work. As a reminder, the true score formula in classical test theory is: \(X = T + e\). There are random fluctuations in an observed score; part of the observed score is stable true score and part of it is random error. As an example, suppose we evaluate a treatment such as Norwegian acupuncture. We will select clients who are at the extremes in terms of high pain. Their scores will ultimately converge or regress to their mean (i.e., their true score). So, on average, their pain will be lower the next time. In therapy, clients tend to come in at their worst, so they are bound to improve. The clients would have tended (on average) to get better, even without treatment. This is known as regression to the mean. For these reasons, it is important to include a control group to estimate how clients would have done had they received no treatment, a placebo, or some other treatment. For example, in experiments that test the efficacy of cognitive behavioral therapy (CBT), it is important to include a control group, such as no treatment control, waitlist control, treatment as usual, or best available treatment.
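Regression to the mean can be illustrated with a small simulation of \(X = T + e\). The sketch below is only an illustration under stated assumptions: the pain scale, the means and standard deviations, the selection of the top 10% of scorers, and the function name `simulate_regression_to_mean` are all hypothetical, and no treatment of any kind is applied between the two measurements.

```python
import random


def simulate_regression_to_mean(n_clients=10_000, top_fraction=0.10, seed=1):
    """Select the clients with the highest time-1 pain; return their mean pain at both times."""
    rng = random.Random(seed)
    # Stable true scores (T) on a hypothetical pain scale
    true_scores = [rng.gauss(50, 10) for _ in range(n_clients)]
    # Observed scores (X = T + e) with independent random error at each occasion
    time1 = [t + rng.gauss(0, 10) for t in true_scores]
    time2 = [t + rng.gauss(0, 10) for t in true_scores]
    # Keep only the clients whose time-1 pain is in the top fraction
    cutoff = sorted(time1)[int(n_clients * (1 - top_fraction))]
    selected = [i for i in range(n_clients) if time1[i] >= cutoff]
    mean_time1 = sum(time1[i] for i in selected) / len(selected)
    mean_time2 = sum(time2[i] for i in selected) / len(selected)
    return mean_time1, mean_time2


mean_time1, mean_time2 = simulate_regression_to_mean()
print(f"Selected clients, time 1: {mean_time1:.1f}")
print(f"Selected clients, time 2: {mean_time2:.1f}")
```

Because the selected clients were chosen partly on the basis of unusually large positive errors at time 1, their average pain at time 2 is lower even though nothing was done to them. A control group selected in the same way would show the same drop, which is what makes the comparison informative.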
Violating the extension rule (the conjunction fallacy) is another form of pseudo-prediction. The extension rule states that a special outcome cannot be more probable than a general one of which it is part. But in spite of this, we tend to assign high probabilities to prototypical combinations of characteristics, events, or sequences, and irrationally low probabilities to atypical single characteristics, events, or sequences. Here’s an example from Dawes (1986):
“Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.” Which is more probable?
A. Linda is a bank teller.
B. Linda is a bank teller and is active in the feminist movement.
The probability that a woman is both a bank teller and a feminist cannot exceed the probability that she is a bank teller. However, people tend to say that the second option, B (that Linda is a bank teller and is active in the feminist movement), is more probable. People tend to make judgments based on the representativeness heuristic: how similar the case is to a prototype.
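The extension rule follows directly from the definition of a joint probability. As a brief sketch, let \(T\) denote the event that Linda is a bank teller and \(F\) the event that she is active in the feminist movement (these symbols are introduced here only for illustration):

\[
P(T \cap F) = P(T)\,P(F \mid T) \le P(T), \quad \text{because } 0 \le P(F \mid T) \le 1.
\]

For instance, with purely hypothetical values \(P(T) = 0.05\) and \(P(F \mid T) = 0.30\), the joint probability is \(P(T \cap F) = 0.05 \times 0.30 = 0.015\), which is smaller than \(P(T)\). No choice of \(P(F \mid T)\) can make the conjunction more probable than \(T\) alone.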
Another form of pseudo-prediction is imposing structure (non-randomness) on a pattern where there is none. In addition, expecting alternation, known as the “belief in the law of small numbers”, is a form of pseudo-prediction. Expecting alternation is believing that even small sequences should be representative of populations, even though that is false. For instance, it might involve expecting that one will get exactly three heads when flipping a coin six times.
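The coin-flip example can be checked with the binomial formula. The following is a minimal sketch assuming a fair coin and independent flips; the helper name `prob_exact_heads` is hypothetical.

```python
from math import comb


def prob_exact_heads(flips, heads, p_heads=0.5):
    """Binomial probability of exactly `heads` heads in `flips` independent flips."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads) ** (flips - heads)


p_three_of_six = prob_exact_heads(6, 3)
print(f"P(exactly 3 heads in 6 flips) = {p_three_of_six:.4f}")  # 0.3125
```

Under these assumptions, the perfectly "representative" outcome of three heads and three tails occurs in fewer than one third of six-flip sequences, so a small sample will usually not mirror the population proportion exactly.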
Spurious correlations are another form of pseudo-prediction. When the predictors are unrelated to the outcome, the expected value of \(R^2\) is \(\frac{K}{n - 1}\), where \(K\) is the number of predictors and \(n\) is the sample size. The more predictors in the model, the more variance is accounted for in the outcome, even by chance alone. Thus, with many predictors and a small sample, there is pseudo-prediction, as described in Section 9.9. However, as the sample size increases, spurious correlations decrease.
Multicollinearity is also a form of pseudo-prediction. As the correlation among the predictors increases, the chance of getting an arbitrary answer increases.
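The \(\frac{K}{n - 1}\) result can be checked by simulation. The sketch below is an illustration under stated assumptions, not a definitive implementation: the outcome and all predictors are independent random noise, \(R^2\) is obtained by orthogonalizing the centered predictors (a Gram-Schmidt procedure chosen here to avoid external libraries), and the function names, sample sizes, and number of replications are hypothetical. It requires \(K < n - 1\).

```python
import random


def null_r_squared(n, k, rng):
    """R^2 from regressing a pure-noise outcome on k pure-noise predictors (n cases)."""
    def centered_noise():
        values = [rng.gauss(0, 1) for _ in range(n)]
        mean = sum(values) / n
        return [v - mean for v in values]

    y = centered_noise()
    basis = []  # orthonormal basis spanning the centered predictors
    for _ in range(k):
        x = centered_noise()
        for b in basis:
            # Remove the part of x that is already explained by earlier predictors
            projection = sum(xi * bi for xi, bi in zip(x, b))
            x = [xi - projection * bi for xi, bi in zip(x, b)]
        norm = sum(xi * xi for xi in x) ** 0.5
        basis.append([xi / norm for xi in x])
    ss_total = sum(yi * yi for yi in y)
    ss_model = sum(sum(yi * bi for yi, bi in zip(y, b)) ** 2 for b in basis)
    return ss_model / ss_total


def mean_null_r_squared(n, k, replications=2000, seed=1):
    """Average R^2 across many simulated samples with no true relationships."""
    rng = random.Random(seed)
    return sum(null_r_squared(n, k, rng) for _ in range(replications)) / replications


n, k = 21, 5
simulated = mean_null_r_squared(n, k)
print(f"Simulated mean R^2: {simulated:.3f}; K / (n - 1) = {k / (n - 1):.3f}")
```

With 5 noise predictors and 21 cases, the formula gives an expected \(R^2\) of 0.25 even though nothing predicts the outcome. Increasing `n` while holding `k` fixed shrinks this chance-level value, consistent with the point that spurious correlations decrease as the sample size increases.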
Another form of pseudo-prediction is the hot hand fallacy. The hot hand fallacy in basketball is believing that making a shot makes it more likely that you will make the next shot. Most research suggests that there is no such thing as a “hot hand” (Avugos et al., 2013; Bar-Eli et al., 2006; Gilovich et al., 1985). Some recent research, however, has suggested that there may be a small hot hand effect in some contexts (Bocskocsky et al., 2014; J. B. Miller & Sanjurjo, 2014).
Capitalizing on regression effects is pseudo-prediction. When underestimating regression effects, a person can capitalize on regression effects by predicting an extremity in one variable (e.g., amount of improvement over time) due to an extremity in another variable (e.g., pain severity), when in fact, regression to the mean is most likely.
11.5 Conclusion
Many of the origins of assessment came from attempts to address wartime needs. Since the 1980s, there has been a general decline in the frequency with which clinical assessments (and psychotherapy) are conducted by clinical psychologists, due to managed care and cost containment. Many errors of pseudo-prediction are relevant to clinical assessment.
11.6 Suggested Readings
Wood et al. (2002)