25 Cultural and Individual Diversity

Like ethics, considerations of cultural and individual diversity are relevant to all of our domains as psychologists, including (but not limited to) research, teaching, assessment, intervention, and supervision. As a result, we have discussed important considerations about diversity throughout the book, including the chapters on validity, evidence-based assessment, and test bias. At the same time, given the importance of cultural and individual diversity, we also provide this section, which is devoted more fully to these issues within the assessment context. Diverse samples provide stronger tests of theories and better generalizability of findings (external validity). On the assessment side, we need measures that provide us with the best information, and that are reliable and valid for all people we use them with. Diversity is a broad concept, and it spans many dimensions of differences among people.

25.1 Terminology

25.1.1 Diversity, Equity, and Inclusion (DEI)

25.1.1.1 Diversity

Diversity refers to all aspects of differences between people. Diversity is a characteristic of a group, not a characteristic of an individual. Thus, a person is not diverse. There are many aspects of human difference, including but not limited to race, ethnicity, creed, color, sex, gender, gender identity, sexual orientation, socioeconomic status, language, culture, national origin, religion/spirituality, age, (dis)ability, military/veteran status, personality, political perspective, and associational preferences.

Having a diverse group (e.g., of participants, clients, patients, etc.) is insufficient. It is also important to make sure that we take steps for all groups to feel included (inclusion) and to thrive (equity).

25.1.1.2 Equity

Equity refers to fair and just practices and policies that ensure everyone can thrive. Equity is different from equality. Equality involves treating everyone equally (i.e., treating everyone the same). However, due to historic and current structural inequities (e.g., structural racism and patriarchy), some people are disadvantaged and marginalized more than others. Structural racism refers to “the normalization and legitimization of an array of dynamics—historical, cultural, institutional, and interpersonal—that routinely advantage non-Latine White persons while producing cumulative and chronic adverse outcomes for Black and other people with minoritized identities in housing, education, employment, health care, criminal justice, and psychology” (Byrd et al., 2021, p. 279). Patriarchy is reflected in society’s long history of wage gaps, gender discrimination, violence against women, etc. Treating people equally (equality) does not allow everyone to thrive given these different experiences and unequal access to opportunities. Thus, equity seeks to help everyone thrive, including those who are marginalized and have less access to opportunities.

25.1.1.3 Inclusion

Inclusion refers to providing a community where all members are and feel respected, have a sense of belonging, and are able to participate and achieve their potential.

25.1.2 Aspects of Difference

There are many aspects of difference. However, identifying labels to capture these differences can be challenging. For example, “Asian” is an overly broad term: there are more than 4 billion people from Asian countries, with considerable heterogeneity among them. “Hispanic” is based on language (i.e., a person from a Spanish-speaking country), whereas “Latino/a/x/e” (i.e., a person from a Latin American country) and “African American” are based on geography. Traditional classifications of race/ethnicity by the National Institutes of Health miss groups that may be important to distinguish, including Arab, Middle Eastern, or North African (AMENA) populations.

Labels by the 2020 U.S. Census attempt to be more precise than traditional classifications of race and ethnicity by the National Institutes of Health. For the classification of “Hispanic, Latino, or Spanish origin” in the 2020 U.S. Census, see Figure 25.1. Notice how the question distinguishes between identities of people from (or whose ancestors were from) various countries, including Mexico, Puerto Rico, Cuba, etc.

Figure 25.1: Question Asking About One’s Hispanic Origin in the 2020 U.S. Census. (Figure reprinted from the American Community Survey (2020): https://censusreporter.org/topics/race-hispanic/ [archived at https://perma.cc/LRW6-4JAJ])

For the classification of “Race” in the 2020 U.S. Census, see Figure 25.2. Notice how the question distinguishes between identities of people from (or whose ancestors were from) various countries, including India, China, Philippines, etc.

Figure 25.2: Question Asking About One’s Race in the 2020 U.S. Census. (Figure reprinted from the American Community Survey (2020): https://censusreporter.org/topics/race-hispanic/ [archived at https://perma.cc/LRW6-4JAJ])

Society treats people differently based on factors that they are born into and are outside of their control, including their race and their family’s socioeconomic status. Socioeconomic status refers to the social standing or class of an individual, and it is often assessed as a combination of education, income, occupation, access to resources, privilege, power, and control (Suzuki et al., 2013). There is also growing attention to neighborhood- and community-level factors associated with deprivation.

There are also many other important aspects of difference to assess when thinking about individual differences in behavior. For instance, one important aspect of difference is family structure. Family structure includes lots of components, such as whether the parent is a single parent or whether there are two parents, whether the parents are working parents or whether the family has a stay-at-home parent, and whether there are other family members (e.g., grandparents) in the household.

Another important aspect of diversity includes one’s sex, gender identity, and gender expression. Sex refers to the sex that the person was assigned at birth, and includes male, female, and intersex. Gender identity is how one self-identifies in terms of their gender. Gender identity is a spectrum in terms of the degree to which a person identifies as a female/woman/girl, a male/man/boy, or other genders, in terms of personality, likes/dislikes, jobs/hobbies, and roles and expectations. The person’s gender identity may be non-binary or genderqueer if they do not identify as a man or woman (e.g., genderfluid). When the person’s sex assigned at birth and their gender identity are aligned, the person is cisgender. When the person’s sex assigned at birth and their gender identity are not aligned, the person is transgender. Gender expression deals with how the person’s gender is expressed to others, including their appearance.

Another dimension of diversity is whether someone is from a rural versus urban area. People from rural areas may have less access to resources. Another aspect of diversity is refugee or immigrant status. Refugee and immigrant groups may have less access to resources and may have language barriers, in addition to difficulty finding work, and fear of deportation.

It is also important to consider diversity in terms of sexual orientation and sexual behaviors. There is a distinction between one’s sexual orientation (e.g., lesbian, gay, bisexual, heterosexual, pansexual, asexual)—that is, who one is sexually or romantically attracted to—versus whom one has sex with (i.e., sexual behavior). There are also differences in terms of the degree of people’s sexual risk taking and sexual aggression.

Ability and disability, including learning disabilities, are other important facets of diversity. Additionally, there are important brain-related differences that lead people to interact with and experience the world in different ways, as reflected in the term neurodiversity. Neurodiversity can include differences related to autism and other conditions such as attention-deficit hyperactivity disorder.

Age is another important dimension of diversity. Consider the issue of heterotypic continuity. For example, externalizing problems look different at different ages and should be assessed in different ways with different measures at different ages (F. R. Chen & Jaffee, 2015; Miller et al., 2009; Moffitt, 1993; Patterson, 1993; Petersen et al., 2015; Petersen & LeBeau, 2022; Wakschlag et al., 2010).

Another important aspect of diversity is culture, which includes belief systems, value orientations, psychological processes, worldview, learned and transmitted beliefs, and practices (Suzuki et al., 2013). A person’s culture has many facets, including but not limited to experiences related to geographic boundaries, language, religious belief, social class, gender, sexual orientation, and ability status (Suzuki et al., 2013). However, culture and one’s related identities are dynamic and changing (Suzuki et al., 2013), which makes them challenging to assess. Culture tends to be ignored in psychological research relative to other facets of diversity, but culture is arguably the most important in terms of influencing behavior. Additional dimensions of difference include religion and spirituality.

Another aspect of diversity is intersectionality, the intersecting of multiple identities. Intersectionality deals with how various identities can combine to influence behavior and bias.

There are many other important aspects of difference, too. This was not an exhaustive list!

25.2 Assessing Cultural and Individual Diversity: Multicultural Assessment Frameworks

A review of frameworks for multicultural assessment is provided by Edwards et al. (2017).

25.2.1 The ADDRESSING Model

The ADDRESSING model (Hays, 2016) provides a framework to consider some of the commonly examined differences between people within a clinical context. ADDRESSING is an acronym that stands for:

Age and generational differences
Developmental or other
Disability
Religion and spiritual orientation
Ethnic and racial identity
Socioeconomic status
Sexual orientation
Indigenous heritage
National origin
Gender

These are important aspects of differences between people that can be helpful to understand when working with a client and how they self-identify. For instance, the therapist could ask the client how they self-identify in each of these domains and how their identity is meaningful to them. In addition to the aspects of diversity identified in ADDRESSING, there are other aspects of diversity and difference that may be worth considering, including but not limited to personality, culture, and political beliefs.

25.2.2 DSM-5 Cultural Formulation Interview

The Diagnostic and Statistical Manual of Mental Disorders (DSM) provides a framework for clinicians to organize cultural information about a client using the Cultural Formulation Interview (Lewis-Fernández et al., 2014). The Cultural Formulation Interview is a semi-structured interview and is available here: https://www.psychiatry.org/File%20Library/Psychiatrists/Practice/DSM/APA_DSM5_Cultural-Formulation-Interview.pdf (archived at https://perma.cc/3EWR-LGEN)

25.2.3 Multicultural Assessment Procedure

The Multicultural Assessment Procedure was developed by Ridley et al. (1998). The framework includes four phases:

1. Gather clinical data, including cultural data, through history taking and multiple methods for the purpose of formulating a case conceptualization
2. Interpret clinical and cultural data to formulate a hypothesis, while keeping in mind to:
   - differentiate the cultural data as being either idiosyncratic (i.e., unique to the individual client and would not necessarily be expected of other members of the client’s culture) or culture-specific
   - consider base rates
   - differentiate dispositional from environmental stressors
   - differentiate clinically significant data from data that are not clinically significant
3. Incorporate cultural data with other clinical information to test the hypotheses by ruling out medical explanations and using appropriate assessments and testing
4. Arrive at a sound assessment decision (case conceptualization)

The procedure also includes debiasing strategies to minimize the likelihood of clinical judgment errors. Ridley et al. (2001) expanded on this framework in consideration of ethical issues in multicultural assessment.

25.2.4 Multicultural Assessment–Intervention Process

The Multicultural Assessment–Intervention Process was developed by Dana (1998). The process provides a flowchart for clinicians to conduct assessment by asking relevant questions about cultural orientation, type of assessment instrument (e.g., etic or emic), cultural formulation for diagnosis, and intervention approach (e.g., universal, culture-general, culture-specific, identity-specific, etc.). Etic assessments are culture-general, whereas emic assessments are culture-specific.

25.3 Assessments with Ethnic, Linguistic, and Culturally Diverse Populations

25.3.1 Methodological and Conceptual Challenges in Research

Methodological and conceptual challenges for conducting research with ethnic minorities are described by Okazaki & Sue (1995). Guidelines for conducting research with diverse groups are described by Burlew et al. (2019).

25.3.1.1 Use of Terms: Race, Ethnicity, Culture

Use of terminology with respect to race, ethnicity, and culture is a key challenge in assessment of diverse samples. Many people refer to race, ethnicity, and culture interchangeably, which reflects conceptual confusion. There is no definition of race, ethnicity, and culture that is universally agreed on (Okazaki & Sue, 1995). As described by Okazaki & Sue (1995), “the use of the term ‘race’ appears to imply biological factors, as races are typically defined by observable physiognomic features such as skin color, hair type and color, eye color, stature, facial features, and so forth” (p. 367). By contrast, ethnicity (ethnic status) has been defined as “an easily identifiable characteristic that implies a common cultural history with others possessing the same characteristic” (Eaton, 1980, p. 160). Ethnic identifiers for evaluating whether people share a “common cultural history” include racial, national, tribal, religious, linguistic, or cultural origin or background. Thus, race and ethnicity, though conceptually distinct, are related as race can be an ethnic identifier.

Researchers have argued that race designations are arbitrary (Helms et al., 2005), that race is a social, cultural, and/or political construct rather than a biological construct (Sternberg et al., 2005; Yudell et al., 2016), that race has no genetic basis (Yudell et al., 2016), that within-race differences are greater than between-race differences (Zuckerman, 1990), and that racial categories lack conceptual meaning (Helms et al., 2005). According to the American Anthropological Association (1998, p. 712), “physical variations in the human species have no meaning except the social ones that humans put on them”. However, even though race does not appear to have a genetic or biological basis and therefore appears to be a social construct rather than a biological construct, racism as a social problem is real and is a major issue (Smedley & Smedley, 2005). People are born into different social reactions based on their race. Thus, race is important to consider from a perspective of racialized experiences and discrimination even if the racial categories do not represent underlying biological differences.

In terms of determining what to assess and analyze, it is important for the researcher to determine whether they are interested in evaluating race as a biological variable, ethnicity as a demographic variable, or an aspect of cultural experience as a psychological variable (Okazaki & Sue, 1995). Researchers often use race and/or ethnicity as proxy variables for psychological variables such as cultural values, self-concept, minority status, etc. However, race and ethnicity are distal to the variables that may be of most psychological interest. Some people identify their primary culture as being different from their self-identified ethnicity. It is thus important to assess the psychological processes of interest, rather than merely relying on racial or ethnic categories as proxies.

Culture refers to a psychological variable: i.e., the social context. Researchers often group people together based on race or ethnicity with the implicit assumption that they share some cultural experience. According to Okazaki & Sue (1995), assumptions underlying the use of race or ethnicity in a study should be made explicit. Research should directly assess the underlying psychological variables associated with culture, such as different spiritual practices, that are hypothesized to produce the racial/ethnic/cultural differences. That is, it is generally best not to rely solely on race or ethnicity, which are imprecise, to classify people unless the goal is to classify people based on racialized reactions and experiences. Thus, the researcher can classify people according to race/ethnicity, but can also examine the more proximal cultural variables that are thought to drive group-related differences.

25.3.1.2 Whether to Examine Individual Differences or Group Characteristics

One question the researcher should consider is whether they are interested in understanding individual differences or group characteristics. Both can be valuable and important questions to examine. However, researchers should not under-estimate within-group heterogeneity (Okazaki & Sue, 1995). The greater the within-group heterogeneity, the less accurate predictions tend to be. Moreover, inferences at the group-level cannot necessarily be applied to the individual level; the confounding of an individual with the individual’s culture leads to stereotyping (Okazaki & Sue, 1995), which should be avoided.
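
To make the idea of within-group heterogeneity concrete, the following is a minimal sketch that computes the share of total score variance that lies between groups (eta-squared). The function name and the toy scores are hypothetical and chosen only for illustration; they are not data from any study cited here.

```python
from statistics import mean


def eta_squared(groups):
    """Share of total variance in scores that lies between groups (0 to 1)."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Total sum of squares: every score's squared distance from the grand mean.
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    # Between-group sum of squares: each group mean's squared distance
    # from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical scores: group means differ by 1 point,
# but scores within each group range over 8 points.
group_1 = [2, 4, 6, 8, 10]
group_2 = [3, 5, 7, 9, 11]

print(round(eta_squared([group_1, group_2]), 3))
```

In this toy example, only about 3 percent of the variance lies between the groups; the rest is within-group variation. Knowing a person's group therefore says little about that person's score, which is why group-level inferences cannot simply be applied to individuals.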

25.3.1.3 Selecting Participants

One question is which groups to include in the research design and in what proportion. For group-related comparisons, the researcher would need a large enough sample size for every group examined to have adequate power to identify group-related differences. If the researcher is interested in studying a particular group, a question is whether to use a comparison group. A comparison group can provide a basis of comparison to better interpret the findings in the group of interest. However, inclusion of a comparison group can add time and cost and, as a result, may lead to a smaller sample in the group of interest. Moreover, the group comparison approach has been criticized for reinforcing racial stereotypes, reinforcing Whites as the standard group and non-White behavior as deviant, and overlooking within-group variation (Okazaki & Sue, 1995). Thus, decisions about which groups to include, in what proportion, and whether to include a comparison group should be guided by the questions and purpose of the study. Practical considerations can come into play, and having a comparison group should not be considered the necessary default for all studies. Moreover, White groups should not be considered the default.

25.3.1.4 Ensuring Fair Group Comparisons

Another issue is how to achieve fair group comparisons. One approach is to match groups by selecting participants a priori to be similar on relevant, secondary characteristics such as demographic characteristics (e.g., age, sex, socioeconomic status) or other abilities (e.g., intelligence). The goal of matching is for participants to be as similar as possible in the relevant characteristics apart from their group classification. However, it is challenging to match groups on all relevant characteristics. Another approach is to control for these secondary characteristics post hoc in the statistical analysis. Which variables to match on or control for is a key question that depends on the goal of the study; there is no agreed-upon list of such variables. Generally, researchers recommend controlling for social and demographic characteristics such as educational attainment, income level, and language fluency, when group-related differences exist on those variables and the researcher believes that such differences may moderate the associations of interest (Okazaki & Sue, 1995). When comparing Black and White participants, it is important to control for group differences in socioeconomic status, especially in the U.S.

In addition to main effects of group status, it may also be important to consider interaction effects of group status, consistent with an intersectionality approach. However, power to detect interactions tends to be weaker than power to detect main effects, because of lower reliability of the product term, smaller effect sizes (archived at https://perma.cc/E3KA-SGB7), and smaller sample sizes of the intersecting groups.
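
The point about the reliability of the product term can be illustrated with a short calculation. The sketch below assumes the classical-test-theory result for the product of two mean-centered predictors, in which the reliability of the product equals (rel_x × rel_z + r²) / (1 + r²), where r is the correlation between the predictors. That formula, the function name, and the example values are assumptions introduced here for illustration; they are not taken from the sources cited in this chapter.

```python
def product_term_reliability(rel_x: float, rel_z: float, r_xz: float = 0.0) -> float:
    """Reliability of the product x*z of two mean-centered predictors.

    Assumed formula: (rel_x * rel_z + r_xz**2) / (1 + r_xz**2),
    where rel_x and rel_z are the predictors' reliabilities and
    r_xz is the correlation between the predictors.
    """
    return (rel_x * rel_z + r_xz ** 2) / (1 + r_xz ** 2)


# Hypothetical case: both predictors have reliability .80.
uncorrelated = product_term_reliability(0.80, 0.80)
correlated = product_term_reliability(0.80, 0.80, r_xz=0.30)

print(round(uncorrelated, 2))
print(round(correlated, 2))
```

Under these assumptions, two uncorrelated predictors that each have a reliability of .80 yield a product term with a reliability of only .64. The interaction term is measured less reliably than either predictor, which reduces power to detect the interaction.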

25.3.1.5 Sampling

Researchers face important questions about how to identify, sample, and recruit ethnic minority samples. Groups, such as ethnic groups, may be classified at a broad level (e.g., Latine or Hispanic) or at a more specific level (e.g., Puerto Rican, Mexican American, etc.); however, these approaches do not need to be mutually exclusive. Researchers must also determine how to classify people of mixed racial or ethnic backgrounds. It is also incorrect to assume that, once people are identified as belonging to a particular ethnic-cultural group, they share a common understanding of their own ethnicity or culture and identify with the ethnic-cultural group (Okazaki & Sue, 1995).

There are challenges in recruiting a sizeable sample from specific ethnic-cultural groups (e.g., Japanese Americans). This is partly a challenge of small overall population size (e.g., American Indians). It is also partly a challenge of reaching populations who are distrustful of science and researchers. Research has exploited under-privileged populations (e.g., the Tuskegee syphilis study in Black men), which has led to changes in the consent process to ensure that participation is voluntary. There may also be selection effects, such that the people who are willing to participate in research may differ in important ways from those less willing to participate in research, which is a challenge to generalizability. To address sample size challenges, researchers often combine data from multiple ethnic-cultural groups with some common origin. For instance, researchers might combine Chinese Americans, Japanese Americans, and Korean Americans into one group. Or, researchers might combine people from multiple tribal groups among American Indians. However, broadening the ethnic grouping increases the heterogeneity of the groups, so the researcher must decide which are the sources of variability that can and cannot be overlooked (Okazaki & Sue, 1995).

Much of the literature in general, and on diversity in particular, is based on research with college students, which has limitations. College students are not representative of the broader population. Research using college students under-estimates the population heterogeneity in terms of demographic and psychosocial diversity.

One goal of research is to obtain findings that apply to the broader population. This calls for a representative sample, in which the mean of the sample is approximately the same as the mean of the population. However, a population-representative sample is less useful for comparing groups, because smaller groups contribute few participants. Comparing groups requires a large sample of each group (e.g., equally weighted samples). In addition, most research is on participants from Western, educated, industrialized, rich, and democratic (WEIRD) societies (Henrich et al., 2010). This further limits the potential generalizability of findings to different cultures.
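
The trade-off between a population-representative sample and equally weighted group samples can be sketched numerically. The example below assumes two independent groups with the same standard deviation and compares the standard error of the difference between group means under two allocations of the same total sample size. The 5 percent population share, the total of 1,000 participants, and the function name are hypothetical.

```python
import math


def se_mean_difference(n1: int, n2: int, sd: float = 1.0) -> float:
    """Standard error of the difference between two independent group means,
    assuming both groups share the same standard deviation `sd`."""
    return sd * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical study with 1,000 participants in total.
# Representative allocation: the smaller group is 5% of the population.
representative = se_mean_difference(50, 950)
# Equal allocation: each group contributes half of the sample.
equal_allocation = se_mean_difference(500, 500)

print(round(representative, 3))
print(round(equal_allocation, 3))
print(round(representative / equal_allocation, 2))
```

Under these assumptions, the representative allocation gives a standard error for the group difference that is more than twice as large as the equal allocation, even though the total sample size is identical. The representative sample is better suited to estimating the population mean, whereas the equally weighted sample is better suited to detecting a group difference.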

Provide a thorough description of the sample and sampling methodology used. For instance, it can be helpful to describe participants on dimensions in addition to race and ethnicity, including generational status, acculturation, self-identification, ethnic and cultural composition of the neighborhoods or communities, etc. (Okazaki & Sue, 1995). Acculturation deals with the change that individuals undergo due to contact with members of different cultures, and it includes strategies such as assimilation, separation, marginalization, and integration (Suzuki et al., 2013).

25.3.1.6 Establishing Equivalence of Measures Across Groups

Five key areas need to be empirically evaluated in using or adapting instruments across cultures or groups to reduce or eliminate cultural bias (Hunsley & Mash, 2007):

Conceptual equivalence
Linguistic (or translation) equivalence
Psychological equivalence of the items
Functional equivalence
Metric and scalar equivalence (i.e., measurement invariance)

Conceptual equivalence refers to establishing that the construct has the same meaning across groups. Linguistic (or translation) equivalence refers to the measure having the same linguistic meaning in each language to which it has been translated. Best practices for establishing translation equivalence include having experts translate and back-translate the measure, comparing both the original and back-translated versions, and revising the translation accordingly (Okazaki & Sue, 1995). Psychological equivalence of the items deals with establishing that each item has the same psychological effect for each group or in the different versions.

Functional equivalence deals with establishing similar predictive and concurrent criterion-related validity to prevent test bias (i.e., similar regression intercepts and slopes between measure and criterion). Functional equivalence can be difficult to establish when the test and the external criteria of interest are both rooted in the same dominant universalist cultural epistemology or worldview; instead, it may be advantageous to validate tests against external criteria that are grounded in the nondominant group’s culture (Suzuki et al., 2013).
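
As a minimal sketch of what comparing regression intercepts and slopes across groups can look like, the example below fits a separate least-squares line predicting the criterion from the test score in each of two groups. The data, group labels, and function name are hypothetical. A real analysis would typically use moderated regression with significance tests and adequate samples; this toy example only compares point estimates.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x, mean_y = mean(x), mean(y)
    sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
    sum_cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sum_cross / sum_sq_x
    return mean_y - slope * mean_x, slope


# Hypothetical test scores and criterion scores for two groups.
test_a = [1, 2, 3, 4, 5]
crit_a = [2.1, 2.9, 4.2, 4.8, 6.0]
test_b = [1, 2, 3, 4, 5]
crit_b = [1.0, 2.1, 2.9, 4.2, 4.8]

intercept_a, slope_a = fit_line(test_a, crit_a)
intercept_b, slope_b = fit_line(test_b, crit_b)

print("slope difference:", round(slope_a - slope_b, 2))
print("intercept difference:", round(intercept_a - intercept_b, 2))
```

In this toy example, the slopes are the same in both groups, but the intercepts differ by about one point. The same test score therefore corresponds to a different expected criterion score depending on group. A single common regression line would systematically under-predict the criterion for one group and over-predict it for the other.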

Metric and scalar equivalence (i.e., measurement invariance) deal with establishing that the construct is measured on the same metric across groups, that is, a particular score reflects the same level on the construct for people in each group. Metric equivalence/invariance deals with establishing that items’ discrimination parameters (or factor loadings) are the same across groups. Scalar equivalence/invariance deals with establishing that both items’ discrimination parameters (or factor loadings) and difficulty parameters (or intercepts) are the same across groups.
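
One common way to write these conditions, assuming a linear common factor model with two groups, is shown below. The notation is introduced here for illustration and is not taken from the sources cited above: $x_{ijg}$ is the score of person $i$ in group $g$ on item $j$, $\tau_{jg}$ is the item intercept, $\lambda_{jg}$ is the factor loading, $\xi_{ig}$ is the person's level on the construct, and $\varepsilon_{ijg}$ is the residual.

```latex
\begin{align*}
\text{Model:} \quad & x_{ijg} = \tau_{jg} + \lambda_{jg}\,\xi_{ig} + \varepsilon_{ijg} \\
\text{Metric invariance:} \quad & \lambda_{j1} = \lambda_{j2} \quad \text{for every item } j \\
\text{Scalar invariance:} \quad & \lambda_{j1} = \lambda_{j2} \ \text{ and } \ \tau_{j1} = \tau_{j2} \quad \text{for every item } j \\
\text{Implication:} \quad & E\left(x_{ijg} \mid \xi_{ig} = \xi\right) = \tau_{j} + \lambda_{j}\,\xi \quad \text{in both groups}
\end{align*}
```

Under scalar invariance, two people with the same level on the construct have the same expected item score regardless of group, which is what allows a particular score to reflect the same level on the construct for people in each group.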

25.3.1.7 Methods of Assessment

It is debatable whether some assessment methods may be more likely to result in cultural or ethnic bias compared to other assessment methods (Okazaki & Sue, 1995). Certain types of assessment techniques might be more likely to bias or obscure differences between cultural groups. This especially depends on who is doing the observing and their own beliefs. There may be selection biases regarding the types of people who are willing to participate in in-depth studies that involve observational or psychophysiological assessment. It may be worth considering supplementing quantitative measurement approaches with qualitative measurement approaches, such as interviews and life histories, which may be better able to capture cultural factors. In general, it is advantageous to assess people using multiple measures and methods to establish convergent validity of cultural constructs (Okazaki & Sue, 1995).

25.3.1.8 Interpretation of Data

Another challenge deals with how to interpret data from group comparison designs. Historically, group differences in scores were most commonly interpreted from a deficit-based narrative rather than a strengths-based narrative (Byrd et al., 2021).

25.3.2 APA Guidelines

There are many disparities in psychological services for under-served populations, including health disparities, diagnostic disparities, and disparities in the use and availability of psychological services (Rivera Mindt et al., 2010). To help address these disparities, the American Psychological Association (APA) Office of Ethnic Minority Affairs published guidelines for providers of psychological services to ethnic, linguistic, and culturally diverse populations (American Psychological Association Office of Ethnic Minority Affairs, 1993). Their guidelines included suggestions such as:

Whenever possible, provide information in writing along with oral explanations.
Whenever possible, provide the information and interact in the language that is understandable to the client. If that is not feasible, make an appropriate referral. If this is not possible, provide a translator with cultural knowledge and an appropriate professional background. When no translator is available, then use a trained paraprofessional from the client’s culture as a translator.
If you do not possess knowledge and training about an ethnic group, seek consultation with, and/or make referrals to, appropriate experts as necessary.
Consider the validity of a given instrument or procedure and interpret the resulting data, while keeping in mind the cultural and linguistic characteristics of the client.
Be aware of the test’s reference population and possible limitations of using the instrument with other populations.
Seek to help clients determine whether a “problem” stems from racism or bias in others so that the client does not inappropriately personalize problems.

The APA has also published guidelines for working with particular groups, including:

Groups based on race and ethnicity (archived at https://perma.cc/4LM9-X6JN)
People with low income (archived at https://perma.cc/PZQ5-8YFG)
People with disabilities (archived at https://perma.cc/5YZ8-FNJT; PDF: https://perma.cc/KYY4-CQFQ)
People who are transgender or gender nonconforming (archived at https://perma.cc/E9VH-QG3W)
Sexual minorities (archived at https://perma.cc/4BJD-QZCF)
Girls and women (archived at https://perma.cc/AYA4-KB5W)
Boys and men (archived at https://perma.cc/6HWK-W9R6)
Older adults (archived at https://perma.cc/EQ95-Y78Y)

25.3.3 Seek Training and Consultation

Research has shown that relatively few practitioners receive extensive training in multicultural assessment and few use frameworks for multicultural assessment (Brickman et al., 2006; Edwards et al., 2017). Researchers have encouraged increasing multicultural education, training, awareness, and knowledge, increasing multicultural psychological research, and increasing the provision of culturally competent psychological services to ethnic minorities (Rivera Mindt et al., 2010). It can also be worthwhile to consult with informal support systems, including churches, ethnic clubs, family associations, and community leaders (Leong & Kalibatseva, 2016).

25.3.4 Develop Awareness of Cultural Differences

One goal should be to develop cultural competence with the groups, communities, and individuals with which one works or interacts. It can be helpful to be aware of cultural differences, including cultural differences in communication style and use of language (Leong & Kalibatseva, 2016).

25.3.5 Show Cultural Humility

Another goal should be to show cultural humility. Tervalon & Murray-Garcia (1998) describe three key aspects of cultural humility: (1) a life-long commitment to self-evaluation and self-critique, (2) fixing power imbalances, and (3) developing mutually beneficial and nonpaternalistic partnerships with communities who advocate for others. For instance, it is important to be aware of one’s biases and stereotypical beliefs. Cultural humility involves treating the client as the expert on their experience. Therefore, cultural humility involves collaboration between the clinician and client, because each brings important knowledge. The client brings their knowledge of their situation, and their preferences. The clinician brings their scientific knowledge about strengths and difficulties, and about behavior change.

25.3.6 Do Not Assume or Stereotype

It is important to consider cultural factors; at the same time, groups are not monolithic and every person is unique. Do not make assumptions about someone based on their demographics; doing so is stereotyping. Just because a person comes from a particular group does not mean the person shares all beliefs, behaviors, etc. with the group (Okazaki & Sue, 1995). Thus, it is also important to be aware of within-group differences (Leong & Kalibatseva, 2016; Okazaki & Sue, 1995), which are often as large as or larger than between-group differences (Zuckerman, 1990). Show genuine interest and curiosity to get to know a person as an individual.

25.3.7 Ensure Adequate Representation of Under-Represented Groups in Research

There are important issues of generalizability of research findings. Participant samples historically have been largely White and middle or upper middle class. This reduces the potential generalizability of findings and limits the effectiveness of treatments for under-represented populations, many of whom have experienced discrimination and social disadvantage. Many groups, including people who are Black or Latine and gender and sexual minorities, have not been adequately represented in research studies. As a result, the knowledge gained from unrepresentative studies may not generalize to under-represented groups. Science plays an important role in advancing knowledge that can lead to the development of new treatments. However, groups who are under-represented in research are less likely to benefit from the new treatments because the knowledge upon which the treatments were developed and tested may not generalize to them. Moreover, the lack of inclusion of under-represented groups in studies leads to inappropriate normative data. Thus, it is crucial to have better representation of historically marginalized and minoritized groups in science, including people who are Black or Latine and gender and sexual minorities.

People from Finland and Sweden are often studied to examine health-related questions. Because these countries have socialized health care systems, the government keeps medical records on all citizens, so researchers can examine millions of participants and can study rare conditions and behaviors with a low base rate, like suicide (Lysell et al., 2018). However, there are questions of generalizability from these Nordic populations and health care systems to those of other countries, including the United States. In principle, one could use propensity score matching to match such a sample to another population. In practice, this often fails: the samples include too few people from various under-represented racial/ethnic groups, which leaves too many empty cells for matching.
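
The empty-cell problem can be checked before any matching is attempted. The following is a minimal sketch, not a full propensity score analysis: it cross-classifies people by categorical covariates and lists the cells that appear in the target population but are missing (or too sparse) in the source sample. The function name `sparse_cells`, the `min_n` threshold, and all records are hypothetical and made up for illustration.

```python
from collections import Counter

def sparse_cells(source, target, min_n=1):
    """Return covariate cells present in `target` with fewer than `min_n` people in `source`.

    Each record is a tuple of categorical covariates, e.g. (group, age_band).
    """
    source_counts = Counter(source)
    target_cells = set(target)
    return sorted(cell for cell in target_cells if source_counts[cell] < min_n)

# Hypothetical, made-up records (not real registry data)
registry_sample = (
    [("Group A", "18-39")] * 950
    + [("Group A", "40+")] * 900
    + [("Group B", "18-39")] * 3
)
target_population = [
    ("Group A", "18-39"),
    ("Group A", "40+"),
    ("Group B", "18-39"),
    ("Group B", "40+"),
    ("Group C", "18-39"),
]

empty = sparse_cells(registry_sample, target_population)          # cells with no one to match
sparse = sparse_cells(registry_sample, target_population, min_n=5)  # cells with fewer than 5
print(empty)
print(sparse)
```

If the list of empty cells is not empty, people in those cells of the target population have no counterparts in the source sample, so matching cannot make the two samples comparable for them.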

25.3.8 Use Appropriate Measurement

25.3.8.1 Do Not Use Offensive Content or Procedures

It is important to use stimuli that are valid for the populations being tested. It is important to avoid using content or procedures that are culturally or racially insensitive, or that are racist, sexist, heteronormative, and/or ableist (Byrd et al., 2021). When developing assessments, it may be important to evaluate stimuli in focus groups to ensure they are not offensive.

One of the most widely used standardized measures in psychological assessment is the Boston Naming Test, a measure of visual confrontation naming that is used to assess cognitive impairment, aphasia, and dementia. The Boston Naming Test includes a noose as one of the items. Historically, lynchings with a noose were used in the United States as a form of racial terror against Black people. Thus, the use of a noose as an item in the Boston Naming Test is racially insensitive, deeply offensive, and harmful, and it is inappropriate for use. It can have a strong negative impact on Black respondents in terms of distress, performance disparities, and health disparities (Byrd et al., 2021).

Culturally offensive stimuli can lead to stereotype threat and compromise the validity of the assessment. Stereotype threat occurs when people are, or feel, at risk of conforming to stereotypes about their social group, which leads them to perform more poorly in ways that are consistent with the stereotype. Moreover, the use of offensive procedures likely contributes to greater distrust of the scientific and medical fields among under-served communities, which acts as a barrier to health care and contributes to health disparities. Thus, removing and replacing offensive stimuli may be necessary from a moral and ethical perspective to do no harm and to ensure valid assessment, even if it risks modifying the standardized procedure from which the norms were developed (Byrd et al., 2021). When deviating from standardized procedures, it is important to describe such deviations in resulting papers or reports.

25.3.8.2 Use Culturally Valid Measures

Most measures originated from Western societies (Suzuki et al., 2013) and have been validated only among non-Latine, English-speaking Whites (Manly, 2005). It is important to use measures that have been validated for the populations with whom they will be used. Many measures have been used with populations that differ from those with which the instruments were developed and validated, such as with immigrants or refugees, or when exporting measures to different countries or cultural contexts. Exporting measures for use with people from different cultures assumes universality (i.e., cross-cultural validity). More work should examine adaptation and validation of measures for other cultures, countries, languages, etc. (Leong & Kalibatseva, 2016). However, some measures may not be able to be applied validly across cultures. Using valid measures for the target population may in some cases mean using culturally specific (emic) instruments rather than universal (etic) instruments. If using measures whose validity and measurement equivalence have not been established, it is important to interpret results with caution and to provide disclaimers in any resulting papers or reports (Leong & Kalibatseva, 2016).

As described in Section 5.3.1.16, it is also important for measures to have cultural validity, which refers to “the effectiveness of a measure or the accuracy of a clinical diagnosis to address the existence and importance of essential cultural factors.” (Leong & Kalibatseva, 2016, p. 58). Essential cultural factors may include values, beliefs, experiences, communication patterns, and approaches to knowledge (epistemologies).

Threats to the cultural validity of assessments include pathoplasticity of psychological disorders, bias in clinical judgment, language capability of the client, biased measures, and inappropriate use or interpretation of measures (Leong & Kalibatseva, 2016).

25.3.8.3 Use Linguistically Appropriate Assessment

It is important to use assessment approaches that are linguistically appropriate for the examinee (Leong & Kalibatseva, 2016). First, the measure should be in a language that is understood by the examinee. Second, the measure should be consistent with the examinee’s language ability. Sometimes interpreters or translators may be necessary. However, use of interpreters has been shown to lead to under-estimations of patients’ emotional suffering and despair (Leong & Kalibatseva, 2016). In addition, clients and therapists can attach different meanings to the same words. It is thus important not to assume that you know what the client means by various terms and not to assume that the client knows what you mean by various terms.

25.3.8.4 Use Follow-Up As Necessary

As described by Suzuki et al. (2013), it may be necessary to use follow-up procedures to clarify people’s responses to items or to test the limits by readministering items with additional supports (e.g., modifying instructions to involve more or fewer cues, adjusting the pace of information presented, adjusting memory demands, changing the response format) (Sattler & Hoge, 2006). When deviating from standardized procedures, it is important to describe such deviations in resulting papers or reports.

25.3.8.5 Examine and Correct for Test Bias

As described in Chapter 16, there are two types of test bias: predictive test bias and test structure bias. It is important to evaluate the possibility of both when using assessments across groups.

It is important to establish measurement invariance across groups to ensure that scores have the same meaning for each group (Burlew et al., 2019). If measures are non-invariant, the means (i.e., level) and/or associations between constructs could appear to differ across groups (artifactually) when in reality they do not (F. F. Chen, 2008). Measurement invariance can be examined in terms of item intercepts, item factor loadings, and item residuals. Establishing invariance of only the intercepts, or only the factor loadings, is not sufficient to establish measurement invariance, because tests can show strong bias in one dimension (e.g., intercepts) even if the test does not show bias in other dimensions (e.g., factor loadings) (Wicherts & Dolan, 2010). Approaches for evaluating measurement invariance across groups are described by Han et al. (2019). To help minimize the potential for test bias, researchers are encouraged to develop the scales simultaneously across the different groups or cultures (Fernández & Abe, 2018).
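
To make the idea of an artifactual mean difference concrete, here is a minimal sketch. It assumes a simple linear factor model in which the expected item score equals the item intercept plus the factor loading times the latent mean. All parameter values and names are hypothetical and were chosen only for illustration.

```python
def expected_item_mean(intercept, loading, latent_mean):
    """Expected observed item score: intercept + loading * latent mean."""
    return intercept + loading * latent_mean

# Hypothetical item parameters; only the intercept differs between groups.
item_params = {
    "group_1": {"intercept": 2.0, "loading": 0.8},
    "group_2": {"intercept": 2.5, "loading": 0.8},
}
latent_mean = 0.5  # identical true level of the construct in both groups

observed = {
    group: expected_item_mean(p["intercept"], p["loading"], latent_mean)
    for group, p in item_params.items()
}

# The groups do not differ on the construct, yet the observed means do.
artifactual_gap = observed["group_2"] - observed["group_1"]
print(observed, round(artifactual_gap, 3))
```

In this sketch the loadings are invariant, yet the observed item means still differ because the intercepts are not. This is why invariance of the loadings alone does not justify comparing group means.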

It is also important to determine whether the measure’s scores show similar relations to meaningful external criteria (Burlew et al., 2019).
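
One common way to frame this check (an assumption here, not a procedure spelled out in this section) is to regress the external criterion on the test score separately in each group and compare the intercepts and slopes. The sketch below uses made-up data and a hand-written least-squares helper, `fit_line`. In practice, one would test the group difference formally (e.g., with a moderated regression) rather than compare coefficients by eye.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: same test scores, different criterion levels by group.
test_scores = [1, 2, 3, 4, 5]
criterion = {
    "group_1": [3, 4, 5, 6, 7],  # criterion = 2 + 1 * score
    "group_2": [2, 3, 4, 5, 6],  # criterion = 1 + 1 * score
}

fits = {group: fit_line(test_scores, y) for group, y in criterion.items()}

# A single pooled line ignores group membership.
pooled_intercept, pooled_slope = fit_line(
    test_scores * 2, criterion["group_1"] + criterion["group_2"]
)
print(fits, (pooled_intercept, pooled_slope))
```

Here the slopes match but the intercepts differ, so the pooled line systematically under-predicts the criterion for one group and over-predicts it for the other at every score.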

Fernández & Abe (2018) provide strategies to address cross-cultural bias, such as excluding words or concepts that are specific to one language or culture. Use of North American neuropsychological tests in other countries has been shown to lead to considerable false positives, and the extent of misdiagnosis differs by country (Daugherty et al., 2017).

It is important to examine factors that moderate the validity of assessment, e.g., groups for whom the assessment is biased. Ways to investigate and detect bias are described in Section 16.2. The number of potential moderating variables is so large that it is not realistic to develop an assessment base that encompasses all of these factors and their interactions. Evidence should be used regarding which factors seem to matter for a given assessment purpose, that is, the factors that have a medium-to-large effect size. It is valuable to examine the effect size of moderators; if the effects are small, that may not be sufficient reason to abandon the instrument for the population. Gender and cultural differences have shown statistically significant effects for a number of different assessment purposes, but many of the observed effects are quite small and likely trivial, and they do not present compelling reasons to change the assessment (Youngstrom & Van Meter, 2016). Nevertheless, if you do find evidence of bias, it is important to correct for it. Ways to correct for bias using score adjustment and other approaches are discussed in Section 16.5.
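
One way to put a number on a moderator effect is to compare a measure's validity correlation in two groups using Cohen's q, the absolute difference between the Fisher z-transformed correlations. This is a minimal sketch and only one of several possible effect sizes. The correlations are made up, and the cutoffs (.10, .30, .50) are Cohen's conventional rules of thumb, not thresholds taken from this chapter.

```python
import math

def cohens_q(r1, r2):
    """Absolute difference between two Fisher z-transformed correlations."""
    return abs(math.atanh(r1) - math.atanh(r2))

def label_q(q):
    """Conventional (rule-of-thumb) labels for Cohen's q."""
    if q < 0.10:
        return "negligible"
    if q < 0.30:
        return "small"
    if q < 0.50:
        return "medium"
    return "large"

# Hypothetical validity correlations (test score with criterion) by group
q_trivial = cohens_q(0.50, 0.45)
q_notable = cohens_q(0.60, 0.30)
print(round(q_trivial, 3), label_q(q_trivial))
print(round(q_notable, 3), label_q(q_notable))
```

A difference like the first one may reach statistical significance in a large sample and still be too small to justify changing the assessment. The second would merit closer scrutiny.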

25.3.9 Consider Pathoplasticity of Constructs

Constructs can manifest differently across cultures. Pathoplasticity of mental disorders refers to the variability in symptoms, course, outcome, and distribution of mental disorders among various cultural groups (Leong & Kalibatseva, 2016). Some syndromes are considered culture-bound—that is, they exist in some cultures but not in other cultures. Hwabyung is a Korean culture-bound syndrome in which suppressed negative emotions manifest themselves in bodily ways such as chest pains and difficulty breathing. In addition, features associated with a disorder, including the content, severity, or frequency of symptoms, may vary across cultures.

For instance, Asian Americans tend to report a greater degree of psychological disturbance than Whites, which could be due to several reasons (Leong & Kalibatseva, 2016). It is possible that Asian Americans (a) show higher rates of mental health problems than other groups, (b) under-utilize mental health services, and/or (c) are more likely than other groups to be misdiagnosed due to miscommunication, use of invalid instruments for Asian Americans, and/or lack of cultural knowledge on the part of therapists. In addition, how clients act in a diagnostic interview may depend on cultural factors, and it thus may be important to sample the clients more broadly than a diagnostic interview (Leong & Kalibatseva, 2016).

For a case example of pathoplasticity, there is a disorder among the traditional Navajo called “Moth Madness” (American Psychological Association Office of Ethnic Minority Affairs, 1993). Symptoms of Moth Madness include seizure-like behaviors. This disorder is believed by the Navajo to be the supernatural result of incestuous thoughts or behaviors. Both differential diagnosis and intervention should take into consideration the traditional values of Moth Madness.

25.3.10 Formulate a Culturally Informed Case Conceptualization

It can be helpful to discuss your diagnosis and case conceptualization with the client. For instance, you may learn that some behaviors are normative in their culture and may not be distressing or concerning to them. It can thus be helpful to consider client preferences in terms of what to address and how to address it, while adhering to evidence-based approaches.

25.3.11 Bias in Clinical Judgment

As reviewed by Garb (1997), there is evidence of race bias, social class bias, and gender bias in clinical judgment. Clinical judgment bias occurs when the accuracy of judgments varies as a function of client characteristics such as race, social class, or gender. Bias is not when the judgments themselves vary as a function of these factors, but when the accuracy of those judgments varies. Simply examining how judgments differ across groups does not control for differences in clinical severity between the groups. If a clinician rates more people in one group than in another as having a disorder, this is not necessarily evidence of bias, because the true prevalence of the condition could differ across groups. Thus, it is important to compare groups with similar levels of symptomatology or to control for level of symptomatology. For example, if clinical judgments are more accurate for White clients than for Black clients, we would say that race bias exists.
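
The distinction between differing judgments and differing accuracy can be illustrated with a small sketch. It assumes a criterion ("true") status is available for each client, which in practice is itself an approximation. The function name `summarize_by_group` and all records are hypothetical and made up for illustration.

```python
from collections import defaultdict

def summarize_by_group(records):
    """Per-group diagnosis rate and accuracy.

    Each record is (group, judged_positive, truly_positive).
    """
    totals = defaultdict(lambda: {"n": 0, "judged_positive": 0, "correct": 0})
    for group, judged, truth in records:
        t = totals[group]
        t["n"] += 1
        t["judged_positive"] += int(judged)
        t["correct"] += int(judged == truth)
    return {
        group: {
            "diagnosis_rate": t["judged_positive"] / t["n"],
            "accuracy": t["correct"] / t["n"],
        }
        for group, t in totals.items()
    }

# Hypothetical data: the condition is truly more common in group X than in group Y.
records = (
    [("X", True, True)] * 5 + [("X", False, True)] * 1
    + [("X", True, False)] * 1 + [("X", False, False)] * 3
    + [("Y", True, True)] * 1 + [("Y", False, True)] * 1
    + [("Y", True, False)] * 1 + [("Y", False, False)] * 7
)

summary = summarize_by_group(records)
print(summary)
```

In these made-up data, clinicians diagnose group X three times as often as group Y, yet the judgments are equally accurate in both groups. By the definition above, that is not evidence of bias.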

Accuracy of judgments could vary by the client’s race, social class, or gender for several reasons (Garb, 1997). One reason is that the diagnostic criteria themselves may be biased; for instance, the diagnoses may be more valid for one group (e.g., White clients) than for another group (e.g., Black clients). A second reason is that the judgments may be based on a biased instrument. A third reason relates to confirmation bias or to biases introduced during the usage of a measure. This could occur, for instance, if clinicians do not consider alternative hypotheses and choose to collect data that support their hypotheses rather than data that could refute them. Bias could also occur in the way that data are integrated from multiple measures or methods. Another form of bias could arise from how test results are used, in terms of inequitable social consequences (Suzuki et al., 2013).

Many studies do not find bias in clinical judgments as a function of race, social class, or gender, but some studies do find evidence of bias (Garb, 1997).

25.3.11.1 Race Bias

One pattern of race bias is that Black and Latine patients with a psychotic affective disorder are more likely than White patients to be misdiagnosed as having schizophrenia (Garb, 1997). That is, Black and Latine patients are less likely than White patients to be diagnosed with psychotic affective disorder (with the same symptoms) and are more likely to be diagnosed as having schizophrenia, even when the measures do not indicate that a diagnosis of schizophrenia is justified. Another pattern of race bias is that the risk of violence is over-estimated for Black patients but not for White patients (Garb, 1997). In addition, cases of child abuse are more likely to be reported when children are White than when they are Black (even when comparing similar behavior) (Garb, 1997). Race bias has also been observed with respect to antipsychotic medication. Antipsychotic medications are more often prescribed for Black patients than for other patients even for similar levels of psychosis (Garb, 1997). Moreover, affective symptoms in patients who are severely ill are more often under-treated for Black or Latine patients than for White patients (Garb, 1997). Another potential pattern of race bias is that, compared to non-Hispanic White children, racial/ethnic minorities appear to be more likely to be diagnosed with a disruptive behavior disorder rather than attention-deficit/hyperactivity disorder (ADHD) (Fadus et al., 2020); however, it is not yet known whether this holds when comparing individuals with the same symptom presentation.

25.3.11.2 Social Class Bias

One pattern of social class bias is that child abuse is more likely to be reported if a child is from a lower-class background rather than a middle- or upper-class background (even when comparing similar behavior) (Garb, 1997). In addition, middle-class children are more likely to be referred to special education programs compared to lower-class children (Garb, 1997). There is also social class bias in clinicians’ decisions about psychotherapy. Referrals for psychotherapy are more often made for middle-class clients than for lower-class clients (Garb, 1997). Clinicians are more likely to recommend psychotherapy and expect a client to do well in psychotherapy when the client is from a middle- or upper-class background compared to when the client is from a lower-class background, leading to a tendency for clinicians to refer clients to different types of treatments based on the client’s social class (Garb, 1997). Specifically, lower-class clients are more likely to be referred for supportive psychotherapy, whereas middle-class clients are more likely to be referred for insight-oriented psychotherapy.

25.3.11.3 Gender Bias

One pattern of gender bias is that women are more likely than men to be diagnosed with a histrionic personality disorder, whereas men are more likely than women to be diagnosed with antisocial personality disorder, even when female and male clients do not differ in symptomatology (Garb, 1997). Another pattern is that clinicians tend to over-estimate likelihood of violent behavior for male clients and under-estimate likelihood of violent behavior for female clients (Garb, 1997; McNiel & Binder, 1995). Although males tend to show more violence than females, studies have shown that false positives tend to be more common for male patients than female patients, whereas false negatives tend to be more common for female patients than male patients (McNiel & Binder, 1995).

In terms of autism, some conceptualizations of autism consider it to be a manifestation of “extreme male brain” (Baron-Cohen, 2002, 2010; Greenberg et al., 2018). Girls diagnosed with autism tend to be more severely affected than boys diagnosed with autism. People are less likely to notice autism in girls. This could possibly be due to girls with autism having fewer restricted interests and lower levels of repetitive behavior compared to boys. Research has tried to develop assessments to equate boys and girls, but girls and boys appear to be unequal in their underlying autism process—that is, boys and girls appear to show sex-specific autism phenotypes (Frazier et al., 2014).

25.3.11.4 Recommendations for Reducing Bias in Clinical Judgment

Recommendations for reducing bias in clinical judgment include the following (Garb, 1997):

Be aware of biases that have been reported in the clinical judgment literature.
Give the clinician timely and specific feedback on their judgments.
Use standardized and structured assessment approaches and attend strictly to diagnostic criteria when making diagnoses.
Use actuarial approaches and statistical prediction rules.

25.3.12 Use Appropriate Normative Data

When group-referenced judgments will be used, it is important to consider what the appropriate norm is. Whether separate norms should be used for various racial/ethnic minority groups is a matter of considerable controversy (Burlew et al., 2019). Pros and cons of using group-specific norms are discussed in Section 16.5.2.2.2 and reviewed by Manly (2005). The question about which norms to use is complex, and psychologists should evaluate the costs and benefits of each norm and use the norm with the greatest benefit and the least cost for the client (Manly & Echemendia, 2007). Burlew et al. (2019) recommend limiting the use of measures that require norms for interpretation, and noting in the limitations of a paper or report if a norm based on one group is used to evaluate a person from a different group.
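
The choice of norm matters because the same raw score can yield a different standing under different norms. The sketch below assumes normally distributed scores within each normative sample. The function name `standing` and the norm means and standard deviations are hypothetical, not published norms.

```python
from statistics import NormalDist

def standing(raw_score, norm_mean, norm_sd):
    """Return (z score, percentile) for a raw score relative to a normative sample.

    Assumes scores are normally distributed in the normative sample.
    """
    z = (raw_score - norm_mean) / norm_sd
    percentile = NormalDist().cdf(z) * 100
    return z, percentile

raw = 40  # hypothetical raw score for one client

# Hypothetical norms: (mean, SD)
z_general, pct_general = standing(raw, norm_mean=50, norm_sd=10)
z_specific, pct_specific = standing(raw, norm_mean=45, norm_sd=10)

print(round(z_general, 2), round(pct_general, 1))
print(round(z_specific, 2), round(pct_specific, 1))
```

Under the first hypothetical norm the client scores one standard deviation below the mean. Under the second, the same raw score is half a standard deviation below the mean. If a clinical cutoff falls between the two, the choice of norm changes the decision, which is one reason to report which norm was used.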

25.3.13 Contextualize Group-Related Differences

When conducting research and identifying different scores between groups, it is important to be cautious not to assume that differences in the scores between groups are caused by the group membership. That is, correlation does not necessarily mean causation! For instance, if the researcher identifies higher rates of suicide in transgender than cisgender individuals (as has been observed, Toomey et al., 2018), it is important not to automatically assume the higher suicide rate is because of the person’s transgender status, per se. Rather, there are likely many aspects of discrimination and victimization that transgender individuals experience that, unfortunately, may increase their suicide rate relative to cisgender individuals.

It is important to contextualize test score differences between groups within sociocultural and historical inequities rather than interpreting group-related differences in test scores from a deficit-based narrative (Byrd et al., 2021). It is important to avoid interpretive bias and to take care not to over- or under-pathologize (Okazaki & Sue, 1995). Over-pathologizing can lead to stigmatization and institutionalization. Under-estimating psychopathology can deprive patients of treatments or services that they would benefit from. For instance, one should be aware of the danger of under-estimating psychopathology for clients from a culture that differs from the therapist’s by over-attributing bizarre behavior or thought patterns to the client’s culture (Okazaki & Sue, 1995).

25.3.14 Dissemination and Implementation

Dissemination and implementation deals not just with finding differences between groups, but also with what to do about these differences. Aspirationally, we would be able to tailor treatments to individuals based on their individual characteristics. The goal is to identify the best treatment for each group or, ideally, for each individual.

25.3.15 Use Inclusive and Bias-Free Language

It is important to use bias-free language, as discussed in the Publication Manual of the American Psychological Association (APA; American Psychological Association, 2020). Guidelines for bias-free language include the following:

Describe at the appropriate level of specificity
1. Focus on relevant characteristics
2. Acknowledge relevant differences that exist
3. Be appropriately specific
Be sensitive to labels
1. Acknowledge people’s humanity
2. Provide operational definitions and labels
3. Avoid false hierarchies

In reports, manuscripts, and conversation, it can be helpful to use inclusive and person-first language. The aim of person-first language is to reduce stigma because it treats everyone as a human first. Person-first language puts the person before the disability and describes the difficulties a person has rather than who a person is. For instance, say “a child with intellectual disability” instead of “an intellectually disabled child”. That said, some individuals may prefer identity-first labels (e.g., autistic person) rather than person-first labels for their own identities. Thus, it is important to use the labels preferred by the particular client. The APA guidelines for inclusive language are located here: https://www.apa.org/about/apa/equity-diversity-inclusion/language-guide.pdf (archived at https://perma.cc/82QH-ED96).

25.4 Conclusion

Diverse samples provide stronger tests of theories and better generalizability of findings (external validity). We need measures that are reliable and valid for all people we use them with. There are various multicultural frameworks for assessment, and this chapter discussed guidelines for assessment of diverse populations. Recommendations include developing and using appropriate and valid measures for the target population(s), ensuring fair group comparisons, seeking training and consultation, developing awareness of cultural differences, showing cultural humility, avoiding assuming or stereotyping, ensuring adequate representation of under-represented groups, considering pathoplasticity of constructs, formulating a culturally informed case conceptualization, using appropriate normative data, contextualizing group-related differences, and using inclusive language.

25.5 Suggested Readings

Burlew et al. (2019); Byrd et al. (2021); Okazaki & Sue (1995)

American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.).

American Psychological Association Office of Ethnic Minority Affairs. (1993). Guidelines for providers of psychological services to ethnic, linguistic, and culturally diverse populations. American Psychologist, 48(1), 45–48. https://doi.org/10.1037/0003-066X.48.1.45

Baron-Cohen, S. (2002). The extreme male brain theory of autism. Trends in Cognitive Sciences, 6(6), 248–254. https://doi.org/10.1016/S1364-6613(02)01904-6

Baron-Cohen, S. (2010). Empathizing, systemizing, and the extreme male brain theory of autism. In I. Savic (Ed.), Progress in brain research (Vol. 186, pp. 167–175). Elsevier.

Brickman, A. M., Cabo, R., & Manly, J. J. (2006). Ethical issues in cross-cultural neuropsychology. Applied Neuropsychology, 13(2), 91–100. https://doi.org/10.1207/s15324826an1302_4

Burlew, A. K., Peteet, B. J., McCuistian, C., & Miller-Roenigk, B. D. (2019). Best practices for researching diverse groups. American Journal of Orthopsychiatry, 89(3), 354–368. https://doi.org/10.1037/ort0000350

Byrd, D. A., Rivera Mindt, M. M., Clark, U. S., Clarke, Y., Thames, A. D., Gammada, E. Z., & Manly, J. J. (2021). Creating an antiracist psychology by addressing professional complicity in psychological assessment. Psychological Assessment, 33(3), 279–285. https://doi.org/10.1037/pas0000993

Chen, F. F. (2008). What happens if we compare chopsticks with forks? The impact of making inappropriate comparisons in cross-cultural research. Journal of Personality and Social Psychology, 95(5), 1005–1018. https://doi.org/10.1037/a0013193

Chen, F. R., & Jaffee, S. R. (2015). The heterogeneity in the development of homotypic and heterotypic antisocial behavior. Journal of Developmental and Life-Course Criminology, 1(3), 269–288. https://doi.org/10.1007/s40865-015-0012-3

Dana, R. H. (1998). Multicultural assessment of personality and psychopathology in the United States: Still art, not yet science, and controversial. European Journal of Psychological Assessment, 14(1), 62–70. https://doi.org/10.1027/1015-5759.14.1.62

Daugherty, J. C., Puente, A. E., Fasfous, A. F., Hidalgo-Ruzzante, N., & Pérez-Garcia, M. (2017). Diagnostic mistakes of culturally diverse individuals when using North American neuropsychological tests. Applied Neuropsychology: Adult, 24(1), 16–22. https://doi.org/10.1080/23279095.2015.1036992

Eaton, W. W. (1980). The sociology of mental disorders. Praeger.

Edwards, L. M., Burkard, A. W., Adams, H. A., & Newcomb, S. A. (2017). A mixed-method study of psychologists’ use of multicultural assessment. Professional Psychology: Research and Practice, 48(2), 131–138. https://doi.org/10.1037/pro0000095

Executive Board of the American Anthropological Association. (1998). AAA statement on race. American Anthropologist, 100(3), 712–713. https://doi.org/10.1525/aa.1998.100.3.712

Fadus, M. C., Ginsburg, K. R., Sobowale, K., Halliday-Boykins, C. A., Bryant, B. E., Gray, K. M., & Squeglia, L. M. (2020). Unconscious bias and the diagnosis of disruptive behavior disorders and ADHD in African American and Hispanic youth. Academic Psychiatry, 44(1), 95–102. https://doi.org/10.1007/s40596-019-01127-6

Fernández, A. L., & Abe, J. (2018). Bias in cross-cultural neuropsychological testing: Problems and possible solutions. Culture and Brain, 6(1), 1–35. https://doi.org/10.1007/s40167-017-0050-2

Frazier, T. W., Georgiades, S., Bishop, S. L., & Hardan, A. Y. (2014). Behavioral and cognitive characteristics of females and males with autism in the Simons Simplex Collection. Journal of the American Academy of Child & Adolescent Psychiatry, 53(3), 329–340.e3. https://doi.org/10.1016/j.jaac.2013.12.004

Garb, H. N. (1997). Race bias, social class bias, and gender bias in clinical judgment. Clinical Psychology: Science and Practice, 4(2), 99–120. https://doi.org/10.1111/j.1468-2850.1997.tb00104.x

Greenberg, D. M., Warrier, V., Allison, C., & Baron-Cohen, S. (2018). Testing the empathizing–systemizing theory of sex differences and the extreme male brain theory of autism in half a million people. Proceedings of the National Academy of Sciences, 115(48), 12152–12157. https://doi.org/10.1073/pnas.1811032115

Han, K., Colarelli, S. M., & Weed, N. C. (2019). Methodological and statistical advances in the consideration of cultural diversity in assessment: A critical review of group classification and measurement invariance testing. Psychological Assessment, 31(12), 1481–1496. https://doi.org/10.1037/pas0000731

Hays, P. A. (2016). Addressing cultural complexities in practice: Assessment, diagnosis, and therapy. American Psychological Association.

Helms, J. E., Jernigan, M., & Mascher, J. (2005). The meaning of race in psychology and how to change it: A methodological perspective. American Psychologist, 60(1), 27–36. https://doi.org/10.1037/0003-066X.60.1.27

Henrich, J., Heine, S. J., & Norenzayan, A. (2010). Most people are not WEIRD. Nature, 466(7302), 29–29. https://doi.org/10.1038/466029a

Hunsley, J., & Mash, E. J. (2007). Evidence-based assessment. Annual Review of Clinical Psychology, 3, 29–51. https://doi.org/10.1146/annurev.clinpsy.3.022806.091419

Leong, F. T. L., & Kalibatseva, Z. (2016). Threats to cultural validity in clinical diagnosis and assessment: Illustrated with the case of Asian Americans. In N. Zane, G. Bernal, & F. T. L. Leong (Eds.), Evidence-based psychological practice with ethnic minorities: Culturally informed research and clinical strategies (pp. 57–74). American Psychological Association.

Lewis-Fernández, R., Aggarwal, N. K., Bäärnhielm, S., Rohlof, H., Kirmayer, L. J., Weiss, M. G., Jadhav, S., Hinton, L., Alarcón, R. D., Bhugra, D., Groen, S., van Dijk, R., Qureshi, A., Collazos, F., Rousseau, C., Caballero, L., Ramos, M., & Lu, F. (2014). Culture and psychiatric evaluation: Operationalizing cultural formulation for DSM-5. Psychiatry: Interpersonal and Biological Processes, 77(2), 130–154. https://doi.org/10.1521/psyc.2014.77.2.130

Lysell, H., Dahlin, M., Viktorin, A., Ljungberg, E., D’Onofrio, B. M., Dickman, P., & Runeson, B. (2018). Maternal suicide – Register based study of all suicides occurring after delivery in Sweden 1974–2009. PLOS ONE, 13(1), e0190133. https://doi.org/10.1371/journal.pone.0190133

Manly, J. J. (2005). Advantages and disadvantages of separate norms for African Americans. The Clinical Neuropsychologist, 19(2), 270–275. https://doi.org/10.1080/13854040590945346

Manly, J. J., & Echemendia, R. J. (2007). Race-specific norms: Using the model of hypertension to understand issues of race, culture, and education in neuropsychology. Archives of Clinical Neuropsychology, 22(3), 319–325. https://doi.org/10.1016/j.acn.2007.01.006

McNiel, D. E., & Binder, R. L. (1995). Correlates of accuracy in the assessment of psychiatric inpatients’ risk of violence. American Journal of Psychiatry, 152(6), 901–906. https://doi.org/10.1176/ajp.152.6.901

Miller, J. L., Vaillancourt, T., & Boyle, M. H. (2009). Examining the heterotypic continuity of aggression using teacher reports: Results from a national Canadian study. Social Development, 18(1), 164–180. https://doi.org/10.1111/j.1467-9507.2008.00480.x

Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100(4), 674–701. https://doi.org/10.1037/0033-295X.100.4.674

Okazaki, S., & Sue, S. (1995). Methodological issues in assessment research with ethnic minorities. Psychological Assessment, 7(3), 367–375. https://doi.org/10.1037/1040-3590.7.3.367

Patterson, G. R. (1993). Orderly change in a stable world: The antisocial trait as a chimera. Journal of Consulting and Clinical Psychology, 61(6), 911–919. https://doi.org/10.1037/0022-006X.61.6.911

Petersen, I. T., Bates, J. E., Dodge, K. A., Lansford, J. E., & Pettit, G. S. (2015). Describing and predicting developmental profiles of externalizing problems from childhood to adulthood. Development and Psychopathology, 27(3), 791–818. https://doi.org/10.1017/S0954579414000789

Petersen, I. T., & LeBeau, B. (2022). Creating a developmental scale to chart the development of psychopathology with different informants and measures across time. Journal of Psychopathology and Clinical Science, 131(6), 611–625. https://doi.org/10.1037/abn0000649

Ridley, C. R., Hill, C. L., & Wiese, D. L. (2001). Ethics in multicultural assessment: A model of reasoned application. In D. L. Wiese (Ed.), Handbook of multicultural assessment: Clinical, psychological, and educational applications (p. 29).

Ridley, C. R., Li, L. C., & Hill, C. L. (1998). Multicultural assessment: Reexamination, reconceptualization, and practical application. The Counseling Psychologist, 26(6), 827–910. https://doi.org/10.1177/0011000098266001

Rivera Mindt, M., Byrd, D., Saez, P., & Manly, J. (2010). Increasing culturally competent neuropsychological services for ethnic minority populations: A call to action. The Clinical Neuropsychologist, 24(3), 429–453. https://doi.org/10.1080/13854040903058960

Sattler, J. M., & Hoge, R. D. (2006). Assessment of children: Behavioral, social, and clinical foundations (5th ed.). Jerome M. Sattler, Publisher, Inc.

Smedley, A., & Smedley, B. D. (2005). Race as biology is fiction, racism as a social problem is real: Anthropological and historical perspectives on the social construction of race. American Psychologist, 60(1), 16–26. https://doi.org/10.1037/0003-066X.60.1.16

Sternberg, R. J., Grigorenko, E. L., & Kidd, K. K. (2005). Intelligence, race, and genetics. American Psychologist, 60(1), 46–59. https://doi.org/10.1037/0003-066X.60.1.46

Suzuki, L. A., Onoue, M. A., & Hill, J. S. (2013). Clinical assessment: A multicultural perspective. In K. F. Geisinger, J. F. Carlson, J.-I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 2: Testing and assessment in clinical and counseling psychology (pp. 193–212). American Psychological Association.

Tervalon, M., & Murray-Garcia, J. (1998). Cultural humility versus cultural competence: A critical distinction in defining physician training outcomes in multicultural education. Journal of Health Care for the Poor and Underserved, 9(2), 117–125.

Toomey, R. B., Syvertsen, A. K., & Shramko, M. (2018). Transgender adolescent suicide behavior. Pediatrics, 142(4). https://doi.org/10.1542/peds.2017-4218

Wakschlag, L. S., Tolan, P. H., & Leventhal, B. L. (2010). Research review: “Ain’t misbehavin”: Towards a developmentally-specified nosology for preschool disruptive behavior. Journal of Child Psychology and Psychiatry, 51(1), 3–22. https://doi.org/10.1111/j.1469-7610.2009.02184.x

Wicherts, J. M., & Dolan, C. V. (2010). Measurement invariance in confirmatory factor analysis: An illustration using IQ test performance of minorities. Educational Measurement: Issues and Practice, 29(3), 39–47. https://doi.org/10.1111/j.1745-3992.2010.00182.x

Youngstrom, E. A., & Van Meter, A. (2016). Empirically supported assessment of children and adolescents. Clinical Psychology: Science and Practice, 23(4), 327–347. https://doi.org/10.1111/cpsp.12172

Yudell, M., Roberts, D., DeSalle, R., & Tishkoff, S. (2016). Taking race out of human genetics. Science, 351(6273), 564–565. https://doi.org/10.1126/science.aac4951

Zuckerman, M. (1990). Some dubious premises in research and theory on racial differences: Scientific, social, and ethical issues. American Psychologist, 45(12), 1297–1303. https://doi.org/10.1037/0003-066X.45.12.1297