I need your help!
I want your feedback to make the book better for you and other readers. If you find typos, errors, or places where the text may be improved, please let me know. The best ways to provide feedback are by GitHub or hypothes.is annotations.
Opening an issue or submitting a pull request on GitHub: https://github.com/isaactpetersen/Principles-Psychological-Assessment
Adding an annotation using hypothes.is. To add an annotation, select some text and then click the annotate symbol on the pop-up menu. To see the annotations of others, click the sidebar symbol in the upper right-hand corner of the page.
References
Achenbach, T. M. (2001). What are norms and why do we need valid ones? Clinical Psychology: Science and Practice, 8(4), 446–450. https://doi.org/10.1093/clipsy.8.4.446
Ackerman, P. L. (2013). Assessment of intellectual functioning in adults. In K. F. Geisinger, J. F. Carlson, J.-I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol 2: Testing and assessment in clinical and counseling psychology (pp. 119–132). American Psychological Association.
Ægisdóttir, S., White, M. J., Spengler, P. M., Maugherman, A. S., Anderson, L. A., Cook, R. S., Nichols, C. N., Lampropoulos, G. K., Walker, B. S., Cohen, G., & Rush, J. D. (2006). The meta-analysis of clinical judgment project: Fifty-six years of accumulated research on clinical versus statistical prediction. The Counseling Psychologist, 34(3), 341–382. https://doi.org/10.1177/0011000005285875
Aguinis, H., Culpepper, S. A., & Pierce, C. A. (2010). Revival of test bias research in preemployment testing. Journal of Applied Psychology, 95(4), 648–680. https://doi.org/10.1037/a0018714
Ahuvia, I. L., Schleider, J. L., Kneeland, E. T., Moser, J. S., & Schroder, H. S. (2024). Depression self-labeling in U.S. college students: Associations with perceived control and coping strategies. Journal of Affective Disorders, 351, 202–210. https://doi.org/10.1016/j.jad.2024.01.229
Allaire, J., Xie, Y., McPherson, J., Luraschi, J., Ushey, K., Atkins, A., Wickham, H., Cheng, J., Chang, W., & Iannone, R. (2022). rmarkdown: Dynamic documents for R. https://CRAN.R-project.org/package=rmarkdown
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. American Educational Research Association.
American Psychological Association. (2017). Ethical principles of psychologists and code of conduct.
American Psychological Association. (2020). Publication manual of the American Psychological Association (7th ed.).
American Psychological Association Office of Ethnic Minority Affairs. (1993). Guidelines for providers of psychological services to ethnic, linguistic, and culturally diverse populations. American Psychologist, 48(1), 45–48. https://doi.org/10.1037/0003-066X.48.1.45
Antony, M. M., & Rowa, K. (2005). Evidence-based assessment of anxiety disorders in adults. Psychological Assessment, 17(3), 256–266. https://doi.org/10.1037/1040-3590.17.3.256
Arnett, A., Pennington, B., Willcutt, E., Dmitrieva, J., Byrne, B., Samuelsson, S., & Olson, R. (2012). A cross-lagged model of the development of ADHD inattention symptoms and rapid naming speed. Journal of Abnormal Child Psychology, 40(8), 1313–1326. https://doi.org/10.1007/s10802-012-9644-5
Arvey, R. D., Bouchard, T. J., Carroll, J. B., Cattell, R. B., Cohen, D. B., Dawis, R. V., Detterman, D. K., Dunnette, M., Eysenck, H., Feldman, J. M., Fleishman, E. A., Gilmore, G. C., Gordon, R. A., Gottfredson, L. S., Greene, R. L., Haier, R. J., Hardin, G., Hogan, R., Horn, J. M., … Willerman, L. (1994). Mainstream science on intelligence. Wall Street Journal, 13(1), 18–25.
Atanasov, P., Witkowski, J., Ungar, L., Mellers, B., & Tetlock, P. (2020). Small steps to accuracy: Incremental belief updaters are better forecasters. Organizational Behavior and Human Decision Processes, 160, 19–35. https://doi.org/10.1016/j.obhdp.2020.02.001
Austin, P. C., & Steyerberg, E. W. (2014). Graphical assessment of internal and external calibration of logistic regression models by using loess smoothers. Statistics in Medicine, 33(3), 517–535. https://doi.org/10.1002/sim.5941
Avugos, S., Köppen, J., Czienskowski, U., Raab, M., & Bar-Eli, M. (2013). The “hot hand” reconsidered: A meta-analytic approach. Psychology of Sport and Exercise, 14(1), 21–27. https://doi.org/10.1016/j.psychsport.2012.07.005
Baird, C., & Wagner, D. (2000). The relative validity of actuarial- and consensus-based risk assessment systems. Children and Youth Services Review, 22(11), 839–871. https://doi.org/10.1016/S0190-7409(00)00122-5
Bakeman, R., & Goodman, S. H. (2020). Interobserver reliability in clinical research: Current issues and discussion of how to establish best practices. Journal of Abnormal Psychology, 129(1), 5–13. https://doi.org/10.1037/abn0000487
Baker, F. B., & Kim, S.-H. (2017). The basics of item response theory using R. Springer.
Ballesteros-Pérez, P., González-Cruz, M. C., & Mora-Melià, D. (2018). Explaining the Bayes’ theorem graphically. Proceedings of the International Technology, Education and Development Conference.
Baltes, P. B. (1968). Longitudinal and cross-sectional sequences in the study of age and generation effects. Human Development, 11(3), 145–171. http://www.jstor.org/stable/26761719
Bandalos, D. L. (2018). Measurement theory and applications for the social sciences. Guilford Publications.
Bar-Eli, M., Avugos, S., & Raab, M. (2006). Twenty years of “hot hand” research: Review and critique. Psychology of Sport and Exercise, 7(6), 525–553. https://doi.org/10.1016/j.psychsport.2006.03.001
Baron-Cohen, S. (2002). The extreme male brain theory of autism. Trends in Cognitive Sciences, 6(6), 248–254. https://doi.org/10.1016/S1364-6613(02)01904-6
Baron-Cohen, S. (2010). Empathizing, systemizing, and the extreme male brain theory of autism. In I. Savic (Ed.), Progress in brain research (Vol. 186, pp. 167–175). Elsevier.
Barrash, J., Stillman, A., Anderson, S. W., Uc, E. Y., Dawson, J. D., & Rizzo, M. (2010). Prediction of driving ability with neuropsychological tests: Demographic adjustments diminish accuracy. Journal of the International Neuropsychological Society, 16(4), 679–686. https://doi.org/10.1017/S1355617710000470
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2022). lme4: Linear mixed-effects models using Eigen and S4. https://github.com/lme4/lme4/
Bauer, D. J., Belzak, W. C. M., & Cole, V. T. (2020). Simplifying the assessment of measurement invariance over multiple background variables: Using regularized moderated nonlinear factor analysis to detect differential item functioning. Structural Equation Modeling: A Multidisciplinary Journal, 27(1), 43–55. https://doi.org/10.1080/10705511.2019.1642754
Beaujean, A. A. (2014). Latent variable modeling using R: A step-by-step guide. Routledge.
Beltz, A. M., Wright, A. G. C., Sprague, B. N., & Molenaar, P. C. M. (2016). Bridging the nomothetic and idiographic approaches to the analysis of clinical data. Assessment, 23(4), 447–458. https://doi.org/10.1177/1073191116648209
Belzak, W. C. M., & Bauer, D. J. (2020). Improving the assessment of measurement invariance: Using regularization to select anchor items and identify differential item functioning. Psychological Methods, 25(6), 673–690. https://doi.org/10.1037/met0000253
Benjamin, L. T. (2005). A history of clinical psychology as a profession in America (and a glimpse of its future). Annual Review of Clinical Psychology, 1, 1–30. https://doi.org/10.1146/annurev.clinpsy.1.102803.143758
Bennett, C. M., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon: An argument for multiple comparisons correction. NeuroImage, 47, S125. https://doi.org/10.1016/S1053-8119(09)71202-9
Bennett, C. M., Miller, M. B., & Wolford, G. L. (2010). Neural correlates of interspecies perspective taking in the post-mortem Atlantic Salmon: An argument for multiple comparisons correction. Journal of Serendipitous and Unexpected Results, 1, 1–5. https://teenspecies.github.io/pdfs/NeuralCorrelates.pdf
Benning, S. D., Bachrach, R. L., Smith, E. A., Freeman, A. J., & Wright, A. G. C. (2019). The registration continuum in clinical science: A guide toward transparent practices. Journal of Abnormal Psychology, 128(6), 528–540. https://doi.org/10.1037/abn0000451
Bensch, D., Maaß, U., Greiff, S., Horstmann, K. T., & Ziegler, M. (2019). The nature of faking: A homogeneous and predictable construct? Psychological Assessment, 31(4), 532–544. https://doi.org/10.1037/pas0000619
Berry, D., & Willoughby, M. T. (2017). On the practical interpretability of cross-lagged panel models: Rethinking a developmental workhorse. Child Development, 88(4), 1186–1206. https://doi.org/10.1111/cdev.12660
Bersoff, D. N., DeMatteo, D., & Foster, E. E. (2012). Assessment and testing. In S. J. Knapp (Ed.), APA handbook of ethics in psychology, Vol 2: Practice, teaching, and research (pp. 45–74). American Psychological Association.
Bickel, J. E., & Kim, S. D. (2008). Verification of The Weather Channel probability of precipitation forecasts. Monthly Weather Review, 136(12), 4867–4881. https://doi.org/10.1175/2008MWR2547.1
Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet, 327(8476), 307–310. https://doi.org/10.1016/S0140-6736(86)90837-8
Bland, J. M., & Altman, D. G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research, 8(2), 135–160. https://doi.org/10.1177/096228029900800204
Blashfield, R. K., Keeley, J. W., Flanagan, E. H., & Miles, S. R. (2014). The cycle of classification: DSM-I through DSM-5. Annual Review of Clinical Psychology, 10(1), 25–51. https://doi.org/10.1146/annurev-clinpsy-032813-153639
Blumberg, M. S. (2013). Homology, correspondence, and continuity across development: The case of sleep. Developmental Psychobiology, 55(1), 92–100. https://doi.org/10.1002/dev.21024
Bocskocsky, A., Ezekowitz, J., & Stein, C. (2014). The hot hand: A new approach to an old “fallacy.” MIT Sloan Sports Analytics Conference.
Bolger, F., & Önkal-Atay, D. (2004). The effects of feedback on judgmental interval predictions. International Journal of Forecasting, 20(1), 29–39. https://doi.org/10.1016/S0169-2070(03)00009-8
Bollen, K. A. (1989). Structural equations with latent variables. John Wiley & Sons.
Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53(1), 605–634. https://doi.org/10.1146/annurev.psych.53.100901.135239
Bollen, K. A., & Bauldry, S. (2011). Three Cs in measurement models: Causal indicators, composite indicators, and covariates. Psychological Methods, 16(3), 265–284. https://doi.org/10.1037/a0024448
Bollen, K. A., & Diamantopoulos, A. (2017). In defense of causal-formative indicators: A minority report. Psychological Methods, 22(3), 581–596. https://doi.org/10.1037/met0000056
Bollen, K. A., & Lennox, R. D. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305–314. https://doi.org/10.1037/0033-2909.110.2.305
Boring, E. G. (1923). Intelligence as the tests test it. New Republic, 36, 35–37.
Bornstein, R. F. (2011). Toward a process-focused model of test score validity: Improving psychological assessment in science and practice. Psychological Assessment, 23(2), 532–544. https://doi.org/10.1037/a0022402
Borsboom, D. (2003). Conceptual issues in psychological measurement. Universiteit van Amsterdam.
Box, G. E. P. (1979). Robustness in the strategy of scientific model building. In R. L. Launer & G. N. Wilkinson (Eds.), Robustness in statistics. Academic Press.
Brennan, R. L. (1992). Generalizability theory. Educational Measurement: Issues and Practice, 11(4), 27–34. https://doi.org/10.1111/j.1745-3992.1992.tb00260.x
Brennan, R. L. (2001). Generalizability theory. Springer New York. https://books.google.com/books?id=nbHbBwAAQBAJ
Brickman, A. M., Cabo, R., & Manly, J. J. (2006). Ethical issues in cross-cultural neuropsychology. Applied Neuropsychology, 13(2), 91–100. https://doi.org/10.1207/s15324826an1302_4
Brown, R. T., Reynolds, C. R., & Whitaker, J. S. (1999). Bias in mental testing since bias in mental testing. School Psychology Quarterly, 14(3), 208–238. https://doi.org/10.1037/h0089007
Buchanan, T. (2002). Online assessment: Desirable or dangerous? Professional Psychology: Research and Practice, 33(2), 148–154. https://doi.org/10.1037/0735-7028.33.2.148
Burchett, D., & Ben-Porath, Y. S. (2019). Methodological considerations for developing and evaluating response bias indicators. Psychological Assessment, 31(12), 1497–1511. https://doi.org/10.1037/pas0000680
Burisch, M. (1984). Approaches to personality inventory construction: A comparison of merits. American Psychologist, 39, 214–227. https://doi.org/10.1037/0003-066X.39.3.214
Bürkner, P.-C. (2021). Bayesian item response modeling in R with brms and Stan. Journal of Statistical Software, 100(5), 1–54. https://doi.org/10.18637/jss.v100.i05
Burlew, A. K., Peteet, B. J., McCuistian, C., & Miller-Roenigk, B. D. (2019). Best practices for researching diverse groups. American Journal of Orthopsychiatry, 89(3), 354–368. https://doi.org/10.1037/ort0000350
Buros Center for Testing. (2021). The twenty-first mental measurements yearbook. Buros Center for Testing.
Busemeyer, J. R., & Stout, J. C. (2002). A contribution of cognitive decision models to clinical assessment: Decomposing performance on the Bechara gambling task. Psychological Assessment, 14(3), 253–262. https://doi.org/10.1037/1040-3590.14.3.253
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafo, M. R. (2013a). Confidence and precision increase with high statistical power. Nature Reviews Neuroscience, 14(8), 585. https://doi.org/10.1038/nrn3475-c4
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafo, M. R. (2013b). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. https://doi.org/10.1038/nrn3475
Byrd, D. A., Rivera Mindt, M. M., Clark, U. S., Clarke, Y., Thames, A. D., Gammada, E. Z., & Manly, J. J. (2021). Creating an antiracist psychology by addressing professional complicity in psychological assessment. Psychological Assessment, 33(3), 279–285. https://doi.org/10.1037/pas0000993
Calamia, M. (2019). Practical considerations for evaluating reliability in ambulatory assessment studies. Psychological Assessment, 31(3), 285–291. https://doi.org/10.1037/pas0000599
Camilli, G. (2013). Ongoing issues in test fairness. Educational Research and Evaluation, 19(2–3), 104–120. https://doi.org/10.1080/13803611.2013.767602
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56(2), 81–105. https://doi.org/10.1037/h0046016
Campbell, L., Vasquez, M., Behnke, S., & Kinscherff, R. (2010). APA ethics code commentary and case illustrations. American Psychological Association.
Carlson, S. M., & Zelazo, P. D. (2014). Minnesota Executive Function Scale: Test manual. Reflection Sciences, LLC.
Carpenter, R. W., Wycoff, A. M., & Trull, T. J. (2016). Ambulatory assessment: New adventures in characterizing dynamic processes. Assessment, 23(4), 414–424. https://doi.org/10.1177/1073191116632341
Cashel, M. L. (2002). Child and adolescent psychological assessment: Current clinical practices and the impact of managed care. Professional Psychology: Research and Practice, 33(5), 446–453. https://doi.org/10.1037/0735-7028.33.5.446
Caspi, A., Houts, R. M., Ambler, A., Danese, A., Elliott, M. L., Hariri, A., Harrington, H., Hogan, S., Poulton, R., Ramrakha, S., Rasmussen, L. J. H., Reuben, A., Richmond-Rakerd, L., Sugden, K., Wertz, J., Williams, B. S., & Moffitt, T. E. (2020). Longitudinal assessment of mental health disorders and comorbidities across 4 decades among participants in the Dunedin Birth Cohort Study. JAMA Network Open, 3(4), e203221. https://doi.org/10.1001/jamanetworkopen.2020.3221
Caspi, A., Houts, R. M., Belsky, D. W., Goldman-Mellor, S. J., Harrington, H., Israel, S., Meier, M. H., Ramrakha, S., Shalev, I., Poulton, R., & Moffitt, T. E. (2014). The p factor: One general psychopathology factor in the structure of psychiatric disorders? Clinical Psychological Science, 2(2), 119–137. https://doi.org/10.1177/2167702613497473
Caspi, A., & Shiner, R. L. (2006). Personality development. In N. Eisenberg, W. Damon, & R. M. Lerner (Eds.), Handbook of child psychology (6th ed., Vol. 3, pp. 300–365). John Wiley & Sons, Inc.
Chalmers, P. (2020). mirt: Multidimensional item response theory. https://CRAN.R-project.org/package=mirt
Chalmers, P. (2021). mirtCAT: Computerized adaptive testing with multidimensional item response theory. https://CRAN.R-project.org/package=mirtCAT
Chandler, J., Sisso, I., & Shapiro, D. (2020). Participant carelessness and fraud: Consequences for clinical research and potential solutions. Journal of Abnormal Psychology, 129(1), 49–55. https://doi.org/10.1037/abn0000479
Charba, J. P., & Klein, W. H. (1980). Skill in precipitation forecasting in the National Weather Service. Bulletin of the American Meteorological Society, 61(12), 1546–1555. https://doi.org/10.1175/1520-0477(1980)061<1546:SIPFIT>2.0.CO;2
Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834
Chen, F. F. (2008). What happens if we compare chopsticks with forks? The impact of making inappropriate comparisons in cross-cultural research. Journal of Personality and Social Psychology, 95(5), 1005–1018. https://doi.org/10.1037/a0013193
Chen, F. R., & Jaffee, S. R. (2015). The heterogeneity in the development of homotypic and heterotypic antisocial behavior. Journal of Developmental and Life-Course Criminology, 1(3), 269–288. https://doi.org/10.1007/s40865-015-0012-3
Chen, Y., Prudêncio, R. B. C., Diethe, T., & Flach, P. (2019). β³-IRT: A new item response model and its applications. arXiv:1903.04016. https://arxiv.org/abs/1903.04016
Cheng, Y., Shao, C., & Lathrop, Q. N. (2016). The mediated MIMIC model for understanding the underlying mechanism of DIF. Educational and Psychological Measurement, 76(1), 43–63. https://doi.org/10.1177/0013164415576187
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 9(2), 233–255. https://doi.org/10.1207/s15328007sem0902_5
Childs, D. Z., Hindle, B. J., & Warren, P. H. (2021). APS 240: Data analysis and statistics with R. https://dzchilds.github.io/stats-for-bio/
Choca, J. P., & Rossini, E. D. (2018). Assessment using the Rorschach inkblot test. American Psychological Association.
Cicchetti, D., & Rogosch, F. A. (2002). A developmental psychopathology perspective on adolescence. Journal of Consulting and Clinical Psychology, 70(1), 6–20. https://doi.org/10.1037/0022-006X.70.1.6
Civelek, M. E. (2018). Essentials of structural equation modeling. Zea E-Books.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319. https://doi.org/10.1037/1040-3590.7.3.309
Clark, L. A., & Watson, D. (2019). Constructing validity: New developments in creating objective measuring instruments. Psychological Assessment, 31(12), 1412–1427. https://doi.org/10.1037/pas0000626
Clark, M. J., & Grandy, J. (1984). Sex differences in the academic performance of Scholastic Aptitude Test takers: College Board report no. 84-8. College Board Publications.
Clark, S. J., & Desharnais, R. A. (1998). Honest answers to embarrassing questions: Detecting cheating in the randomized response model. Psychological Methods, 3(2), 160–168. https://doi.org/10.1037/1082-989X.3.2.160
Cohen, Z. D., & DeRubeis, R. J. (2018). Treatment selection in depression. Annual Review of Clinical Psychology, 14(1), 209–236. https://doi.org/10.1146/annurev-clinpsy-050817-084746
Cole, N. S. (1981). Bias in testing. American Psychologist, 36(10), 1067–1077. https://doi.org/10.1037/0003-066X.36.10.1067
Cole, V., Gottfredson, N., & Giordano, M. (2018). aMNLFA: Automated fitting of moderated nonlinear factor analysis through the Mplus program. https://CRAN.R-project.org/package=aMNLFA
Committee on the General Aptitude Test Battery, Commission on Behavioral and Social Sciences and Education, & National Research Council. (1989). Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. National Academies Press.
Conradt, E., Crowell, S. E., & Cicchetti, D. (2021). Using development and psychopathology principles to inform the research domain criteria (RDoC) framework. Development and Psychopathology, 33(5), 1521–1525. https://doi.org/10.1017/S0954579421000985
Cooper, L. D., & Balsis, S. (2009). When less is more: How fewer diagnostic criteria can indicate greater severity. Psychological Assessment, 21(3), 285–293. https://doi.org/10.1037/a0016698
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98–104. https://doi.org/10.1037/0021-9010.78.1.98
Costa Jr., P. T., McCrae, R. R., & Löckenhoff, C. E. (2019). Personality across the life span. Annual Review of Psychology, 70(1), 423–448. https://doi.org/10.1146/annurev-psych-010418-103244
Counsell, A., Cribbie, R. A., & Flora, D. B. (2020). Evaluating equivalence testing methods for measurement invariance. Multivariate Behavioral Research, 55(2), 312–328. https://doi.org/10.1080/00273171.2019.1633617
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. https://doi.org/10.1037/h0040957
Curran, P. J., Howard, A. L., Bainter, S. A., Lane, S. T., & McGinley, J. S. (2014). The separation of between-person and within-person components of individual change over time: A latent curve model with structured residuals. Journal of Consulting and Clinical Psychology, 82(5), 879–894. https://doi.org/10.1037/a0035297
Dana, J., & Thomas, R. (2006). In defense of clinical judgment … and mechanical prediction. Journal of Behavioral Decision Making, 19(5), 413–428. https://doi.org/10.1002/bdm.537
Dana, R. H. (1998). Multicultural assessment of personality and psychopathology in the United States: Still art, not yet science, and controversial. European Journal of Psychological Assessment, 14(1), 62–70. https://doi.org/10.1027/1015-5759.14.1.62
Daugherty, J. C., Puente, A. E., Fasfous, A. F., Hidalgo-Ruzzante, N., & Pérez-Garcia, M. (2017). Diagnostic mistakes of culturally diverse individuals when using North American neuropsychological tests. Applied Neuropsychology: Adult, 24(1), 16–22. https://doi.org/10.1080/23279095.2015.1036992
Davison, G. C., Vogel, R. S., & Coffman, S. G. (1997). Think-aloud approaches to cognitive assessment and the articulated thoughts in simulated situations paradigm. Journal of Consulting and Clinical Psychology, 65(6), 950–958. https://doi.org/10.1037/0022-006X.65.6.950
Dawes, R. M. (1986). Representative thinking in clinical judgment. Clinical Psychology Review, 6, 425–441. https://doi.org/10.1016/0272-7358(86)90030-9
Dawes, R. M., Faust, D., & Meehl, P. E. (1989). Clinical versus actuarial judgment. Science, 243(4899), 1668–1674. https://doi.org/10.1126/science.2648573
DeRubeis, R. J., Cohen, Z. D., Forand, N. R., Fournier, J. C., Gelfand, L. A., & Lorenzo-Luaces, L. (2014). The personalized advantage index: Translating research on prediction into individualized treatment recommendations. A demonstration. PLoS ONE, 9(1), e83875. https://doi.org/10.1371/journal.pone.0083875
Diamantopoulos, A., Riefler, P., & Roth, K. P. (2008). Advancing formative measurement models. Journal of Business Research, 61(12), 1203–1218. https://doi.org/10.1016/j.jbusres.2008.01.009
Dien, J. (2012). Applying principal components analysis to event-related potentials: A tutorial. Developmental Neuropsychology, 37(6), 497–517. https://doi.org/10.1080/87565641.2012.697503
Digitale, J. C., Martin, J. N., & Glymour, M. M. (2022). Tutorial on directed acyclic graphs. Journal of Clinical Epidemiology, 142, 264–267. https://doi.org/10.1016/j.jclinepi.2021.08.001
Dinno, A. (2014). Gently clarifying the application of Horn’s parallel analysis to principal component analysis versus factor analysis. http://archives.pdx.edu/ds/psu/10527
Dombrowski, S. C., McGill, R. J., & Morgan, G. B. (2021). Monte Carlo modeling of contemporary intelligence test (IQ) factor structure: Implications for IQ assessment, interpretation, and theory. Assessment, 28(3), 977–993. https://doi.org/10.1177/1073191119869828
Dorans, N. J. (2017). Contributions to the quantitative assessment of item, test, and score fairness. In R. E. Bennett & M. von Davier (Eds.), Advancing human assessment (pp. 201–230). Springer, Cham.
Dubois, J., & Adolphs, R. (2016). Building a science of individual differences from fMRI. Trends in Cognitive Sciences, 20(6), 425–443. https://doi.org/10.1016/j.tics.2016.03.014
Dueber, D. (2019). dmacs: Measurement nonequivalence effect size calculator. https://github.com/ddueber/dmacs
Duncan, G. J., Engel, M., Claessens, A., & Dowsett, C. J. (2014). Replication and robustness in developmental research. Developmental Psychology, 50(11), 2417–2425. https://doi.org/10.1037/a0037996
Dunkley, D. M., Segal, Z. V., & Blankstein, K. R. (2019). Cognitive assessment: Issues and methods. In K. S. Dobson & D. J. A. Dozois (Eds.), Handbook of cognitive-behavioral therapies (4th ed., pp. 85–119). Guilford Press.
Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105(3), 399–412. https://doi.org/10.1111/bjop.12046
Dunning, D., Heath, C., & Suls, J. M. (2004). Flawed self-assessment: Implications for health, education, and the workplace. Psychological Science in the Public Interest, 5, 69–106. https://doi.org/10.1111/j.1529-1006.2004.00018.x
Durbin, C. E., Wilson, S., & MacDonald, A. W., III. (2022). Integrating development into the research domain criteria (RDoC) framework: Introduction to the special section. Journal of Psychopathology and Clinical Science, 131(6), 535–541. https://doi.org/10.1037/abn0000767
Dwyer, D. B., Falkai, P., & Koutsouleris, N. (2018). Machine learning approaches for clinical psychology and psychiatry. Annual Review of Clinical Psychology, 14(1), 91–118. https://doi.org/10.1146/annurev-clinpsy-032816-045037
Eaton, W. W. (1980). The sociology of mental disorders. Praeger.
Eddy, D. M. (1982). Probabilistic reasoning in clinical medicine: Problems and opportunities. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 249–267). Cambridge University Press.
Edwards, J. R. (2011). The fallacy of formative measurement. Organizational Research Methods, 14(2), 370–388. https://doi.org/10.1177/1094428110378369
Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5(2), 155–174. https://doi.org/10.1037/1082-989X.5.2.155
Edwards, L. M., Burkard, A. W., Adams, H. A., & Newcomb, S. A. (2017). A mixed-method study of psychologists’ use of multicultural assessment. Professional Psychology: Research and Practice, 48(2), 131–138. https://doi.org/10.1037/pro0000095
Ellard, K. K., Fairholme, C. P., Boisseau, C. L., Farchione, T. J., & Barlow, D. H. (2010). Unified protocol for the transdiagnostic treatment of emotional disorders: Protocol development and initial outcome data. Cognitive and Behavioral Practice, 17(1), 88–101. https://doi.org/10.1016/j.cbpra.2009.06.002
Embretson, S. E. (1996). The new rules of measurement. Psychological Assessment, 8, 341–349. https://doi.org/10.1037/1040-3590.8.4.341
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists (Vol. 4). Lawrence Erlbaum Associates.
Epskamp, S. (2022). semPlot: Path diagrams and visual analysis of various SEM packages’ output. https://github.com/SachaEpskamp/semPlot
Evans, S. C., & Shaughnessy, S. (in press). Emotion regulation as central to psychopathology across childhood and adolescence: A commentary on Nobakht et al. (2023). Journal of Child Psychology and Psychiatry. https://doi.org/10.1111/jcpp.13910
Executive Board of the American Anthropological Association. (1998). AAA statement on race. American Anthropologist, 100(3), 712–713. https://doi.org/10.1525/aa.1998.100.3.712
Exner, J. E. (1974). The Rorschach: A comprehensive system. John Wiley & Sons.
Exner, J. E., & Erdberg, S. P. (2005). The Rorschach, a comprehensive system: Advanced interpretation (3rd ed., Vol. 2). John Wiley & Sons, Inc.
Fadus, M. C., Ginsburg, K. R., Sobowale, K., Halliday-Boykins, C. A., Bryant, B. E., Gray, K. M., & Squeglia, L. M. (2020). Unconscious bias and the diagnosis of disruptive behavior disorders and ADHD in African American and Hispanic youth. Academic Psychiatry, 44(1), 95–102. https://doi.org/10.1007/s40596-019-01127-6
Falotico, R., & Quatto, P. (2010). On avoiding paradoxes in assessing inter-rater agreement. Italian Journal of Applied Statistics, 22, 151–160.
Faraone, S. V., & Tsuang, M. T. (1994). Measuring diagnostic accuracy in the absence of a “gold standard.” American Journal of Psychiatry, 151, 650–657. https://doi.org/10.1176/ajp.151.5.650
Farrington, D. P., & Loeber, R. (1989). Relative improvement over chance (RIOC) and phi as measures of predictive efficiency and strength of association in 2×2 tables. Journal of Quantitative Criminology, 5(3), 201–213. https://doi.org/10.1007/BF01062737
Farris, C., Treat, T. A., Viken, R. J., & McFall, R. M. (2008). Perceptual mechanisms that characterize gender differences in decoding women’s sexual intent. Psychological Science, 19(4), 348–354. https://doi.org/10.1111/j.1467-9280.2008.02092.x
Farris, C., Viken, R. J., Treat, T. A., & McFall, R. M. (2006). Heterosocial perceptual organization: Application of the choice model to sexual coercion. Psychological Science, 17(10), 869–875. https://doi.org/10.1111/j.1467-9280.2006.01796.x
Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160. https://doi.org/10.3758/brm.41.4.1149
Fernández, A. L., & Abe, J. (2018). Bias in cross-cultural neuropsychological testing: Problems and possible solutions. Culture and Brain, 6(1), 1–35. https://doi.org/10.1007/s40167-017-0050-2
Fiske, D. W., & Campbell, D. T. (1992). Citations do not solve problems. Psychological Bulletin, 112(3), 393–395. https://doi.org/10.1037/0033-2909.112.3.393
Fleck, M. S., Samei, E., & Mitroff, S. R. (2010). Generalized “satisfaction of search”: Adverse influences on dual-target search accuracy. Journal of Experimental Psychology: Applied, 16(1), 60–71. https://doi.org/10.1037/a0018629
Fletcher, R. R., Nakeshimana, A., & Olubeko, O. (2021). Addressing fairness, bias, and appropriate use of artificial intelligence and machine learning in global health. Frontiers in Artificial Intelligence, 3(116). https://doi.org/10.3389/frai.2020.561802
Flora, D. B. (2020). Your coefficient alpha is probably wrong, but which coefficient omega is right? A tutorial on using R to obtain better reliability estimates. Advances in Methods and Practices in Psychological Science, 3(4), 484–501. https://doi.org/10.1177/2515245920951747
Floyd, F. J., & Widaman, K. F. (1995). Factor analysis in the development and refinement of clinical assessment instruments. Psychological Assessment, 7(3), 286–299. https://doi.org/10.1037/1040-3590.7.3.286
Fok, C. C. T., & Henry, D. (2015). Increasing the sensitivity of measures to change. Prevention Science, 16(7), 978–986. https://doi.org/10.1007/s11121-015-0545-z
Fontaine, N. M. G., & Petersen, I. T. (2017). Developmental trajectories of psychopathology: An overview of approaches and applications. In L. Centifanti & D. Williams (Eds.), The Wiley handbook of developmental psychopathology (pp. 5–28). Wiley-Blackwell.
Forbey, J. D., & Ben-Porath, Y. S. (2007). Computerized adaptive personality testing: A review and illustration with the MMPI-2 computerized adaptive version. Psychological Assessment, 19(1), 14–24. https://doi.org/10.1037/1040-3590.19.1.14
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50. https://doi.org/10.2307/3151312
Fox, J., Weisberg, S., & Price, B. (2022). car: Companion to applied regression. https://CRAN.R-project.org/package=car
Frank, L. K. (1939). Projective methods for the study of personality. Journal of Psychology, 8, 389–413. https://doi.org/10.1080/00223980.1939.9917671
Frazier, T. W., Georgiades, S., Bishop, S. L., & Hardan, A. Y. (2014). Behavioral and cognitive characteristics of females and males with autism in the Simons Simplex Collection. Journal of the American Academy of Child & Adolescent Psychiatry, 53(3), 329–340.e3. https://doi.org/10.1016/j.jaac.2013.12.004
Freese, J., & Peterson, D. (2017). Replication in social science. Annual Review of Sociology.
Freud, S. (1911). Psycho-analytic notes on an autobiographical account of a case of paranoia (dementia paranoides). In J. Strachey (Ed.), The standard edition of the complete psychological works of Sigmund Freud: The case of Schreber, papers on technique and other works, Vol. 12 (1911–1913) (pp. 1–82).
Fried, E. I. (2022). Studying mental health problems as systems, not syndromes. Current Directions in Psychological Science, 31(6), 500–508. https://doi.org/10.1177/09637214221114089
Furr, R. M. (2017). Psychometrics: An introduction. SAGE Publications.
Furr, R. M., & Heuckeroth, S. (2019). The “quantifying construct validity” procedure: Its role, value, interpretations, and computation. Assessment, 26(4), 555–566. https://doi.org/10.1177/1073191118820638
Galatzer-Levy, I. R., & Bryant, R. A. (2013). 636,120 ways to have posttraumatic stress disorder. Perspectives on Psychological Science, 8(6), 651–662. https://doi.org/10.1177/1745691613504115
Galatzer-Levy, I. R., & Onnela, J.-P. (2023). Machine learning and the digital measurement of psychological health. Annual Review of Clinical Psychology, 19, 133–154. https://doi.org/10.1146/annurev-clinpsy-080921-073212
Gambrill, E. (2014). The diagnostic and statistical manual of mental disorders as a major form of dehumanization in the modern world. Research on Social Work Practice, 24(1), 13–36. https://doi.org/10.1177/1049731513499411
Gandrud, C. (2020). Reproducible research with R and RStudio (3rd ed.). CRC Press. https://www.routledge.com/Reproducible-Research-with-R-and-RStudio/Gandrud/p/book/9780367143985
Garb, H. N. (1997). Race bias, social class bias, and gender bias in clinical judgment. Clinical Psychology: Science and Practice, 4(2), 99–120. https://doi.org/10.1111/j.1468-2850.1997.tb00104.x
Garb, H. N. (2005). Clinical judgment and decision making. Annual Review of Clinical Psychology, 1, 67–89. https://doi.org/10.1146/annurev.clinpsy.1.102803.143810
Garb, H. N. (2007). Computer-administered interviews and rating scales. Psychological Assessment, 19(1), 4–13. https://doi.org/10.1037/1040-3590.19.1.4
Garb, H. N., & Wood, J. M. (2019). Methodological advances in statistical prediction. Psychological Assessment, 31(12), 1456–1466. https://doi.org/10.1037/pas0000673
Garb, H. N., Wood, J. M., Lilienfeld, S. O., & Nezworski, M. T. (2005). Roots of the Rorschach controversy. Clinical Psychology Review, 25(1), 97–118. https://doi.org/10.1016/j.cpr.2004.09.002
Garber, J., & Weersing, V. R. (2010). Comorbidity of anxiety and depression in youth: Implications for treatment and prevention. Clinical Psychology: Science and Practice, 17(4), 293–306. https://doi.org/10.1111/j.1468-2850.2010.01221.x
Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19(1), 72–91. https://doi.org/10.1037/a0032138
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University.
Gibbons, R. D., Weiss, D. J., Frank, E., & Kupfer, D. (2016). Computerized adaptive diagnosis and testing of mental health disorders. Annual Review of Clinical Psychology, 12(1), 83–104. https://doi.org/10.1146/annurev-clinpsy-021815-093634
Gilovich, T., Vallone, R., & Tversky, A. (1985). The hot hand in basketball: On the misperception of random sequences. Cognitive Psychology, 17(3), 295–314. https://doi.org/10.1016/0010-0285(85)90010-6
Gipps, C., & Stobart, G. (2009). Fairness in assessment. In C. Wyatt-Smith & J. J. Cumming (Eds.), Educational assessment in the 21st century: Connecting theory and practice (pp. 105–118). Springer Netherlands. https://doi.org/10.1007/978-1-4020-9964-9_6
Girard, J. M., & Cohn, J. F. (2016). A primer on observational measurement. Assessment, 23(4), 404–413. https://doi.org/10.1177/1073191116635807
Gneiting, T., & Walz, E.-M. (2021). Receiver operating characteristic (ROC) movies, universal ROC (UROC) curves, and coefficient of predictive ability (CPA). Machine Learning. https://doi.org/10.1007/s10994-021-06114-3
Gonzalez, O., & Pelham, W. E. (2021). When does differential item functioning matter for screening? A method for empirical evaluation. Assessment, 28(2), 446–456. https://doi.org/10.1177/1073191120913618
Goodwin, L. D., & Leech, N. L. (2006). Understanding correlation: Factors that affect the size of r. The Journal of Experimental Education, 74(3), 249–266. https://doi.org/10.3200/JEXE.74.3.249-266
Gottfredson, L. S. (1994). The science and politics of race-norming. American Psychologist, 49(11), 955–963. https://doi.org/10.1037/0003-066X.49.11.955
Gottfredson, L. S. (1997). Mainstream science on intelligence: An editorial with 52 signatories, history, and bibliography. Intelligence, 24(1), 13–23.
Gottfredson, N. C., Cole, V. T., Giordano, M. L., Bauer, D. J., Hussong, A. M., & Ennett, S. T. (2019). Simplifying the implementation of modern scale scoring methods with an automated R package: Automated moderated nonlinear factor analysis (aMNLFA). Addictive Behaviors, 94, 65–73. https://doi.org/10.1016/j.addbeh.2018.10.031
Graham, J. M. (2006). Congeneric and (essentially) tau-equivalent estimates of score reliability: What they are and how to use them. Educational and Psychological Measurement, 66(6), 930–944. https://doi.org/10.1177/0013164406288165
Graham, J. R., Veltri, C. O. C., & Lee, T. T. C. (2022). MMPI instruments: Assessing personality and psychopathology (6th ed.). Oxford University Press.
Graham, J., Olchowski, A., & Gilreath, T. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8(3), 206–213. https://doi.org/10.1007/s11121-007-0070-9
Granziol, U., Brancaccio, A., Pizziconi, G., Spangaro, M., Gentili, F., Bosia, M., Gregori, E., Luperini, C., Pavan, C., Santarelli, V., Cavallaro, R., Cremonese, C., Favaro, A., Rossi, A., Vidotto, G., & Spoto, A. (2022). On the implementation of computerized adaptive observations for psychological assessment. Assessment, 29(2), 225–241. https://doi.org/10.1177/1073191120960215
Green, S. B., & Yang, Y. (2015). Evaluation of dimensionality in the assessment of internal consistency reliability: Coefficient alpha and omega coefficients. Educational Measurement: Issues and Practice, 34(4), 14–20. https://doi.org/10.1111/emip.12100
Greenberg, D. M., Warrier, V., Allison, C., & Baron-Cohen, S. (2018). Testing the empathizing–systemizing theory of sex differences and the extreme male brain theory of autism in half a million people. Proceedings of the National Academy of Sciences, 115(48), 12152–12157. https://doi.org/10.1073/pnas.1811032115
Grove, W. M., & Meehl, P. E. (1996). Comparative efficiency of informal (subjective, impressionistic) and formal (mechanical, algorithmic) prediction procedures: The clinical–statistical controversy. Psychology, Public Policy, and Law, 2(2), 293–323. https://doi.org/10.1037/1076-8971.2.2.293
Grove, W. M., Zald, D. H., Lebow, B. S., Snitz, B. E., & Nelson, C. (2000). Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment, 12(1), 19–30. https://doi.org/10.1037/1040-3590.12.1.19
Gunn, H. J., Grimm, K. J., & Edwards, M. C. (2020). Evaluation of six effect size measures of measurement non-invariance for continuous outcomes. Structural Equation Modeling: A Multidisciplinary Journal, 27(4), 503–514. https://doi.org/10.1080/10705511.2019.1689507
Gwet, K. L. (2008). Computing inter-rater reliability and its variance in the presence of high agreement. British Journal of Mathematical and Statistical Psychology, 61(1), 29–48. https://doi.org/10.1348/000711006X126600
Gwet, K. L. (2021a). Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters, Vol. 1: Analysis of categorical ratings (5th ed.). AgreeStat Analytics.
Gwet, K. L. (2021b). Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters, Vol. 2: Analysis of quantitative ratings (5th ed.). AgreeStat Analytics.
Hagquist, C. (2019). Explaining differential item functioning focusing on the crucial role of external information – an example from the measurement of adolescent mental health. BMC Medical Research Methodology, 19(1), 185. https://doi.org/10.1186/s12874-019-0828-3
Hagquist, C., & Andrich, D. (2017). Recent advances in analysis of differential item functioning in health research using the Rasch model. Health and Quality of Life Outcomes, 15(1), 181. https://doi.org/10.1186/s12955-017-0755-0
Hall, G. C. N., Bansal, A., & Lopez, I. R. (1999). Ethnicity and psychopathology: A meta-analytic review of 31 years of comparative MMPI/MMPI-2 research. Psychological Assessment, 11(2), 186–197. https://doi.org/10.1037/1040-3590.11.2.186
Hamaker, E. L., Kuiper, R. M., & Grasman, R. P. P. P. (2015). A critique of the cross-lagged panel model. Psychological Methods, 20(1), 102–116. https://doi.org/10.1037/a0038889
Han, K., Colarelli, S. M., & Weed, N. C. (2019). Methodological and statistical advances in the consideration of cultural diversity in assessment: A critical review of group classification and measurement invariance testing. Psychological Assessment, 31(12), 1481–1496. https://doi.org/10.1037/pas0000731
Hancock, G. R., & French, B. F. (2013). Power analysis in structural equation modeling. In Structural equation modeling: A second course (2nd ed., pp. 117–159). IAP Information Age Publishing.
Hardin, A. M., Chang, J. C.-J., Fuller, M. A., & Torkzadeh, G. (2011). Formative measurement and academic research: In search of measurement theory. Educational and Psychological Measurement, 71(2), 281–305. https://doi.org/10.1177/0013164410370208
Harrell, F. (2015). Regression modeling strategies: With applications to linear models, logistic and ordinal regression, and survival analysis. Springer.
Harrell, Jr., F. E. (2021). rms: Regression modeling strategies. https://CRAN.R-project.org/package=rms
Hayes, A. F., & Coutts, J. J. (2020). Use omega rather than Cronbach’s alpha for estimating reliability. But…. Communication Methods and Measures, 14(1), 1–24. https://doi.org/10.1080/19312458.2020.1718629
Hayes, S. C., Nelson, R. O., & Jarrett, R. B. (1987). The treatment utility of assessment: A functional approach to evaluating assessment quality. American Psychologist, 42(11), 963–974. https://doi.org/10.1037/0003-066X.42.11.963
Haynes, S. N. (2001). Clinical applications of analogue behavioral observation: Dimensions of psychometric evaluation. Psychological Assessment, 13(1), 73–85. https://doi.org/10.1037/1040-3590.13.1.73
Haynes, S. N., & Yoshioka, D. T. (2007). Clinical assessment applications of ambulatory biosensors. Psychological Assessment, 19(1), 44–57. https://doi.org/10.1037/1040-3590.19.1.44
Hays, P. A. (2016). Addressing cultural complexities in practice: Assessment, diagnosis, and therapy. American Psychological Association.
Hedge, C., Powell, G., & Sumner, P. (2018). The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods, 50(3), 1166–1186. https://doi.org/10.3758/s13428-017-0935-1
Helms, J. E. (2006). Fairness is not validity or cultural bias in racial-group assessment: A quantitative perspective. American Psychologist, 61(8), 845–859. https://doi.org/10.1037/0003-066X.61.8.845
Helms, J. E., Jernigan, M., & Mascher, J. (2005). The meaning of race in psychology and how to change it: A methodological perspective. American Psychologist, 60(1), 27–36. https://doi.org/10.1037/0003-066X.60.1.27
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). Most people are not WEIRD. Nature, 466(7302), 29. https://doi.org/10.1038/466029a
Henseler, J., Ringle, C. M., & Sarstedt, M. (2015). A new criterion for assessing discriminant validity in variance-based structural equation modeling. Journal of the Academy of Marketing Science, 43(1), 115–135. https://doi.org/10.1007/s11747-014-0403-8
Hertzog, C., & Nesselroade, J. R. (2003). Assessing psychological change in adulthood: An overview of methodological issues. Psychology and Aging, 18(4), 639–657. https://doi.org/10.1037/0882-7974.18.4.639
Himmelstein, P. H., Woods, W. C., & Wright, A. G. C. (2019). A comparison of signal- and event-contingent ambulatory assessment of interpersonal behavior and affect in social situations. Psychological Assessment, 31(7), 952–960. https://doi.org/10.1037/pas0000718
Hinshaw, S. P., & Nigg, J. T. (1999). Behavior rating scales in the assessment of disruptive behavior problems in childhood. In D. Shaffer, C. P. Lucas, & J. E. Richters (Eds.), Diagnostic assessment in child and adolescent psychopathology (pp. 91–126). The Guilford Press.
Hoch, S. J. (1985). Counterfactual reasoning and accuracy in predicting personal events. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11(4), 719–731. https://doi.org/10.1037/0278-7393.11.1-4.719
Holmlund, T. B., Foltz, P. W., Cohen, A. S., Johansen, H. D., Sigurdsen, R., Fugelli, P., Bergsager, D., Cheng, J., Bernstein, J., Rosenfeld, E., & Elvevåg, B. (2019). Moving psychological assessment out of the controlled laboratory setting: Practical challenges. Psychological Assessment, 31(3), 292–303. https://doi.org/10.1037/pas0000647
Hough, S. E. (2016). Predicting the unpredictable: The tumultuous science of earthquake prediction. Princeton University Press.
Hove, D. ten, Jorgensen, T. D., & Ark, L. A. van der. (2022). Interrater reliability for multilevel data: A generalizability theory approach. Psychological Methods, 27(4), 650–666. https://doi.org/10.1037/met0000391
Howell, R. D., Breivik, E., & Wilcox, J. B. (2007). Reconsidering formative measurement. Psychological Methods, 12(2), 205–218. https://doi.org/10.1037/1082-989X.12.2.205
Huebner, A., & Lucht, M. (2019). Generalizability theory in R. Practical Assessment, Research & Evaluation, 24(5), 2. https://doi.org/10.7275/5065-gc10
Hunsley, J., Lee, C. M., Wood, J. M., & Taylor, W. (2015). Controversial and questionable assessment techniques. In S. O. Lilienfeld, S. J. Lynn, & J. M. Lohr (Eds.), Science and pseudoscience in clinical psychology (2nd ed., pp. 42–82). The Guilford Press.
Hunsley, J., & Mash, E. J. (2007). Evidence-based assessment. Annual Review of Clinical Psychology, 3, 29–51. https://doi.org/10.1146/annurev.clinpsy.3.022806.091419
Hurlburt, R. T. (1997). Randomly sampling thinking in the natural environment. Journal of Consulting and Clinical Psychology, 65(6), 941–949. https://doi.org/10.1037/0022-006X.65.6.941
Hussong, A. M., Bauer, D. J., Giordano, M. L., & Curran, P. J. (2020). Harmonizing altered measures in integrative data analysis: A methods analogue study. Behavior Research Methods. https://doi.org/10.3758/s13428-020-01472-7
Hussong, A. M., Curran, P. J., & Bauer, D. J. (2013). Integrative data analysis in clinical psychology research. Annual Review of Clinical Psychology, 9(1), 61–89. https://doi.org/10.1146/annurev-clinpsy-050212-185522
Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice (2nd ed.). OTexts.
Jensen, A. R. (1980). Précis of bias in mental testing. Behavioral and Brain Sciences, 3(3), 325–333. https://doi.org/10.1017/S0140525X00005161
Jiang, Z. (2018). Using the linear mixed-effect model framework to estimate generalizability variance components in R. Methodology, 14(3), 133–142. https://doi.org/10.1027/1614-2241/a000149
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. https://doi.org/10.1177/0956797611430953
Johnson, J. E. V., & Bruce, A. C. (2001). Calibration of subjective probability judgments in a naturalistic setting. Organizational Behavior and Human Decision Processes, 85(2), 265–290. https://doi.org/10.1006/obhd.2000.2949
Johnson, P. E. (2022). rockchalk: Regression estimation and presentation. https://CRAN.R-project.org/package=rockchalk
Jonson, J. L., & Geisinger, K. F. (2022). Fairness in educational and psychological testing: Examining theoretical, research, practice, and policy implications of the 2014 standards. American Educational Research Association.
Jorgensen, T. D., Kite, B. A., Chen, P.-Y., & Short, S. D. (2018). Permutation randomization methods for testing measurement equivalence and detecting differential item functioning in multiple-group confirmatory factor analysis. Psychological Methods, 23(4), 708–728. https://doi.org/10.1037/met0000152
Jorgensen, T. D., Pornprasertmanit, S., Schoemann, A. M., & Rosseel, Y. (2021). semTools: Useful tools for structural equation modeling. https://github.com/simsem/semTools/wiki
Kagan, J. (1969). The three faces of continuity in human development. In D. A. Goslin (Ed.), Handbook of socialization theory and research (pp. 983–1002). Rand McNally.
Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.
Kazdin, A. E. (1995). Preparing and evaluating research reports. Psychological Assessment, 7(3), 228–237. https://doi.org/10.1037/1040-3590.7.3.228
Kelley, K., & Pornprasertmanit, S. (2016). Confidence intervals for population reliability coefficients: Evaluation of methods, recommendations, and software for composite measures. Psychological Methods, 21(1), 69–92. https://doi.org/10.1037/a0040086
Keren, G. (1987). Facing uncertainty in the game of bridge: A calibration study. Organizational Behavior and Human Decision Processes, 39(1), 98–114. https://doi.org/10.1016/0749-5978(87)90047-1
Kessler, R. C., Bossarte, R. M., Luedtke, A., Zaslavsky, A. M., & Zubizarreta, J. R. (2020). Suicide prediction models: A critical review of recent research with recommendations for the way forward. Molecular Psychiatry, 25(1), 168–179. https://doi.org/10.1038/s41380-019-0531-0
Kievit, R. A., Brandmaier, A. M., Ziegler, G., Harmelen, A.-L. van, Mooij, S. M. M. de, Moutoussis, M., Goodyer, I., Bullmore, E., Jones, P. B., Fonagy, P., Lindenberger, U., & Dolan, R. J. (2018). Developmental cognitive neuroscience using latent change score models: A tutorial and applications. Developmental Cognitive Neuroscience, 33, 99–117. https://doi.org/10.1016/j.dcn.2017.11.007
Kievit, R., Frankenhuis, W., Waldorp, L., & Borsboom, D. (2013). Simpson’s paradox in psychological science: A practical guide. Frontiers in Psychology, 4(513). https://doi.org/10.3389/fpsyg.2013.00513
Klein, D. F., & Cleary, T. A. (1969). Platonic true scores: Further comment. Psychological Bulletin, 71(4), 278–280. https://doi.org/10.1037/h0026852
Kline, R. B. (2023). Principles and practice of structural equation modeling (5th ed.). Guilford Publications.
Kline, R. B. (2024). How to evaluate local fit (residuals) in large structural equation models. International Journal of Psychology, 59(6), 1293–1306. https://doi.org/10.1002/ijop.13252
Koehler, D. J., Brenner, L., & Griffin, D. (2002). The calibration of expert judgment: Heuristics and biases beyond the laboratory. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment. Cambridge University Press.
Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 107–118. https://doi.org/10.1037/0278-7393.6.2.107
Korotitsch, W. J., & Nelson-Gray, R. O. (1999). An overview of self-monitoring research in assessment and treatment. Psychological Assessment, 11(4), 415–425. https://doi.org/10.1037/1040-3590.11.4.415
Kotov, R., Krueger, R. F., Watson, D., Achenbach, T. M., Althoff, R. R., Bagby, R. M., Brown, T. A., Carpenter, W. T., Caspi, A., Clark, L. A., Eaton, N. R., Forbes, M. K., Forbush, K. T., Goldberg, D., Hasin, D., Hyman, S. E., Ivanova, M. Y., Lynam, D. R., Markon, K., … Zimmerman, M. (2017). The hierarchical taxonomy of psychopathology (HiTOP): A dimensional alternative to traditional nosologies. Journal of Abnormal Psychology, 126(4), 454–477. https://doi.org/10.1037/abn0000258
Kotov, R., Krueger, R. F., Watson, D., Cicero, D. C., Conway, C. C., DeYoung, C. G., Eaton, N. R., Forbes, M. K., Hallquist, M. N., Latzman, R. D., Mullins-Sweatt, S. N., Ruggero, C. J., Simms, L. J., Waldman, I. D., Waszczuk, M. A., & Wright, A. G. C. (2021). The hierarchical taxonomy of psychopathology (HiTOP): A quantitative nosology based on consensus of evidence. Annual Review of Clinical Psychology, 17(1), 83–108. https://doi.org/10.1146/annurev-clinpsy-081219-093304
Kozak, M. J., & Cuthbert, B. N. (2016). The NIMH research domain criteria initiative: Background, issues, and pragmatics. Psychophysiology, 53(3), 286–297. https://doi.org/10.1111/psyp.12518
Kriegman, L. S., & Kriegman, G. (1965). The PaTE report: A new psychodynamic and therapeutic evaluative procedure. The Psychiatric Quarterly, 39(1), 646–674. https://doi.org/10.1007/BF01569493
Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537–567. https://doi.org/10.1146/annurev.psych.50.1.537
Krueger, R. F., Nichol, P. E., Hicks, B. M., Markon, K. E., Patrick, C. J., Iacono, W. G., & McGue, M. (2004). Using latent trait modeling to conceptualize an alcohol problems continuum. Psychological Assessment, 16(2), 107–119. https://doi.org/10.1037/1040-3590.16.2.107
Kuhn, M. (2022). caret: Classification and regression training. https://github.com/topepo/caret/
Kuncel, N. R., & Hezlett, S. A. (2010). Fact and fiction in cognitive ability testing for admissions and hiring decisions. Current Directions in Psychological Science, 19(6), 339–345. https://doi.org/10.1177/0963721410389459
Kundu, S., Aulchenko, Y. S., & Janssens, A. C. J. W. (2020). PredictABEL: Assessment of risk prediction models. https://CRAN.R-project.org/package=PredictABEL
Lai, M. H. C. (2021). Adjusting for measurement noninvariance with alignment in growth modeling. Multivariate Behavioral Research, 1–18. https://doi.org/10.1080/00273171.2021.1941730
Larson, M. J., & Carbine, K. A. (2017). Sample size calculations in human electrophysiology (EEG and ERP) studies: A systematic review and recommendations for increased rigor. International Journal of Psychophysiology, 111, 33–41. https://doi.org/10.1016/j.ijpsycho.2016.06.015
Lee, K., Bull, R., & Ho, R. M. H. (2013). Developmental changes in executive functioning. Child Development, 84(6), 1933–1953. https://doi.org/10.1111/cdev.12096
Lee Meeuw Kjoe, P. R., Agelink van Rentergem, J. A., Vermeulen, I. E., & Schagen, S. B. (2021). How to correct for computer experience in online cognitive testing? Assessment, 28(5), 1247–1255. https://doi.org/10.1177/1073191120911098
Lek, K. M., & Van De Schoot, R. (2018). A comparison of the single, conditional and person-specific standard error of measurement: What do they measure and when to use them? Frontiers in Applied Mathematics and Statistics, 4(40). https://doi.org/10.3389/fams.2018.00040
Lele, S. R., Keim, J. L., & Solymos, P. (2019). ResourceSelection: Resource selection (probability) functions for use-availability data. https://github.com/psolymos/ResourceSelection
Leong, F. T. L., & Kalibatseva, Z. (2016). Threats to cultural validity in clinical diagnosis and assessment: Illustrated with the case of Asian Americans. In N. Zane, G. Bernal, & F. T. L. Leong (Eds.), Evidence-based psychological practice with ethnic minorities: Culturally informed research and clinical strategies (pp. 57–74). American Psychological Association.
Lewis-Fernández, R., Aggarwal, N. K., Bäärnhielm, S., Rohlof, H., Kirmayer, L. J., Weiss, M. G., Jadhav, S., Hinton, L., Alarcón, R. D., Bhugra, D., Groen, S., Dijk, R. van, Qureshi, A., Collazos, F., Rousseau, C., Caballero, L., Ramos, M., & Lu, F. (2014). Culture and psychiatric evaluation: Operationalizing cultural formulation for DSM-5. Psychiatry: Interpersonal and Biological Processes, 77(2), 130–154. https://doi.org/10.1521/psyc.2014.77.2.130
Lilienfeld, S. O. (2007). Psychological treatments that cause harm. Perspectives on Psychological Science, 2(1), 53–70. https://doi.org/10.1111/j.1745-6916.2007.00029.x
Lilienfeld, S. O. (2017). Psychology’s replication crisis and the grant culture: Righting the ship. Perspectives on Psychological Science, 12(4), 660–664. https://doi.org/10.1177/1745691616687745
Lilienfeld, S. O., Sauvigne, K., Lynn, S. J., Latzman, R. D., Cautin, R., & Waldman, I. D. (2015). Fifty psychological and psychiatric terms to avoid: A list of inaccurate, misleading, misused, ambiguous, and logically confused words and phrases. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.01100
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The scientific status of projective techniques. Psychological Science in the Public Interest, 1, 27–66. https://doi.org/10.1111/1529-1006.002
Lindhiem, O., Petersen, I. T., Mentch, L. K., & Youngstrom, E. A. (2020). The importance of calibration in clinical psychology. Assessment, 27(4), 840–854. https://doi.org/10.1177/1073191117752055
Lindhiem, O., Yu, L., Grasso, D. J., Kolko, D. J., & Youngstrom, E. A. (2015). Adapting the posterior probability of diagnosis index to enhance evidence-based screening: An application to ADHD in primary care. Assessment, 22(2), 198–207. https://doi.org/10.1177/1073191114540748
Lindzey, G. (1952). Thematic apperception test: Interpretive assumptions and related empirical evidence. Psychological Bulletin, 49, 1–25. https://doi.org/10.1037/h0062363
Little, T. D. (2013). Longitudinal structural equation modeling. The Guilford Press.
Little, T. D., Preacher, K. J., Selig, J. P., & Card, N. A. (2007). New developments in latent variable panel analyses of longitudinal data. International Journal of Behavioral Development, 31(4), 357–365. https://doi.org/10.1177/0165025407077757
Little, T. D., Slegers, D. W., & Card, N. A. (2006). A non-arbitrary method of identifying and scaling latent variables in SEM and MACS models. Structural Equation Modeling, 13(1), 59–72. https://doi.org/10.1207/s15328007sem1301_3
Liu, Y., Millsap, R. E., West, S. G., Tein, J.-Y., Tanaka, R., & Grimm, K. J. (2017). Testing measurement invariance in longitudinal data with ordered-categorical measures. Psychological Methods, 22(3), 486–506. https://doi.org/10.1037/met0000075
Lobbestael, J., Leurgans, M., & Arntz, A. (2011). Inter-rater reliability of the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID I) and Axis II Disorders (SCID II). Clinical Psychology & Psychotherapy, 18(1), 75–79. https://doi.org/10.1002/cpp.693
Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(3), 635–694. https://doi.org/10.2466/pr0.1957.3.3.635
Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584–585. https://doi.org/10.1126/science.aal3618
Lubke, G. H., McArtor, D. B., Boomsma, D. I., & Bartels, M. (2018). Genetic and environmental contributions to the development of childhood aggression. Developmental Psychology, 54(1), 39–50. https://doi.org/10.1037/dev0000403
Lüdecke, D., Ben-Shachar, M. S., Patil, I., Waggoner, P., & Makowski, D. (2021). performance: An R package for assessment, comparison and testing of statistical models. Journal of Open Source Software, 6(60), 3139. https://doi.org/10.21105/joss.03139
Lupien, S. J., Sasseville, M., François, N., Giguère, C. E., Boissonneault, J., Plusquellec, P., Godbout, R., Xiong, L., Potvin, S., Kouassi, E., & Lesage, A. (2017). The DSM5/RDoC debate on the future of mental health research: Implication for studies on human stress and presentation of the signature bank. Stress, 20(1), 2–18. https://doi.org/10.1080/10253890.2017.1286324
Lutz, W., Schwartz, B., & Delgadillo, J. (2022). Measurement-based and data-informed psychological therapy. Annual Review of Clinical Psychology, 18(1), 71–98. https://doi.org/10.1146/annurev-clinpsy-071720-014821
Lysell, H., Dahlin, M., Viktorin, A., Ljungberg, E., D’Onofrio, B. M., Dickman, P., & Runeson, B. (2018). Maternal suicide – register based study of all suicides occurring after delivery in Sweden 1974–2009. PLOS ONE, 13(1), e0190133. https://doi.org/10.1371/journal.pone.0190133
MacCallum, R. C., & Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51(1), 201–226. https://doi.org/10.1146/annurev.psych.51.1.201
Magis, D. (2013). A note on the item information function of the four-parameter logistic model. Applied Psychological Measurement, 37(4), 304–315. https://doi.org/10.1177/0146621613475471
Makridakis, S., Hogarth, R. M., & Gaba, A. (2009). Forecasting and uncertainty in the economic and business world. International Journal of Forecasting, 25(4), 794–812. https://doi.org/10.1016/j.ijforecast.2009.05.012
Manly, J. J. (2005). Advantages and disadvantages of separate norms for African Americans. The Clinical Neuropsychologist, 19(2), 270–275. https://doi.org/10.1080/13854040590945346
Manly, J. J., & Echemendia, R. J. (2007). Race-specific norms: Using the model of hypertension to understand issues of race, culture, and education in neuropsychology. Archives of Clinical Neuropsychology, 22(3), 319–325. https://doi.org/10.1016/j.acn.2007.01.006
Markon, K. E. (2019). Bifactor and hierarchical models: Specification, inference, and interpretation. Annual Review of Clinical Psychology, 15(1), 51–69. https://doi.org/10.1146/annurev-clinpsy-050718-095522
Markon, K. E., Chmielewski, M., & Miller, C. J. (2011). The reliability and validity of discrete and continuous measures of psychopathology: A quantitative review. Psychological Bulletin, 137(5), 856–879. https://doi.org/10.1037/a0023678
Markus, K. A. (2018). Three conceptual impediments to developing scale theory for formative scales. Methodology, 14(4), 156–164. https://doi.org/10.1027/1614-2241/a000154
Marsh, H. W., Morin, A. J. S., Parker, P. D., & Kaur, G. (2014). Exploratory structural equation modeling: An integration of the best features of exploratory and confirmatory factor analysis. Annual Review of Clinical Psychology, 10(1), 85–110. https://doi.org/10.1146/annurev-clinpsy-032813-153700
Masche, J. G., & van Dulmen, M. H. M. (2004). Advances in disentangling age, cohort, and time effects: No quadrature of the circle, but a help. Developmental Review, 24(3), 322–342. https://doi.org/10.1016/j.dr.2004.04.002
Matthews, M., Abdullah, S., Murnane, E., Voida, S., Choudhury, T., Gay, G., & Frank, E. (2016). Development and evaluation of a smartphone-based measure of social rhythms for bipolar disorder. Assessment, 23(4), 472–483. https://doi.org/10.1177/1073191116656794
McArdle, J. J., & Grimm, K. J. (2011). An empirical example of change analysis by linking longitudinal item response data from multiple tests. In A. A. von Davier (Ed.), Statistical models for test equating, scaling, and linking (pp. 71–88). Springer Science & Business Media.
McArdle, J. J., Grimm, K. J., Hamagami, F., Bowles, R. P., & Meredith, W. (2009). Modeling life-span growth curves of cognition using longitudinal data with multiple samples and changing scales of measurement. Psychological Methods, 14(2), 126–149. https://doi.org/10.1037/a0015857
McClelland, D. C. (1973). Testing for competence rather than for “intelligence.” American Psychologist, 28, 1–14. https://doi.org/10.1037/h0034092
McClelland, D. C. (1994). The knowledge-testing-educational complex strikes back. American Psychologist, 49(1), 66–69. https://doi.org/10.1037/0003-066X.49.1.66
McFall, R. M. (1991). Manifesto for a science of clinical psychology. The Clinical Psychologist, 44(6), 75–91.
McFall, R. M. (2000). Elaborate reflections on a simple manifesto. Applied & Preventive Psychology, 9(1), 5–21. https://doi.org/10.1016/s0962-1849(05)80035-6
McNally, R. J. (2021). Network analysis of psychopathology: Controversies and challenges. Annual Review of Clinical Psychology, 17(1), 31–53. https://doi.org/10.1146/annurev-clinpsy-081219-092850
McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412–433. https://doi.org/10.1037/met0000144
McNeish, D., & Wolf, M. G. (2023). Dynamic fit index cutoffs for confirmatory factor analysis models. Psychological Methods, 28(1), 61–88. https://doi.org/10.1037/met0000425
McNiel, D. E., & Binder, R. L. (1995). Correlates of accuracy in the assessment of psychiatric inpatients’ risk of violence. American Journal of Psychiatry, 152(6), 901–906. https://doi.org/10.1176/ajp.152.6.901
Meade, A. W. (2010). A taxonomy of effect size measures for the differential functioning of items and scales. Journal of Applied Psychology, 95(4), 728–743. https://doi.org/10.1037/a0018966
Meehl, P. E. (1957). When shall we use our heads instead of the formula? Journal of Counseling Psychology, 4(4), 268–273. https://doi.org/10.1037/h0047554
Meehl, P. E. (1978). Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology. Journal of Consulting and Clinical Psychology, 46(4), 806–834. https://doi.org/10.1037/0022-006x.46.4.806
Meehl, P. E. (1986). Causes and effects of my disturbing little book. Journal of Personality Assessment, 50(3), 370–375. https://doi.org/10.1207/s15327752jpa5003_6
Meehl, P. E., & Rosen, A. (1955). Antecedent probability and the efficiency of psychometric signs, patterns, or cutting scores. Psychological Bulletin, 52(3), 194–216. https://doi.org/10.1037/h0048070
Melikyan, Z. A., Agranovich, A. V., & Puente, A. E. (2019). Fairness in psychological testing. In G. Goldstein, D. N. Allen, & J. DeLuca (Eds.), Handbook of psychological assessment (4th ed., pp. 551–572). Academic Press. https://doi.org/10.1016/B978-0-12-802203-0.00018-3
Metz, C. E., Goodenough, D. J., & Rossmann, K. (1973). Evaluation of receiver operating characteristic curve data in terms of information theory, with applications in radiography. Radiology, 109(2), 297–303. https://doi.org/10.1148/109.2.297
Meyer, G. J., Erard, R. E., Erdberg, P., Mihura, J. L., & Viglione, D. J. (2011). Rorschach Performance Assessment System: Administration, coding, interpretation, and technical manual. Rorschach Performance Assessment Systems LLC.
Miller, G. A., Elbert, T., Sutton, B. P., & Heller, W. (2007). Innovative clinical assessment technologies: Challenges and opportunities in neuroimaging. Psychological Assessment, 19(1), 58–73. https://doi.org/10.1037/1040-3590.19.1.58
Miller, G. A., Rockstroh, B. S., Hamilton, H. K., & Yee, C. M. (2016). Psychophysiology as a core strategy in RDoC. Psychophysiology, 53(3), 410–414. https://doi.org/10.1111/psyp.12581
Miller, J. B., & Sanjurjo, A. (2014). A cold shower for the hot hand fallacy. Innocenzo Gasparini Institute for Economic Research. https://repec.unibocconi.it/igier/igi/wp/2014/518.pdf
Miller, J. L., Vaillancourt, T., & Boyle, M. H. (2009). Examining the heterotypic continuity of aggression using teacher reports: Results from a national Canadian study. Social Development, 18(1), 164–180. https://doi.org/10.1111/j.1467-9507.2008.00480.x
Millsap, R. E. (2011). Statistical approaches to measurement invariance. Taylor & Francis.
Moeller, J. (2015). A word on standardization in longitudinal studies: don’t. Frontiers in Psychology, 6(1389), 1–4. https://doi.org/10.3389/fpsyg.2015.01389
Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100(4), 674–701. https://doi.org/10.1037/0033-295X.100.4.674
Moffitt, T. E. (2006a). A review of research on the taxonomy of life-course persistent versus adolescence-limited antisocial behavior. Taking Stock: The Status of Criminological Theory, 15, 277–312.
Moffitt, T. E. (2006b). Life-course-persistent versus adolescence-limited antisocial behavior. In D. Cicchetti & D. J. Cohen (Eds.), Developmental psychopathology, Vol. 3: Risk, disorder, and adaptation (2nd ed., pp. 570–598). John Wiley & Sons Inc.
Morgan, C. D., & Murray, H. A. (1935). A method for investigating fantasies: The thematic apperception test. Archives of Neurology & Psychiatry, 34(2), 289–306. https://doi.org/10.1001/archneurpsyc.1935.02250200049005
Morley, S. K., Brito, T. V., & Welling, D. T. (2018). Measures of model performance based on the log accuracy ratio. Space Weather, 16(1), 69–88. https://doi.org/10.1002/2017SW001669
Mullins-Sweatt, S. N., & Widiger, T. A. (2009). Clinical utility and DSM-V. Psychological Assessment, 21(3), 302–312. https://doi.org/10.1037/a0016607
Murphy, A. H., & Winkler, R. L. (1984). Probability forecasting in meteorology. Journal of the American Statistical Association, 79(387), 489–500. https://doi.org/10.2307/2288395
Muthén, L. K., & Muthén, B. O. (2002). How to use a Monte Carlo study to decide on sample size and determine power. Structural Equation Modeling: A Multidisciplinary Journal, 9(4), 599–620. https://doi.org/10.1207/s15328007sem0904_8
Muthén, L. K., & Muthén, B. O. (2019). Mplus version 8.4. Muthén & Muthén.
Myers, K., & Winters, N. C. (2002). Ten-year review of rating scales. I: Overview of scale functioning, psychometric properties, and selection. Journal of the American Academy of Child & Adolescent Psychiatry, 41(2), 114–122. https://doi.org/10.1097/00004583-200202000-00004
Nagy, T. F. (2011). Essential ethics for psychologists: A primer for understanding and mastering core issues. American Psychological Association.
Nelson-Gray, R. O. (2003). Treatment utility of psychological assessment. Psychological Assessment, 15(4), 521–531. https://doi.org/10.1037/1040-3590.15.4.521
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), 231–259. https://doi.org/10.1037/0033-295x.84.3.231
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). McGraw-Hill.
Nye, C. D., Bradburn, J., Olenick, J., Bialko, C., & Drasgow, F. (2019). How big are my effects? Examining the magnitude of effect sizes in studies of measurement equivalence. Organizational Research Methods, 22(3), 678–709. https://doi.org/10.1177/1094428118761122
Oberski, D. L. (2014). Evaluating sensitivity of parameters of interest to measurement invariance in latent variable models. Political Analysis, 22(1), 45–60. https://doi.org/10.1093/pan/mpt014
Oberski, D. L., Vermunt, J. K., & Moors, G. B. D. (2015). Evaluating measurement invariance in categorical data latent variable models with the EPC-interest. Political Analysis, 23(4), 550–563. https://doi.org/10.1093/pan/mpv020
Okazaki, S., & Sue, S. (1995). Methodological issues in assessment research with ethnic minorities. Psychological Assessment, 7(3), 367–375. https://doi.org/10.1037/1040-3590.7.3.367
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251). https://doi.org/10.1126/science.aac4716
Orth, U., Clark, D. A., Donnellan, M. B., & Robins, R. W. (2021). Testing prospective effects in longitudinal research: Comparing seven competing cross-lagged models. Journal of Personality and Social Psychology, 120(4), 1013–1034. https://doi.org/10.1037/pspp0000358
Oskamp, S. (1965). Overconfidence in case-study judgments. Journal of Consulting Psychology, 29(3), 261–265. https://doi.org/10.1037/h0022125
Park, D. C., & Bischof, G. N. (2013). The aging mind: Neuroplasticity in response to cognitive training. Dialogues in Clinical Neuroscience, 15(1), 109–119. https://doi.org/10.31887/DCNS.2013.15.1/dpark
Patrick, C. J., Iacono, W. G., & Venables, N. C. (2019). Incorporating neurophysiological measures into clinical assessments: Fundamental challenges and a strategy for addressing them. Psychological Assessment, 31(7), 952–960. https://doi.org/10.1037/pas0000713
Patterson, G. R. (1993). Orderly change in a stable world: The antisocial trait as a chimera. Journal of Consulting and Clinical Psychology, 61(6), 911–919. https://doi.org/10.1037/0022-006X.61.6.911
Paulus, J. K., & Kent, D. M. (2020). Predictably unequal: Understanding and addressing concerns that algorithmic clinical prediction may increase health disparities. Npj Digital Medicine, 3(1), 99. https://doi.org/10.1038/s41746-020-0304-9
Pearl, J. (2013). Linear models: A useful “microscope” for causal analysis. Journal of Causal Inference, 1(1), 155–170. https://doi.org/10.1515/jci-2013-0003
Peters, G.-J. (2014). The alpha and the omega of scale reliability and validity: Why and how to abandon Cronbach’s alpha and the route towards more comprehensive assessment of scale quality. European Health Psychologist, 16(2), 56–69.
Petersen, I. T. (2024a). Assessing externalizing behaviors in school-aged children: Implications for school and community providers. https://scsmh.education.uiowa.edu/practice-brief/assessing-externalizing-behaviors-in-school-aged-children-implications-for-school-and-community-providers/
Petersen, I. T. (2024b). petersenlab: A collection of R functions by the Petersen Lab. https://doi.org/10.32614/CRAN.package.petersenlab
Petersen, I. T. (in press). Reexamining developmental continuity and discontinuity in the 21st century: Better aligning behaviors, functions, and mechanisms. Developmental Psychology. https://doi.org/10.1037/dev0001657
Petersen, I. T., Apfelbaum, K. S., & McMurray, B. (2024). Adapting open science and pre-registration to longitudinal research. Infant and Child Development, 33(1), e2315. https://doi.org/10.1002/icd.2315
Petersen, I. T., Bates, J. E., D’Onofrio, B. M., Coyne, C. A., Lansford, J. E., Dodge, K. A., Pettit, G. S., & Van Hulle, C. A. (2013). Language ability predicts the development of behavior problems in children. Journal of Abnormal Psychology, 122(2), 542–557. https://doi.org/10.1037/a0031963
Petersen, I. T., Bates, J. E., Dodge, K. A., Lansford, J. E., & Pettit, G. S. (2015). Describing and predicting developmental profiles of externalizing problems from childhood to adulthood. Development and Psychopathology, 27(3), 791–818. https://doi.org/10.1017/S0954579414000789
Petersen, I. T., Bates, J. E., McQuillan, M. E., Hoyniak, C. P., Staples, A. D., Rudasill, K. M., Molfese, D. L., & Molfese, V. J. (2021). Heterotypic continuity of inhibitory control in early childhood: Evidence from four widely used measures. Developmental Psychology, 57(11), 1755–1771. https://doi.org/10.1037/dev0001025
Petersen, I. T., Choe, D. E., & LeBeau, B. (2020). Studying a moving target in development: The challenge and opportunity of heterotypic continuity. Developmental Review, 58, 100935. https://doi.org/10.1016/j.dr.2020.100935
Petersen, I. T., Hoyniak, C. P., McQuillan, M. E., Bates, J. E., & Staples, A. D. (2016). Measuring the development of inhibitory control: The challenge of heterotypic continuity. Developmental Review, 40, 25–71. https://doi.org/10.1016/j.dr.2016.02.001
Petersen, I. T., & LeBeau, B. (2021). Language ability in the development of externalizing behavior problems in childhood. Journal of Educational Psychology, 113(1), 68–85. https://doi.org/10.1037/edu0000461
Petersen, I. T., & LeBeau, B. (2022). Creating a developmental scale to chart the development of psychopathology with different informants and measures across time. Journal of Psychopathology and Clinical Science, 131(6), 611–625. https://doi.org/10.1037/abn0000649
Petersen, I. T., LeBeau, B., & Choe, D. E. (2021). Creating a developmental scale to account for heterotypic continuity in development: A simulation study. Child Development, 92(1), e1–e19. https://doi.org/10.1111/cdev.13433
Petersen, I. T., Lindhiem, O., LeBeau, B., Bates, J. E., Pettit, G. S., Lansford, J. E., & Dodge, K. A. (2018). Development of internalizing problems from adolescence to emerging adulthood: Accounting for heterotypic continuity with vertical scaling. Developmental Psychology, 54(3), 586–599. https://doi.org/10.1037/dev0000449
Petscher, Y., Justice, L. M., & Hogan, T. (2018). Modeling the early language trajectory of language development when the measures change and its relation to poor reading comprehension. Child Development, 89(6), 2136–2156. https://doi.org/10.1111/cdev.12880
Piasecki, T. M., Hufford, M. R., Solhan, M., & Trull, T. J. (2007). Assessing clients in their natural environments with electronic diaries: Rationale, benefits, limitations, and barriers. Psychological Assessment, 19(1), 25–43. https://doi.org/10.1037/1040-3590.19.1.25
Podsakoff, P. M., MacKenzie, S. B., & Podsakoff, N. P. (2012). Sources of method bias in social science research and recommendations on how to control it. Annual Review of Psychology, 63(1), 539–569. https://doi.org/10.1146/annurev-psych-120710-100452
Pornprasertmanit, S., Miller, P., Schoemann, A., & Jorgensen, T. D. (2021). simsem: SIMulated structural equation modeling. http://www.simsem.org
Putnam, S. P., Rothbart, M. K., & Gartstein, M. A. (2008). Homotypic and heterotypic continuity of fine-grained temperament during infancy, toddlerhood, and early childhood. Infant & Child Development, 17(4), 387–405. https://doi.org/10.1002/ICD.582
Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71–90. https://doi.org/10.1016/j.dr.2016.06.004
R Core Team. (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
Raiche, G., & Magis, D. (2020). nFactors: Parallel analysis and other non graphical solutions to the Cattell scree test. https://CRAN.R-project.org/package=nFactors
Raugh, I. M., Chapman, H. C., Bartolomeo, L. A., Gonzalez, C., & Strauss, G. P. (2019). A comprehensive review of psychophysiological applications for ecological momentary assessment in psychiatric populations. Psychological Assessment, 31(3), 304–317. https://doi.org/10.1037/pas0000651
Raykov, T. (2001). Bias of coefficient \(\alpha\) for fixed congeneric measures with correlated errors. Applied Psychological Measurement, 25(1), 69–76. https://doi.org/10.1177/01466216010251005
Raykov, T., & Marcoulides, G. A. (2001). Can there be infinitely many models equivalent to a given covariance structure model? Structural Equation Modeling: A Multidisciplinary Journal, 8(1), 142–149. https://doi.org/10.1207/S15328007SEM0801_8
Raykov, T., & Marcoulides, G. A. (2019). Thanks coefficient alpha, we still need you! Educational and Psychological Measurement, 79(1), 200–210. https://doi.org/10.1177/0013164417725127
Raykov, T., Marcoulides, G. A., Harrison, M., & Zhang, M. (2020). On the dependability of a popular procedure for studying measurement invariance: A cause for concern? Structural Equation Modeling: A Multidisciplinary Journal, 27(4), 649–656. https://doi.org/10.1080/10705511.2019.1610409
Reise, S. P., & Waller, N. G. (2009). Item response theory and clinical measurement. Annual Review of Clinical Psychology, 5(1), 27–48. https://doi.org/10.1146/annurev.clinpsy.032408.153553
Revelle, W. (2022). psych: Procedures for psychological, psychometric, and personality research. https://personality-project.org/r/psych/
Revelle, W., & Condon, D. M. (2019). Reliability from \(\alpha\) to \(\omega\): A tutorial. Psychological Assessment, 31(12), 1395–1411. https://doi.org/10.1037/pas0000754
Revelle, W., & Rocklin, T. (1979). Very simple structure: An alternative procedure for estimating the optimal number of interpretable factors. Multivariate Behavioral Research, 14(4), 403–414. https://doi.org/10.1207/s15327906mbr1404_2
Reynolds, C. R., Altmann, R. A., & Allen, D. N. (2021). The problem of bias in psychological assessment. In C. R. Reynolds, R. A. Altmann, & D. N. Allen (Eds.), Mastering modern psychological testing: Theory and methods (pp. 573–613). Springer International Publishing. https://doi.org/10.1007/978-3-030-59455-8_15
Reynolds, C. R., & Suzuki, L. A. (2012). Bias in psychological assessment: An empirical review and recommendations. In I. B. Weiner, J. R. Graham, & J. A. Naglieri (Eds.), Handbook of psychology, Vol. 10: Assessment psychology, Part 1: Assessment issues (2nd ed., pp. 82–113).
Rice, M. E., Harris, G. T., & Lang, C. (2013). Validation of and revision to the VRAG and SORAG: The Violence Risk Appraisal Guide—Revised (VRAG-R). Psychological Assessment, 25(3), 951–965. https://doi.org/10.1037/a0032878
Ridley, C. R., Hill, C. L., & Wiese, D. L. (2001). Ethics in multicultural assessment: A model of reasoned application. In D. L. Wiese (Ed.), Handbook of multicultural assessment: Clinical, psychological, and educational applications (p. 29).
Ridley, C. R., Li, L. C., & Hill, C. L. (1998). Multicultural assessment: Reexamination, reconceptualization, and practical application. The Counseling Psychologist, 26(6), 827–910. https://doi.org/10.1177/0011000098266001
Rigdon, E. E. (2010). Polychoric correlation coefficient. In N. J. Salkind (Ed.), Encyclopedia of research design. SAGE Publications. https://doi.org/10.4135/9781412961288
Rivera Mindt, M., Byrd, D., Saez, P., & Manly, J. (2010). Increasing culturally competent neuropsychological services for ethnic minority populations: A call to action. The Clinical Neuropsychologist, 24(3), 429–453. https://doi.org/10.1080/13854040903058960
Roberts, A. C., Yeap, Y. W., Seah, H. S., Chan, E., Soh, C.-K., & Christopoulos, G. I. (2019). Assessing the suitability of virtual reality for psychological testing. Psychological Assessment, 31(3), 318–328. https://doi.org/10.1037/pas0000663
Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J.-C., & Müller, M. (2021). pROC: Display and analyze ROC curves. http://expasy.org/tools/pROC/
Robitzsch, A. (2019). mnlfa: Moderated nonlinear factor analysis. https://CRAN.R-project.org/package=mnlfa
Rodebaugh, T. L., Scullin, R. B., Langer, J. K., Dixon, D. J., Huppert, J. D., Bernstein, A., Zvielli, A., & Lenze, E. J. (2016). Unreliability as a threat to understanding psychopathology: The cautionary tale of attentional bias. Journal of Abnormal Psychology, 125(6), 840–851. https://doi.org/10.1037/abn0000184
Roemer, E., Schuberth, F., & Henseler, J. (2021). HTMT2–an improved criterion for assessing discriminant validity in structural equation modeling. Industrial Management & Data Systems, 121(12), 2637–2650. https://doi.org/10.1108/IMDS-02-2021-0082
Rönkkö, M., & Cho, E. (2020). An updated guideline for assessing discriminant validity. Organizational Research Methods, 1094428120968614. https://doi.org/10.1177/1094428120968614
Rosseel, Y., Jorgensen, T. D., & Rockwood, N. (2022). lavaan: Latent variable analysis. https://lavaan.ugent.be
Royal, K. (2016). “Face validity” is not a legitimate type of validity evidence! The American Journal of Surgery, 212(5), 1026–1027. https://doi.org/10.1016/j.amjsurg.2016.02.018
Ruiz, M. A., Drake, E. B., Glass, A., Marcotte, D., & van Gorp, W. G. (2002). Trying to beat the system: Misuse of the internet to assist in avoiding the detection of psychological symptom dissimulation. Professional Psychology: Research and Practice, 33(3), 294–299. https://doi.org/10.1037/0735-7028.33.3.294
Ruscio, J., & Roche, B. (2012). Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure. Psychological Assessment, 24(2), 282–292. https://doi.org/10.1037/a0025697
Rush, A. J., First, M. B., & Blacker, D. (2009). Handbook of psychiatric measures. American Psychiatric Publishing.
Rushton, J. P., Brainerd, C. J., & Pressley, M. (1983). Behavioral development and construct validity: The principle of aggregation. Psychological Bulletin, 94(1), 18–38. https://doi.org/10.1037/0033-2909.94.1.18
Russo, J. E., & Schoemaker, P. J. (1992). Managing overconfidence. Sloan Management Review, 33(2), 7.
Sackett, P. R., Borneman, M. J., & Connelly, B. S. (2008). High stakes testing in higher education and employment: Appraising the evidence for validity and fairness. American Psychologist, 63, 215–227. https://doi.org/10.1037/0003-066X.63.4.215
Sackett, P. R., Schmitt, N., Ellingson, J. E., & Kabin, M. B. (2001). High-stakes testing in employment, credentialing, and higher education. American Psychologist, 56, 302–318. https://doi.org/10.1037/0003-066X.56.4.302
Sackett, P. R., & Wilk, S. L. (1994). Within-group norming and other forms of score adjustment in preemployment testing. American Psychologist, 49(11), 929–954. https://doi.org/10.1037/0003-066X.49.11.929
Sarstedt, M., Adler, S. J., Ringle, C. M., Cho, G., Diamantopoulos, A., Hwang, H., & Liengaard, B. D. (2024). Same model, same data, but different outcomes: Evaluating the impact of method choices in structural equation modeling. Journal of Product Innovation Management, 41(6), 1100–1117. https://doi.org/10.1111/jpim.12738
Sattler, J. M., & Hoge, R. D. (2006). Assessment of children: Behavioral, social, and clinical foundations (5th ed.). Jerome M. Sattler, Publisher, Inc.
Schaefer, J. D., Caspi, A., Belsky, D. W., Harrington, H., Houts, R., Horwood, L. J., Hussong, A., Ramrakha, S., Poulton, R., & Moffitt, T. E. (2017). Enduring mental health: Prevalence and prediction. Journal of Abnormal Psychology, 126(2), 212–224. https://doi.org/10.1037/abn0000232
Schaie, K. W. (1965). A general model for the study of developmental problems. Psychological Bulletin, 64(2), 92–107. https://doi.org/10.1037/h0022371
Schaie, K. W. (2005). Developmental influences on adult intelligence: The Seattle longitudinal study. Oxford University Press.
Schaie, K. W., & Baltes, P. B. (1975). On sequential strategies in developmental research. Human Development, 18(5), 384–390. https://doi.org/10.1159/000271498
Schmidt, F. L., & Hunter, J. E. (1981). Employment testing: Old theories and new research findings. American Psychologist, 36(10), 1128–1137. https://doi.org/10.1037/0003-066X.36.10.1128
Schmidt, F. L., & Hunter, J. E. (1996). Measurement error in psychological research: Lessons from 26 research scenarios. Psychological Methods, 1(2), 199–223. https://doi.org/10.1037/1082-989X.1.2.199
Schneider, W. J. (2021). simstandard: Generate standardized data. https://github.com/wjschne/simstandard
Schreiber, J. B., Nora, A., Stage, F. K., Barlow, E. A., & King, J. (2006). Reporting structural equation modeling and confirmatory factor analysis results: A review. Journal of Educational Research, 99(6), 323–337. https://doi.org/10.3200/JOER.99.6.323-338
Schuberth, F. (2023). The Henseler-Ogasawara specification of composites in structural equation modeling: A tutorial. Psychological Methods, 28(4), 843–859. https://doi.org/10.1037/met0000432
Schulenberg, J. E., & Maslowsky, J. (2009). Taking substance use and development seriously: Developmentally distal and proximal influences on adolescence drug use. Monographs of the Society for Research in Child Development, 74(3), 121–130. https://doi.org/10.1111/j.1540-5834.2009.00544.x
Schulenberg, J. E., Patrick, M. E., Maslowsky, J., & Maggs, J. L. (2014). The epidemiology and etiology of adolescent substance use in developmental perspective. In M. Lewis & K. D. Rudolph (Eds.), Handbook of developmental psychopathology (pp. 601–620). Springer US.
Schulenberg, J. E., & Zarrett, N. R. (2006). Mental health during emerging adulthood: Continuity and discontinuity in courses, causes, and functions. In Emerging adults in America: Coming of age in the 21st century (pp. 135–172). American Psychological Association.
Sechrest, L. (1963). Incremental validity: A recommendation. Educational and Psychological Measurement, 23, 153–158. https://doi.org/10.1177/001316446302300113
Sechrest, L., Stickle, T. R., & Stewart, M. (1998). The role of assessment in clinical psychology. In A. Bellack, M. Hersen, & C. R. Reynolds (Eds.), Comprehensive clinical psychology, Vol. 4: Assessment. Pergamon.
Sellbom, M. (2019). The MMPI-2-restructured form (MMPI-2-RF): Assessment of personality and psychopathology in the twenty-first century. Annual Review of Clinical Psychology, 15(1), 149–177. https://doi.org/10.1146/annurev-clinpsy-050718-095701
Sellbom, M., & Tellegen, A. (2019). Factor analysis in psychological assessment research: Common pitfalls and recommendations. Psychological Assessment, 31(12), 1428–1441. https://doi.org/10.1037/pas0000623
Sharp, K. L., Williams, A. J., Rhyner, K. T., & Ilardi, S. S. (2013). The clinical interview. In K. F. Geisinger, J. F. Carlson, J.-I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 2: Testing and assessment in clinical and counseling psychology (pp. 103–117). American Psychological Association.
Shavelson, R. J., Webb, N. M., & Rowley, G. L. (1989). Generalizability theory. American Psychologist, 44, 922–932. https://doi.org/10.1037/0003-066X.44.6.922
Shiffman, S., Stone, A. A., & Hufford, M. R. (2008). Ecological momentary assessment. Annual Review of Clinical Psychology, 4, 1–32. https://doi.org/10.1146/annurev.clinpsy.3.022806.091415
Shrout, P. E., & Rodgers, J. L. (2018). Psychology, science, and knowledge construction: Broadening perspectives from the replication crisis. Annual Review of Psychology, 69(1), 487–510. https://doi.org/10.1146/annurev-psych-122216-011845
Sijtsma, K. (2008). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107. https://doi.org/10.1007/s11336-008-9101-0
Silver, N. (2012). The signal and the noise: Why so many predictions fail–but some don’t. Penguin.
Silverberg, N. D., & Millis, S. R. (2009). Impairment versus deficiency in neuropsychological assessment: Implications for ecological validity. Journal of the International Neuropsychological Society, 15(1), 94–102. https://doi.org/10.1017/S1355617708090139
Simms, L. J., Zelazny, K., Williams, T. F., & Bernstein, L. (2019). Does the number of response options matter? Psychometric perspectives using personality questionnaire data. Psychological Assessment, 31(4), 557–566. https://doi.org/10.1037/pas0000648
Skala, D. (2008). Overconfidence in psychology and finance–an interdisciplinary literature review. Bank i Kredyt, 4, 33–50.
Slack, M. K., & Draugalis, J. R., Jr. (2001). Establishing the internal and external validity of experimental studies. American Journal of Health-System Pharmacy, 58(22), 2173–2181. https://doi.org/10.1093/ajhp/58.22.2173
Smedley, A., & Smedley, B. D. (2005). Race as biology is fiction, racism as a social problem is real: Anthropological and historical perspectives on the social construction of race. American Psychologist, 60(1), 16–26. https://doi.org/10.1037/0003-066X.60.1.16
Smith, G. T., Atkinson, E. A., Davis, H. A., Riley, E. N., & Oltmanns, J. R. (2020). The general factor of psychopathology. Annual Review of Clinical Psychology, 16(1), 75–98. https://doi.org/10.1146/annurev-clinpsy-071119-115848
Smith, G. T., McCarthy, D. M., & Anderson, K. G. (2000). On the sins of short-form development. Psychological Assessment, 12(1), 102–111. https://doi.org/10.1037/1040-3590.12.1.102
Sobell, L. C., & Sobell, M. B. (2008). Timeline followback (TLFB). In A. J. Rush Jr., M. B. First, & D. Blacker (Eds.), Handbook of psychiatric measures (2nd ed., pp. 466–468). American Psychiatric Publishing.
Sommers-Flanagan, J., & Sommers-Flanagan, R. (2016). Clinical interviewing. Wiley.
Somoza, E., Soutullo-Esperon, L., & Mossman, D. (1989). Evaluation and optimization of diagnostic tests using receiver operating characteristic analysis and information theory. International Journal of Bio-Medical Computing, 24(3), 153–189. https://doi.org/10.1016/0020-7101(89)90029-9
Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, & Computers, 31(1), 137–149. https://doi.org/10.3758/bf03207704
Stanton, K., McDonnell, C. G., Hayden, E. P., & Watson, D. (2020). Transdiagnostic approaches to psychopathology measurement: Recommendations for measure selection, data analysis, and participant recruitment. Journal of Abnormal Psychology, 129(1), 21–28. https://doi.org/10.1037/abn0000464
Staples, A. D., Bates, J. E., Petersen, I. T., McQuillan, M. E., & Hoyniak, C. (2019). Measuring sleep in young children and their mothers: Identifying actigraphic sleep composites. International Journal of Behavioral Development, 43(3), 278–285. https://doi.org/10.1177/0165025419830236
Sternberg, R. J., Grigorenko, E. L., & Kidd, K. K. (2005). Intelligence, race, and genetics. American Psychologist, 60(1), 46–59. https://doi.org/10.1037/0003-066x.60.1.46
Stevens, R. J., & Poppe, K. K. (2020). Validation of clinical prediction models: What does the “calibration slope” really measure? Journal of Clinical Epidemiology, 118, 93–99. https://doi.org/10.1016/j.jclinepi.2019.09.016
Steyerberg, E. W., & Vergouwe, Y. (2014). Towards better clinical prediction models: Seven steps for development and an ABCD for validation. European Heart Journal, 35(29), 1925–1931. https://doi.org/10.1093/eurheartj/ehu207
Steyerberg, E. W., Vickers, A. J., Cook, N. R., Gerds, T., Gonen, M., Obuchowski, N., Pencina, M. J., & Kattan, M. W. (2010). Assessing the performance of prediction models: A framework for traditional and novel measures. Epidemiology, 21(1), 128–138. https://doi.org/10.1097/EDE.0b013e3181c30fb2
Stone, A. A., Schneider, S., & Smyth, J. M. (2023). Evaluation of pressing issues in ecological momentary assessment. Annual Review of Clinical Psychology, 19(1), 107–131. https://doi.org/10.1146/annurev-clinpsy-080921-083128
Strauss, M. E., & Smith, G. T. (2009). Construct validity: Advances in theory and methodology. Annual Review of Clinical Psychology, 5(1), 1–25. https://doi.org/10.1146/annurev.clinpsy.032408.153639
Sullivan, H. S. (1970). The psychiatric interview. Norton.
Summerfeldt, L. J., Kloosterman, P. H., & Antony, M. M. (2010). Structured and semistructured diagnostic interviews. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (2nd ed., pp. 95–137). Guilford Press.
Suzuki, L. A., Onoue, M. A., & Hill, J. S. (2013). Clinical assessment: A multicultural perspective. In K. F. Geisinger, J. F. Carlson, J.-I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 2: Testing and assessment in clinical and counseling psychology (pp. 193–212). American Psychological Association.
Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1, 1–26. https://doi.org/10.1111/1529-1006.001
Tackett, J. L., Brandes, C. M., King, K. M., & Markon, K. E. (2019). Psychology’s replication crisis and clinical psychological science. Annual Review of Clinical Psychology, 15(1), 579–604. https://doi.org/10.1146/annurev-clinpsy-050718-095710
Tackett, J. L., Brandes, C. M., & Reardon, K. W. (2019). Leveraging the open science framework in clinical psychological assessment research. Psychological Assessment, 31(12), 1386–1394. https://doi.org/10.1037/pas0000583
Tackett, J. L., Lang, J. W. B., Markon, K. E., & Herzhoff, K. (2019). A correlated traits, correlated methods model for thin-slice child personality assessment. Psychological Assessment, 31(4), 545–556. https://doi.org/10.1037/pas0000635
Tervalon, M., & Murray-Garcia, J. (1998). Cultural humility versus cultural competence: A critical distinction in defining physician training outcomes in multicultural education. Journal of Health Care for the Poor and Underserved, 9(2), 117–125.
Tetlock, P. E. (2017). Expert political judgment: How good is it? How can we know? - New edition. Princeton University Press.
Textor, J., van der Zander, B., & Ankan, A. (2021). dagitty: Graphical analysis of structural causal models. https://CRAN.R-project.org/package=dagitty
Textor, J., van der Zander, B., Gilthorpe, M. S., Liśkiewicz, M., & Ellison, G. T. (2017). Robust causal inference using directed acyclic graphs: The R package “dagitty”. International Journal of Epidemiology, 45(6), 1887–1894. https://doi.org/10.1093/ije/dyw341
Thomas, M. L. (2019). Advances in applications of item response theory to clinical assessment. Psychological Assessment, 31(12), 1442–1455. https://doi.org/10.1037/pas0000597
Thorndike, R. L. (1971). Concepts of culture-fairness. Journal of Educational Measurement, 8(2), 63–70. https://doi.org/10.1111/j.1745-3984.1971.tb00907.x
Tiego, J., Martin, E. A., DeYoung, C. G., Hagan, K., Cooper, S. E., Pasion, R., Satchell, L., Shackman, A. J., Bellgrove, M. A., Fornito, A., Abend, R., Goulter, N., Eaton, N. R., Kaczkurkin, A. N., & and, R. N. (2023). Precision behavioral phenotyping as a strategy for uncovering the biological correlates of psychopathology. Nature Mental Health, 1, 304–315. https://doi.org/10.1038/s44220-023-00057-5
Tofallis, C. (2015). A better measure of relative prediction accuracy for model selection and model estimation. Journal of the Operational Research Society, 66(8), 1352–1362. https://doi.org/10.1057/jors.2014.103
Tong, Y., & Kolen, M. J. (2007). Comparisons of methodologies and results in vertical scaling for educational achievement tests. Applied Measurement in Education, 20(2), 227–253. https://doi.org/10.1080/08957340701301207
Toomey, R. B., Syvertsen, A. K., & Shramko, M. (2018). Transgender adolescent suicide behavior. Pediatrics, 142(4). https://doi.org/10.1542/peds.2017-4218
Trafimow, D. (2015). A defense against the alleged unreliability of difference scores. Cogent Mathematics, 2(1), 1064626. https://doi.org/10.1080/23311835.2015.1064626
Treat, T. A., McFall, R. M., Viken, R. J., Kruschke, J. K., Nosofsky, R. M., & Wang, S. S. (2007). Clinical cognitive science: Applying quantitative models of cognitive processing to examine cognitive aspects of psychopathology. In R. W. J. Neufeld (Ed.), Advances in clinical cognitive science: Formal modeling of processes and symptoms (pp. 179–205). American Psychological Association.
Treat, T. A., & Viken, R. J. (2023). Measuring test performance with signal detection theory techniques. In H. Cooper, M. N. Coutanche, L. M. McMullen, A. T. Panter, D. Rindskopf, & K. J. Sher (Eds.), APA handbook of research methods in psychology: Foundations, planning, measures, and psychometrics (2nd ed., Vol. 1, pp. 837–858). American Psychological Association.
Treiblmaier, H., Bentler, P. M., & Mair, P. (2011). Formative constructs implemented via common factors. Structural Equation Modeling: A Multidisciplinary Journal, 18(1), 1–17. https://doi.org/10.1080/10705511.2011.532693
Trull, T. J., & Ebner-Priemer, U. (2013). Ambulatory assessment. Annual Review of Clinical Psychology, 9, 151–176. https://doi.org/10.1146/annurev-clinpsy-050212-185510
Trull, T. J., & Ebner-Priemer, U. W. (2020). Ambulatory assessment in psychopathology research: A review of recommended reporting guidelines and current practices. Journal of Abnormal Psychology, 129(1), 56–63. https://doi.org/10.1037/abn0000473
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124
Ursenbach, J., O’Connell, M. E., Neiser, J., Tierney, M. C., Morgan, D., Kosteniuk, J., & Spiteri, R. J. (2019). Scoring algorithms for a computer-based cognitive screening tool: An illustrative example of overfitting machine learning approaches and the impact on estimates of classification accuracy. Psychological Assessment, 31(11), 1377–1382. https://doi.org/10.1037/pas0000764
Van De Schoot, R., Kluytmans, A., Tummers, L., Lugtig, P., Hox, J., & Muthén, B. (2013). Facing off with Scylla and Charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance. Frontiers in Psychology, 4(770). https://doi.org/10.3389/fpsyg.2013.00770
Van De Schoot, R., Schmidt, P., De Beuckelaer, A., Lek, K., & Zondervan-Zwijnenburg, M. (2015). Editorial: Measurement invariance. Frontiers in Psychology, 6(1064). https://doi.org/10.3389/fpsyg.2015.01064
van der Nest, G., Lima Passos, V., Candel, M. J. J. M., & van Breukelen, G. J. P. (2020). An overview of mixture modelling for latent evolutions in longitudinal data: Modelling approaches, fit statistics and software. Advances in Life Course Research, 43, 100323. https://doi.org/10.1016/j.alcr.2019.100323
Vaz, S., Falkmer, T., Passmore, A. E., Parsons, R., & Andreou, P. (2013). The case for using the repeatability coefficient when calculating test–retest reliability. PLOS ONE, 8(9), e73990. https://doi.org/10.1371/journal.pone.0073990
Vispoel, W. P., Hong, H., & Lee, H. (2023). Benefits of doing generalizability theory analyses within structural equation modeling frameworks: Illustrations using the Rosenberg Self-Esteem Scale. Structural Equation Modeling: A Multidisciplinary Journal, 1–17. https://doi.org/10.1080/10705511.2023.2187734
Vispoel, W. P., Lee, H., Xu, G., & Hong, H. (2022). Integrating bifactor models into a generalizability theory based structural equation modeling framework. The Journal of Experimental Education, 1–21. https://doi.org/10.1080/00220973.2022.2092833
Vispoel, W. P., Morris, C. A., & Kilinc, M. (2018). Applications of generalizability theory and their relations to classical test theory and structural equation modeling. Psychological Methods, 23(1), 1–26. https://doi.org/10.1037/met0000107
Vispoel, W. P., Morris, C. A., & Kilinc, M. (2019). Using generalizability theory with continuous latent response variables. Psychological Methods, 24(2), 153–178. https://doi.org/10.1037/met0000177
Voorhees, C. M., Brady, M. K., Calantone, R., & Ramirez, E. (2016). Discriminant validity testing in marketing: An analysis, causes for concern, and proposed remedies. Journal of the Academy of Marketing Science, 44(1), 119–134. https://doi.org/10.1007/s11747-015-0455-4
Wakschlag, L. S., Tolan, P. H., & Leventhal, B. L. (2010). Research review: “Ain’t misbehavin’”: Towards a developmentally-specified nosology for preschool disruptive behavior. Journal of Child Psychology and Psychiatry, 51(1), 3–22. https://doi.org/10.1111/j.1469-7610.2009.02184.x
Wang, S., Jiao, H., & Zhang, L. (2013). Validation of longitudinal achievement constructs of vertically scaled computerised adaptive tests: A multiple-indicator, latent-growth modelling approach. International Journal of Quantitative Research in Education, 1(4), 383–407. https://doi.org/10.1504/IJQRE.2013.058307
Wang, T., Merkle, E. C., & Zeileis, A. (2014). Score-based tests of measurement invariance: Use in practice. Frontiers in Psychology, 5. https://doi.org/10.3389/fpsyg.2014.00438
Wang, W.-C., Shih, C.-L., & Yang, C.-C. (2009). The MIMIC method with scale purification for detecting differential item functioning. Educational and Psychological Measurement, 69(5), 713–731. https://doi.org/10.1177/0013164409332228
Wang, Y. A., & Rhemtulla, M. (2021). Power analysis for parameter estimation in structural equation modeling: A discussion and tutorial. Advances in Methods and Practices in Psychological Science, 4(1), 1–17.
Watkins, C. E., Campbell, V. L., Nieberding, R., & Hallmark, R. (1995). Contemporary practice of psychological assessment by clinical psychologists. Professional Psychology: Research and Practice, 26(1), 54–60. https://doi.org/10.1037/0735-7028.26.1.54
Webb, N. M., & Shavelson, R. J. (2005). Generalizability theory: Overview. In B. S. Everitt & D. C. Howell (Eds.), Encyclopedia of statistics in behavioral science (Vol. 2, pp. 717–719). John Wiley & Sons, Ltd.
Weems, C. F. (2008). Developmental trajectories of childhood anxiety: Identifying continuity and change in anxious emotion. Developmental Review, 28(4), 488–502. https://doi.org/10.1016/j.dr.2008.01.001
Wei, T., & Simko, V. (2021). R package “corrplot”: Visualization of a correlation matrix. https://github.com/taiyun/corrplot
Weintraub, S., Bauer, P. J., Zelazo, P. D., Wallner-Allen, K., Dikmen, S. S., Heaton, R. K., Tulsky, D. S., Slotkin, J., Blitz, D. L., Carlozzi, N. E., Havlik, R. J., Beaumont, J. L., Mungas, D., Manly, J. J., Borosh, B. G., Nowinski, C. J., & Gershon, R. C. (2013). I. NIH Toolbox Cognition Battery (CB): Introduction and pediatric data. Monographs of the Society for Research in Child Development, 78(4), 1–15. https://doi.org/10.1111/mono.12031
Weiss, B., & Garber, J. (2003). Developmental differences in the phenomenology of depression. Development and Psychopathology, 15(2), 403–430. https://doi.org/10.1017/S0954579403000221
Whitbourne, S. K. (2019). Longitudinal, cross-sectional, and sequential designs in lifespan developmental psychology. Oxford University Press.
Wicherts, J. M., & Dolan, C. V. (2010). Measurement invariance in confirmatory factor analysis: An illustration using IQ test performance of minorities. Educational Measurement: Issues and Practice, 29(3), 39–47. https://doi.org/10.1111/j.1745-3992.2010.00182.x
Wickham, H. (2021). tidyverse: Easily install and load the tidyverse. https://CRAN.R-project.org/package=tidyverse
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Widiger, T. A. (2002). Personality disorders. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (pp. 453–480). Guilford Publications.
Wiggins, J. S. (1973). Personality and prediction: Principles of personality assessment. Addison-Wesley.
Willett, W. (2012). Correction for the effects of measurement error. In W. Willett (Ed.), Nutritional epidemiology (3rd ed., pp. 287–304). Oxford University Press.
Williams, A. J., Botanov, Y., Kilshaw, R. E., Wong, R. E., & Sakaluk, J. K. (2021). Potentially harmful therapies: A meta-scientific review of evidential value. Clinical Psychology: Science and Practice, 28(1), 5–18. https://doi.org/10.1111/cpsp.12331
Wood, J. M., Garb, H. N., Lilienfeld, S. O., & Nezworski, M. T. (2002). Clinical assessment. Annual Review of Psychology, 53(1), 519–543. https://doi.org/10.1146/annurev.psych.53.100901.135136
Wood, J. M., Nezworski, M. T., Garb, H. N., & Lilienfeld, S. O. (2001). Problems with the norms of the Comprehensive System for the Rorschach: Methodological and conceptual considerations. Clinical Psychology: Science and Practice, 8(3), 397–402. https://doi.org/10.1093/clipsy.8.3.397
Wood, J. M., Nezworski, M. T., & Stejskal, W. J. (1996a). The Comprehensive System for the Rorschach: A critical examination. Psychological Science, 7(1), 3–10. https://doi.org/10.1111/j.1467-9280.1996.tb00658.x
Wood, J. M., Nezworski, M. T., & Stejskal, W. J. (1996b). Thinking critically about the Comprehensive System for the Rorschach: A reply to Exner. Psychological Science, 7(1), 14–17. https://doi.org/10.1111/j.1467-9280.1996.tb00660.x
Wood, J. M., Nezworski, M. T., Garb, H. N., & Lilienfeld, S. O. (2001). The misperception of psychopathology: Problems with the norms of the Comprehensive System for the Rorschach. Clinical Psychology: Science and Practice, 8(3), 350–373. https://doi.org/10.1093/clipsy.8.3.350
Woody, M. L., & Gibb, B. E. (2015). Integrating NIMH Research Domain Criteria (RDoC) into depression research. Current Opinion in Psychology, 4, 6–12. https://doi.org/10.1016/j.copsyc.2015.01.004
Wright, A. G. C., Gates, K. M., Arizmendi, C., Lane, S. T., Woods, W. C., & Edershile, E. A. (2019). Focusing personality assessment on the person: Modeling general, shared, and person specific processes in personality and psychopathology. Psychological Assessment, 31(4), 502–515. https://doi.org/10.1037/pas0000617
Wright, A. G. C., & Woods, W. C. (2020). Personalized models of psychopathology. Annual Review of Clinical Psychology, 16(1), 49–74. https://doi.org/10.1146/annurev-clinpsy-102419-125032
Wright, A. G. C., & Zimmermann, J. (2019). Applied ambulatory assessment: Integrating idiographic and nomothetic principles of measurement. Psychological Assessment, 31(12), 1467–1480. https://doi.org/10.1037/pas0000685
Xie, Y. (2015). Dynamic documents with R and knitr (2nd ed.). Chapman & Hall/CRC.
Xie, Y. (2022a). bookdown: Authoring books and technical documents with R Markdown. https://CRAN.R-project.org/package=bookdown
Xie, Y. (2022b). knitr: A general-purpose package for dynamic report generation in R. https://yihui.org/knitr/
Yang, Y., & Land, K. C. (2013). Age-period-cohort analysis: New models, methods, and empirical applications. Taylor & Francis.
Youngstrom, E. A., Halverson, T. F., Youngstrom, J. K., Lindhiem, O., & Findling, R. L. (2018). Evidence-based assessment from simple clinical judgments to statistical learning: Evaluating a range of options using pediatric bipolar disorder as a diagnostic challenge. Clinical Psychological Science, 6(2), 243–265. https://doi.org/10.1177/2167702617741845
Youngstrom, E. A., & Van Meter, A. (2016). Empirically supported assessment of children and adolescents. Clinical Psychology: Science and Practice, 23(4), 327–347. https://doi.org/10.1111/cpsp.12172
Youngstrom, E. A., Van Meter, A., Frazier, T. W., Hunsley, J., Prinstein, M. J., Ong, M.-L., & Youngstrom, J. K. (2017). Evidence-based assessment as an integrative model for applying psychological science to guide the voyage of treatment. Clinical Psychology: Science and Practice, 24(4), 331–363. https://doi.org/10.1111/cpsp.12207
Yu, X., Schuberth, F., & Henseler, J. (2023). Specifying composites in structural equation modeling: A refinement of the Henseler-Ogasawara specification. Statistical Analysis and Data Mining: The ASA Data Science Journal, 16(4), 348–357. https://doi.org/10.1002/sam.11608
Yudell, M., Roberts, D., DeSalle, R., & Tishkoff, S. (2016). Taking race out of human genetics. Science, 351(6273), 564–565. https://doi.org/10.1126/science.aac4951
Zhang, J., & Mueller, S. T. (2005). A note on ROC analysis and non-parametric estimate of sensitivity. Psychometrika, 70(1), 203–212. https://doi.org/10.1007/s11336-003-1119-8
Zhang, X., & Savalei, V. (2024). An overview of alternative formats to the Likert format: A comment on Wilson et al. (2022). Psychological Methods, 29(3), 606–612. https://doi.org/10.1037/met0000631
Zieky, M. J. (2006). Fairness review in assessment. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 359–376). Routledge. https://doi.org/10.4324/9780203874776.ch16
Zieky, M. J. (2013). Fairness review in assessment. In K. F. Geisinger, B. A. Bracken, J. F. Carlson, J.-I. C. Hansen, N. R. Kuncel, S. P. Reise, & M. C. Rodriguez (Eds.), APA handbook of testing and assessment in psychology, Vol. 1: Test theory and testing and assessment in industrial and organizational psychology (pp. 293–302). American Psychological Association. https://doi.org/10.1037/14047-017
Zuckerman, M. (1990). Some dubious premises in research and theory on racial differences: Scientific, social, and ethical issues. American Psychologist, 45(12), 1297–1303. https://doi.org/10.1037/0003-066X.45.12.1297