TY - JOUR
T1 - External validation of a shortened screening tool using individual participant data meta-analysis
T2 - A case study of the Patient Health Questionnaire-Dep-4
AU - Harel, Daphna
AU - Levis, Brooke
AU - Sun, Ying
AU - Fischer, Felix
AU - Ioannidis, John P.A.
AU - Cuijpers, Pim
AU - Patten, Scott B.
AU - Ziegelstein, Roy C.
AU - Markham, Sarah
AU - Benedetti, Andrea
AU - Thombs, Brett D.
AU - He, Chen
AU - Wu, Yin
AU - Krishnan, Ankur
AU - Bhandari, Parash Mani
AU - Neupane, Dipika
AU - Negeri, Zelalem
AU - Imran, Mahrukh
AU - Rice, Danielle B.
AU - Riehm, Kira E.
AU - Azar, Marleine
AU - Levis, Alexander W.
AU - Boruff, Jill
AU - Gilbody, Simon
AU - Kloda, Lorie A.
AU - Amtmann, Dagmar
AU - Ayalon, Liat
AU - Baradaran, Hamid R.
AU - Beraldi, Anna
AU - Bernstein, Charles N.
AU - Bhana, Arvin
AU - Buji, Ryna Imma
AU - Chagas, Marcos H.
AU - Chan, Juliana C. N.
AU - Chan, Lai Fong
AU - Chibanda, Dixon
AU - Conway, Aaron
AU - Daray, Federico M.
AU - de Man-van Ginkel, Janneke M.
AU - Diez-Quevedo, Crisanto
AU - Field, Sally
AU - Fisher, Jane R. W.
AU - Fung, Daniel
AU - Garman, Emily C.
AU - Flisher, Alan J.
AU - Gelaye, Bizu
AU - Gholizadeh, Leila
AU - Gibson, Lorna J.
AU - Green, Eric P.
AU - Hall, Brian J.
N1 - Publisher Copyright: © 2021 Elsevier Inc.
PY - 2022/8
Y1 - 2022/8
N2 - Shortened versions of self-reported questionnaires may be used to reduce respondent burden. When shortened screening tools are used, it is desirable to maintain equivalent diagnostic accuracy to full-length forms. This manuscript presents a case study that illustrates how external data and individual participant data meta-analysis can be used to assess the equivalence in diagnostic accuracy between a shortened and full-length form. This case study compares the Patient Health Questionnaire-9 (PHQ-9) and a 4-item shortened version (PHQ-Dep-4) that was previously developed using optimal test assembly methods. Using a large database of 75 primary studies (34,698 participants, 3,392 major depression cases), we evaluated whether the PHQ-Dep-4 cutoff of ≥ 4 maintained equivalent diagnostic accuracy to a PHQ-9 cutoff of ≥ 10. Using this external validation dataset, a PHQ-Dep-4 cutoff of ≥ 4 maximized the sum of sensitivity and specificity, with a sensitivity of 0.88 (95% CI 0.81, 0.93), 0.68 (95% CI 0.56, 0.78), and 0.80 (95% CI 0.73, 0.85) for the semi-structured, fully structured, and MINI reference standard categories, respectively, and a specificity of 0.79 (95% CI 0.74, 0.83), 0.85 (95% CI 0.78, 0.90), and 0.83 (95% CI 0.80, 0.86) for the semi-structured, fully structured, and MINI reference standard categories, respectively. While equivalence with a PHQ-9 cutoff of ≥ 10 was not established, we found the sensitivity of the PHQ-Dep-4 to be non-inferior to that of the PHQ-9, and the specificity of the PHQ-Dep-4 to be marginally smaller than the PHQ-9.
KW - Equivalence testing
KW - Optimal test assembly
KW - Self-report questionnaire
KW - Sensitivity
KW - Specificity
UR - http://www.scopus.com/inward/record.url?scp=85120806887&partnerID=8YFLogxK
U2 - https://doi.org/10.1016/j.ymeth.2021.11.005
DO - https://doi.org/10.1016/j.ymeth.2021.11.005
M3 - Article
C2 - 34780986
SN - 1046-2023
VL - 204
SP - 300
EP - 311
JO - Methods
JF - Methods
ER -