Milken Institute School of Public Health Poster Presentations (Marvin Center & Video)

Comparison of existing methods for algorithmic classification of dementia in the Health and Retirement Study

Poster Number

61

Document Type

Poster

Status

Graduate Student - Doctoral

Abstract Category

Epidemiology and Biostatistics

Keywords

dementia, cognitive assessments, HRS, methods

Publication Date

Spring 2018

Abstract

Background: Dementia ascertainment is difficult and costly, hindering the use of large, representative studies such as the Health and Retirement Study (HRS) to monitor trends or disparities in dementia. To address this issue, multiple groups of researchers have developed algorithms to classify dementia status in HRS participants using data from HRS and the Aging, Demographics, and Memory Study (ADAMS), an HRS sub-study that systematically ascertained dementia status. However, the relative performance of each algorithm has not been systematically evaluated.

Objective: To compare the performance of five existing algorithms, overall and by sociodemographic subgroups.

Methods: We created two standardized datasets: (a) training data (N=786; ADAMS Wave A and the corresponding HRS data, which were used previously to create the algorithms) and (b) validation data (N=530; ADAMS Waves B, C, and D and the corresponding HRS data, which were not used to create the algorithms). In both, we used each algorithm to classify HRS participants as demented or not demented and compared the algorithmic diagnoses to the ADAMS diagnoses.
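
To make the comparison concrete, below is a minimal sketch (in Python; not the authors' code, with toy data for illustration) of how accuracy, sensitivity, and specificity can be computed by cross-tabulating an algorithm's binary classifications against the ADAMS reference diagnoses:

```python
# Minimal sketch (not the authors' code): compute overall accuracy,
# sensitivity, and specificity by comparing an algorithm's binary
# dementia classifications against ADAMS reference diagnoses.
# 1 = demented, 0 = not demented; the data below are toy values.

def classification_metrics(reference, predicted):
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # share of true cases detected
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    return accuracy, sensitivity, specificity

adams_dx = [1, 1, 0, 0, 1, 0, 0, 1]  # reference diagnoses (toy)
algo_dx = [1, 0, 0, 0, 1, 0, 1, 1]   # one algorithm's output (toy)
print(classification_metrics(adams_dx, algo_dx))  # (0.75, 0.75, 0.75)
```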

Results: In the training data, overall classification accuracy ranged from 80% to 87%, sensitivity from 53% to 90%, and specificity from 79% to 96% across the five algorithms. Overall classification accuracy was similar in the validation data (range: 79% to 88%), but compared with the training data, sensitivity was much lower (range: 17% to 61%) while specificity was higher (range: 82% to 98%). Classification accuracy was generally worse in non-Hispanic blacks (range: 68% to 85%) and Hispanics (range: 65% to 88%) than in non-Hispanic whites (range: 79% to 88%). Across datasets, sensitivity was generally higher for proxy respondents, while specificity (and overall accuracy) was higher for self-respondents.
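
The subgroup results above come from stratifying the same comparison. A minimal sketch of that stratification follows (again illustrative, with assumed group labels rather than actual HRS/ADAMS variables):

```python
# Illustrative sketch of subgroup evaluation (group labels and records are
# assumptions, not actual HRS/ADAMS variables): compute the same metrics
# within strata such as race/ethnicity or respondent type.

from collections import defaultdict

# (reference dx, algorithm dx, subgroup label) -- toy records
records = [
    (1, 1, "self"), (0, 0, "self"), (1, 0, "self"), (0, 0, "self"),
    (1, 1, "proxy"), (1, 1, "proxy"), (0, 1, "proxy"), (0, 0, "proxy"),
]

def metrics_by_group(records):
    groups = defaultdict(list)
    for ref, pred, group in records:
        groups[group].append((ref, pred))
    results = {}
    for group, pairs in groups.items():
        tp = sum(1 for r, p in pairs if r == 1 and p == 1)
        tn = sum(1 for r, p in pairs if r == 0 and p == 0)
        fp = sum(1 for r, p in pairs if r == 0 and p == 1)
        fn = sum(1 for r, p in pairs if r == 1 and p == 0)
        results[group] = {
            "accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn) if (tp + fn) else None,
            "specificity": tn / (tn + fp) if (tn + fp) else None,
        }
    return results

for group, m in metrics_by_group(records).items():
    print(group, m)
```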

Conclusions: The worse sensitivity in the validation dataset may suggest either that the algorithms were overfit to the training data or that they are better at identifying prevalent than incident dementia. Differences in performance across algorithms suggest that the usefulness of each will vary depending on the user’s purpose. Further planned work will evaluate algorithm performance in external validation datasets.

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Comments

Presented at GW Annual Research Days 2018.
