Development of a large urban longitudinal HIV clinical cohort using a web-based platform to merge electronically and manually abstracted data from disparate medical record systems: Technical challenges and innovative solutions

Document Type

Journal Article

Publication Date

Journal of the American Medical Informatics Association

Cohort; DC Cohort; Electronic medical record; EMR; HIV


© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

Objective: Electronic medical records (EMRs) are being increasingly utilized to conduct clinical and epidemiologic research in numerous fields. To monitor and improve care of HIV-infected patients in Washington, DC, one of the most severely affected urban areas in the United States, we developed a city-wide database across 13 clinical sites using electronic data abstraction and manual data entry from EMRs.

Materials and Methods: To develop this unique longitudinal cohort, a web-based electronic data capture system (Discovere®) was used. An Agile software development methodology was implemented across multiple EMR platforms. Clinical informatics staff worked with information technology specialists from each site to abstract data electronically from each respective site's EMR through an extract, transform, and load process.

Results: Since enrollment began in 2011, more than 7000 patients have been enrolled, with longitudinal clinical data available on all patients. Data sets are produced for scientific analyses on a quarterly basis, and benchmarking reports are generated semi-annually, enabling each site to compare their participants' clinical status, treatments, and outcomes to the aggregated summaries from all other sites.

Discussion: Numerous technical challenges were identified and innovative solutions developed to ensure the successful implementation of the DC Cohort. Central to the success of this project was the broad collaboration established between government, academia, clinics, community, information technology staff, and the patients themselves.

Conclusions: Our experiences may have practical implications for researchers who seek to merge data from diverse clinical databases, and are applicable to the study of health related issues beyond HIV.