Identifying People Living With or Those at Risk for HIV in a Nationally Sampled Electronic Health Record Repository Called the National Clinical Cohort Collaborative: Computational Phenotyping Study

JMIR Med Inform. 2025 Jul 11;13:e68143. doi: 10.2196/68143.

ABSTRACT

BACKGROUND: Electronic health records (EHRs) are a valuable data source for clinical and epidemiological research on HIV, including research on the disproportionate impact of the COVID-19 pandemic on people living with HIV. To identify this population, most studies using EHR or claims databases start with diagnostic codes, which can result in misclassification unless refined with drug or laboratory data. Furthermore, given that antiretrovirals now have indications for both HIV and COVID-19 (ie, ritonavir in nirmatrelvir/ritonavir), new phenotyping methods are needed to better capture people living with HIV. Therefore, we created a generalizable and innovative method to robustly identify people living with HIV, preexposure prophylaxis (PrEP) users, postexposure prophylaxis (PEP) users, and people not living with HIV using granular clinical data after the emergence of COVID-19.

OBJECTIVE: The primary aim of this study was to use computational phenotyping in EHR data to identify people living with HIV (cohort 1), PrEP users (cohort 2), PEP users (cohort 3), or “none of the above” (people not living with HIV; cohort 4) and describe COVID-19-related characteristics among these cohorts.

METHODS: We used diagnosis, laboratory measurement, and drug concepts in the National Clinical Cohort Collaborative to create a computational phenotype for the 4 cohorts with confidence levels. For robustness, we conducted a blinded clinician annotation of randomly sampled records to assess precision. We calculated the distribution of demographics, comorbidities, and COVID-19 variables among the 4 cohorts.
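
As an illustration of how evidence from the three data domains might be combined and then checked against clinician annotation, here is a minimal Python sketch. It is not the study's algorithm: the `Evidence` flags, the rule that two or more agreeing domains give "high" confidence, and the `precision` helper are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical per-patient flags, one per EHR data domain."""
    has_condition: bool  # an HIV diagnosis concept is on record
    has_lab: bool        # an HIV-related laboratory measurement is on record
    has_drug: bool       # an antiretroviral drug exposure is on record


def evidence_domains(e: Evidence) -> tuple:
    """Return the names of the domains that support the phenotype."""
    names = []
    if e.has_condition:
        names.append("condition")
    if e.has_lab:
        names.append("lab")
    if e.has_drug:
        names.append("drug")
    return tuple(names)


def confidence_level(e: Evidence) -> str:
    """Assumed rule: 2+ agreeing domains = high, 1 domain = low, 0 = none.

    A single domain is treated as weak evidence because, for example,
    ritonavir alone may reflect COVID-19 treatment rather than HIV care.
    """
    n = len(evidence_domains(e))
    if n >= 2:
        return "high"
    if n == 1:
        return "low"
    return "none"


def precision(confirmed: list) -> float:
    """Share of algorithm-assigned cases that annotators confirmed.

    `confirmed` holds one bool per sampled patient: True if the blinded
    clinician agreed with the algorithm's cohort assignment.
    """
    if not confirmed:
        raise ValueError("need at least one annotated case")
    return sum(confirmed) / len(confirmed)
```

For instance, `confidence_level(Evidence(False, True, True))` returns `"high"` under the assumed rule, and `precision([True, True, False, True])` returns `0.75`.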

RESULTS: We identified 132,664 people living with HIV with a high level of confidence, 36,088 PrEP users, 4120 PEP users, and 20,639,675 people not living with HIV. Most people living with HIV were identified by a combination of medical conditions, laboratory measurements, and drug exposures (74,809/132,664, 56.4%), followed by laboratory measurements and drug exposures (15,241/132,664, 11.5%) and then by medical conditions and drug exposures (14,595/132,664, 11.0%). A higher proportion of people living with HIV experienced COVID-19-related hospitalization (4650/132,664, 3.5%) or mortality (828/132,664, 0.6%) and all-cause mortality (2083/132,664, 1.6%) compared to other cohorts.
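
The percentages reported for people living with HIV can be reproduced from the counts in the results. The short Python check below uses only figures from the abstract; the dictionary keys and the `pct` helper are names chosen for this example.

```python
# Denominator: people living with HIV identified with high confidence.
TOTAL_PLWH = 132_664

# Numerators as reported in the results.
counts = {
    "condition + lab + drug": 74_809,
    "lab + drug": 15_241,
    "condition + drug": 14_595,
    "COVID-19-related hospitalization": 4_650,
    "COVID-19-related mortality": 828,
    "all-cause mortality": 2_083,
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


percentages = {label: pct(n, TOTAL_PLWH) for label, n in counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
# condition + lab + drug: 56.4%
# lab + drug: 11.5%
# condition + drug: 11.0%
# COVID-19-related hospitalization: 3.5%
# COVID-19-related mortality: 0.6%
# all-cause mortality: 1.6%
```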

CONCLUSIONS: Using an extensive phenotyping algorithm leveraging granular data in an EHR repository, we have identified people living with HIV, people not living with HIV, PrEP users, and PEP users. Our findings offer transferable lessons to optimize future EHR phenotyping for these cohorts.

PMID:40644699 | DOI:10.2196/68143

By Nevin Manimala