Improving classification of myocardial infarction with machine learning in a diverse population

Am J Epidemiol. 2025 Oct 7:kwaf223. doi: 10.1093/aje/kwaf223. Online ahead of print.

ABSTRACT

Phenotype classification with electronic health record (EHR) data is increasingly performed with machine learning (ML); however, the performance of ML approaches in diverse populations remains understudied. We compared an ICD-based algorithm with an ML phenotyping pipeline to classify myocardial infarction (MI) in a general population and in a self-reported Black population. We determined the impact of differential performance by replicating a published MI risk factor study with MI defined by either the ICD or the ML algorithm. Individuals followed in the Veterans Health Administration (VHA) EHR with data from 2002 to 2019 were examined: 11,523,175 Veterans, mean age 67.5 years, 93.8% male, 14.3% Black, 79.1% White. MI was classified using a published rule-based ICD algorithm and an ML pipeline, PheCAP, which incorporates natural language processing. Algorithms were trained and validated against n=403 randomly selected Veterans who were chart-reviewed for MI (gold standard), with oversampling of self-reported Black Veterans. Among chart-reviewed Veterans, the ICD algorithm had high positive predictive value (PPV) and low sensitivity (all races, PPV: 0.97, sensitivity: 0.17; Black Veterans, PPV: 0.94, sensitivity: 0.24). PheCAP MI had good PPV and higher sensitivity (all races, PPV: 0.90, sensitivity: 0.66; Black Veterans, PPV: 0.81, sensitivity: 0.79). Applying PheCAP MI to the entire VHA population to classify MI provided increased power to replicate findings from the published MI risk factor study compared with the ICD algorithm.
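
The two metrics reported above can be illustrated with a minimal sketch. PPV is the share of algorithm-positive cases confirmed by chart review, and sensitivity is the share of chart-confirmed cases that the algorithm detects. The function names and the counts below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no algorithm-positive cases; PPV is undefined")
    return true_pos / flagged


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity (recall): TP / (TP + FN)."""
    confirmed = true_pos + false_neg
    if confirmed == 0:
        raise ValueError("no chart-confirmed cases; sensitivity is undefined")
    return true_pos / confirmed


# Hypothetical validation counts (NOT from the study): an algorithm flags
# 33 patients, of whom 30 are chart-confirmed MI, and misses 70 confirmed cases.
tp, fp, fn = 30, 3, 70

print(f"PPV: {ppv(tp, fp):.2f}")
print(f"Sensitivity: {sensitivity(tp, fn):.2f}")
```

With these invented counts, the algorithm is rarely wrong when it flags a case but misses most true cases, which is the same qualitative pattern the abstract describes for the ICD algorithm.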

PMID:41054913 | DOI:10.1093/aje/kwaf223

By Nevin Manimala