Rule-Based Algorithm to Identify Recurrent Non-Hodgkin Lymphoma in Electronic Health Data

JCO Clin Cancer Inform. 2026 Jul;10(3):e2500208. doi: 10.1200/CCI-25-00208. Epub 2026 Jul 1.

ABSTRACT

PURPOSE: Recurrent cancers are not captured in a standardized way by US tumor registries, making it difficult to conduct research on risk factors for cancer recurrence. We developed rule-based algorithms to be used with electronic health data to identify recurrent cases of diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL).

METHODS: Incident DLBCL and FL cases (2000-2018) were identified in tumor registry data at two health plan study sites. We captured pharmacy and procedure codes to indicate first-line treatment initiation. Recurrent cases were defined as patients who completed first-line treatment, then had ≥6 months with no treatment-related codes, and later restarted treatment. The baseline algorithm was built using a claims-based database from Fallon Health (FH; Massachusetts) and tested using electronic health records and claims data at Henry Ford Health (Michigan). Results were validated by chart review at Henry Ford Health, and measures of validity were calculated overall and by subtype. The algorithm was subsequently revised to reduce the false-positive rate.
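
The recurrence rule described in the methods (a treatment-free gap of at least 6 months followed by a restart of treatment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `find_recurrence_date`, the 183-day approximation of "6 months", and the simplification that the last treatment code before the gap marks the end of first-line treatment are all assumptions made for illustration. The abstract does not specify which codes count as treatment-related or how the revised algorithm differs.

```python
from datetime import date

# Assumption: ">= 6 months" is approximated as 183 days.
GAP_DAYS = 183


def find_recurrence_date(treatment_dates, gap_days=GAP_DAYS):
    """Return the date treatment restarts after a treatment-free gap
    of at least gap_days, or None if no such gap is followed by a restart.

    treatment_dates: iterable of datetime.date, one per treatment-related
    code (pharmacy or procedure) for a single patient.
    """
    dates = sorted(set(treatment_dates))
    # Compare each treatment date with the next one; the first pair
    # separated by a long enough gap marks the restart of treatment.
    for previous, current in zip(dates, dates[1:]):
        if (current - previous).days >= gap_days:
            return current
    return None


# Hypothetical patient: three first-line treatment codes, then a restart
# roughly ten months after the last one.
patient_a = [date(2010, 1, 5), date(2010, 2, 2), date(2010, 3, 2), date(2011, 1, 10)]
# Hypothetical patient: continuous treatment with no long gap.
patient_b = [date(2012, 4, 1), date(2012, 5, 1), date(2012, 6, 1)]

recurrence_a = find_recurrence_date(patient_a)
recurrence_b = find_recurrence_date(patient_b)
```

In this sketch, a patient with no restart after the gap is never flagged, which matches the requirement that treatment be restarted "later". A real implementation would also need to restrict the input to codes relevant to lymphoma treatment.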

RESULTS: FH identified 137 DLBCL and 88 FL eligible cases; 42 patients met the baseline algorithm-defined criteria for recurrent disease. Henry Ford identified 246 DLBCL and 146 FL cases. The baseline algorithm identified 115 recurrent cases with a 54% false-positive rate; the revised algorithm (R2D-non-Hodgkin lymphoma [NHL]) identified 60 recurrent cases, with a 10% false-positive rate. Following chart review, the R2D-NHL algorithm had a sensitivity of 74%, specificity of 90%, negative predictive value of 83%, and positive predictive value of 83%. Measures varied slightly between subtypes.
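
For readers less familiar with the validity measures reported above, the sketch below shows how they are conventionally computed from a chart-review confusion matrix. The counts used here are hypothetical and chosen only to make the arithmetic easy to follow; the abstract does not report the underlying cell counts, and the function name `validity_measures` is an assumption. Under these standard definitions the false-positive rate equals 1 minus specificity, which is consistent with the 10% and 90% figures reported for the revised algorithm.

```python
def validity_measures(tp, fp, tn, fn):
    """Compute standard validity measures from confusion-matrix counts.

    tp: algorithm-positive, chart-review-positive
    fp: algorithm-positive, chart-review-negative
    tn: algorithm-negative, chart-review-negative
    fn: algorithm-negative, chart-review-positive
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts for illustration only; NOT the study's data.
measures = validity_measures(tp=40, fp=10, tn=90, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```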

CONCLUSION: We developed a rule-based algorithm that can be applied to electronic health data for population-based research requiring the identification of recurrence for two common but dissimilar NHL subtypes.

PMID:42385099 | DOI:10.1200/CCI-25-00208

By Nevin Manimala
