Paediatr Perinat Epidemiol. 2022 Jan 25. doi: 10.1111/ppe.12863. Online ahead of print.
BACKGROUND: Large-scale evaluation of COVID-19 is likely to rely on the quality of ICD coding. However, little is known about the validity of ICD-coded COVID-19 diagnoses.
OBJECTIVES: To evaluate the performance of diagnostic codes in detecting COVID-19 during pregnancy.
METHODS: We used data from a national cohort of 78,283 individuals with a pregnancy ending between 11 March 2020 and 31 January 2021 in the OptumLabs® Data Warehouse (OLDW). OLDW is a longitudinal, real-world data asset with de-identified administrative claims and electronic health record data. We identified all services with an ICD-10-CM diagnostic code of U07.1 and all laboratory claims records for COVID-19 diagnostic testing. We compared ICD-coded diagnoses to testing results to estimate positive and negative predictive values (PPV and NPV). To evaluate impact on risk estimation, we estimated risk of adverse pregnancy outcomes by source of exposure information.
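The comparison described above reduces to a 2x2 cross-tabulation of ICD-coded diagnosis against laboratory test result. The following is a minimal sketch of how PPV, NPV and overall agreement could be computed from such a table, treating the laboratory result as the reference. The function name, argument names and the counts are hypothetical illustrations; they are not the study's data or the authors' code.

```python
import math


def predictive_values(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute PPV, NPV and overall agreement from a 2x2 table.

    tp: ICD code present, laboratory test positive
    fp: ICD code present, laboratory test negative
    fn: ICD code absent,  laboratory test positive
    tn: ICD code absent,  laboratory test negative
    """
    def ratio(num: int, den: int) -> float:
        # Return NaN rather than raising when a margin of the table is empty.
        return num / den if den else math.nan

    total = tp + fp + fn + tn
    return {
        "ppv": ratio(tp, tp + fp),             # coded cases confirmed by testing
        "npv": ratio(tn, tn + fn),             # uncoded individuals who tested negative
        "agreement": ratio(tp + tn, total),    # both sources give the same answer
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
result = predictive_values(tp=30, fp=20, fn=10, tn=140)
print(result)
```

With these invented counts, agreement (170/200) is well above PPV (30/50), because the large number of concordant negatives dominates the agreement figure. Overall agreement and PPV can therefore diverge in the same table.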
RESULTS: Of 78,283 pregnancies, 5644 had a laboratory test result for COVID-19. Testing was most common among older individuals, Hispanic individuals, those with higher socioeconomic status and those with a diagnosed medical condition or pregnancy complication. Of COVID-19 cases, 52% were identified through ICD-coded diagnosis alone, 19% from laboratory test results alone and 29% from both sources. Agreement between ICD-coded diagnosis and laboratory testing records was high (91%; 95% confidence interval [CI] 90, 92). However, the PPV of ICD-coded diagnosis was low (36%; 95% CI 33, 39). We observed up to a 50% difference in risk estimates of adverse pregnancy outcomes when exposure was based on laboratory testing results or diagnostic coding alone.
CONCLUSIONS: Nearly one in five COVID-19 cases would be missed by using ICD-coded diagnoses alone to identify COVID-19 during pregnancy. Epidemiological studies exclusively relying on diagnostic coding or laboratory testing results are likely to be affected by exposure misclassification. Research and surveillance should draw upon multiple sources of COVID-19 diagnostic information.