Nonrandom Missingness in Child Race and Ethnicity Records and the US Federal Data Standards: Pooled Analysis of Community-Based Child Health Studies

JMIR Public Health Surveill. 2025 Nov 13;11:e65660. doi: 10.2196/65660.

ABSTRACT

BACKGROUND: Racism perpetuates the unequal distribution of power, resources, and privilege within and between societies to the detriment of marginalized groups. Racialization involves categorizing people based on traits to which socially constructed meaning and value have been ascribed. In public health, this process can manifest when tracking racial health disparities in children, which requires aggregating parent-reported race and ethnicity data into federally recognized categories. The demographic surveys used to characterize children’s identity in the United States mirror those administered to adults and typically follow federal race and ethnicity data standards, which include ambiguous response options (eg, other race), “select all that apply” directives, and open-ended fields followed by a request for specification, with limited guidance for coding and interpretation. These methodological challenges could contribute to nonrandom data missingness and misclassification bias and must be resolved to better harmonize historical data, especially given recent revisions to the country’s federal race and ethnicity data standards.

OBJECTIVE: We aimed to explore the prevalence of systematic bias within past, current, and recently revised federal race and ethnicity data standards in the United States and develop a standardized method for improving the reporting of child race and ethnicity in public health research, policy, and practice.

METHODS: We developed a replicable decision-making process to uncover racial heterogeneity obscured by key components of US federal race and ethnicity data standards (open-ended and ambiguous response fields). We applied it to a pooled sample of 8 community-based child health studies with 8087 participants and examined changes in the dataset’s racial and ethnic diversity.
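The abstract does not spell out the decision rules, so the following is only a minimal sketch of what reallocating open-ended responses could look like, not the authors' procedure. The function names, the record layout, and the lookup table are hypothetical; in particular, mapping the written response "Haitian" to the Black category is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical lookup from normalized written responses to categories.
# The single entry is an illustrative assumption, not the study's coding scheme.
WRITE_IN_LOOKUP = {
    "haitian": "Black",
}


def reallocate(category, write_in, lookup=WRITE_IN_LOOKUP):
    """Return (final_category, moved) for one parent-reported record.

    A record is moved only when it was reported as "Other race" and its
    written response matches an entry in the lookup table.
    """
    key = (write_in or "").strip().lower()
    if category == "Other race" and key in lookup:
        return lookup[key], True
    return category, False


def summarize(records, lookup=WRITE_IN_LOOKUP):
    """Count categories before and after reallocation, plus records moved."""
    before = Counter(category for category, _ in records)
    after = Counter()
    moved = 0
    for category, write_in in records:
        final_category, was_moved = reallocate(category, write_in, lookup)
        after[final_category] += 1
        moved += int(was_moved)
    return before, after, moved


if __name__ == "__main__":
    # Toy records: (parent-reported category, written response or None).
    sample = [
        ("Other race", "Haitian"),
        ("Other race", " haitian "),
        ("White", None),
        ("Other race", "unclear"),
    ]
    before, after, moved = summarize(sample)
    print(dict(before), dict(after), moved)
```

Comparing the before and after counts is one simple way to examine how such a process changes a dataset's recorded racial and ethnic composition.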

RESULTS: Overall, 93.11% (7530/8087) of parents provided child race and ethnicity data, with 3.73% (281/7530) identified as other race and 9.72% (732/7530) identified as multiracial. In total, 101 distinct open-ended written responses (eg, “Haitian”) were provided. The replicable decision-making process resulted in 4.02% (303/7530) of children being reallocated from their parent-reported race or ethnicity category, of whom 38.6% (117/303) were moved into the Black category based on written responses. Within the multiracial group, we identified 22 unique combinations, including White-Hispanic (36.7%, 269/732) and White-Black (23.08%, 169/732).
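As a quick arithmetic check, the headline proportions can be recomputed from the counts given in the results. This is a small sketch; the helper name `pct` and the dictionary labels are illustrative, while the counts are taken directly from the abstract.

```python
def pct(numerator, denominator, digits=2):
    """Percentage of numerator over denominator, rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


# (numerator, denominator, decimal places used in the abstract)
counts = {
    "provided race and ethnicity data": (7530, 8087, 2),
    "identified as other race": (281, 7530, 2),
    "identified as multiracial": (732, 7530, 2),
    "reallocated": (303, 7530, 2),
    "reallocated into Black category": (117, 303, 1),
}

for label, (num, den, digits) in counts.items():
    print(f"{label}: {pct(num, den, digits)}% ({num}/{den})")
```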

CONCLUSIONS: These findings demonstrate how the current paradigm of assessing race and ethnicity in the United States may contribute to the erasure and further marginalization of individuals disproportionately enduring the effects of racism. While updated federal race and ethnicity data standards may soon take effect, persistent gaps in demographic and health surveillance will remain. Our data reallocation decision-making process offers a novel and practical framework for harmonizing race and ethnicity data across time, populations, and datasets, emphasizing the relevance and longevity of preexisting datasets and tools. Efforts to build equitable public health surveillance and data systems should expand the survey response options, avoid aggregating diverse populations, and develop new statistical techniques for data analysis.

PMID:41232027 | DOI:10.2196/65660

By Nevin Manimala
