Econ Hum Biol. 2021 Feb 12;41:100988. doi: 10.1016/j.ehb.2021.100988. Online ahead of print.
ABSTRACT
In the U.S. in early 2020, heterogeneous and incomplete county-scale data on COVID-19 hindered effective interventions in the pandemic. While the number of deaths can be used to estimate the actual number of infections after a time lag, counties with low death counts early on carry considerable uncertainty about their true future case numbers. Here we show that supplementing county-scale mortality statistics with socioeconomic data helps estimate the true number of COVID-19 infections in low-data counties, and hence provides an early warning of future concern. We fit a LASSO negative binomial regression to select a parsimonious set of five predictive variables from thirty-one county-level covariates. Of these, population density, public transportation use, voting patterns and percent African-American population are most predictive of higher COVID-19 death rates. To test the model, we show that counties identified as underestimating COVID-19 on an early date (April 17) have relatively higher deaths later (July 1) in the pandemic.
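As an illustration of how a LASSO negative binomial regression can zero out weak covariates, the following is a minimal sketch, not the authors' implementation. It assumes a log link, a fixed dispersion parameter, no population offset, and a simple proximal-gradient solver. The function names, the two stand-in covariates, and the eight toy counts are hypothetical and are not data from the study.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_negbin(X, y, penalty, dispersion=0.5, step=0.02, iterations=3000):
    """L1-penalized negative binomial (NB2) regression with a log link.

    Assumptions of this sketch: the dispersion is fixed rather than estimated,
    the intercept is not penalized, and covariates are already standardized.
    Minimizes mean negative log-likelihood + penalty * sum(|coefficients|).
    """
    n, p = len(X), len(X[0])
    intercept = math.log(sum(y) / n)  # start at the log of the mean count
    coefs = [0.0] * p
    for _ in range(iterations):
        grad_intercept = 0.0
        grad = [0.0] * p
        for row, count in zip(X, y):
            mu = math.exp(intercept + sum(c * x for c, x in zip(coefs, row)))
            # Derivative of the NB2 negative log-likelihood w.r.t. the linear predictor
            resid = (mu - count) / (1.0 + dispersion * mu)
            grad_intercept += resid / n
            for j in range(p):
                grad[j] += resid * row[j] / n
        # Plain gradient step for the intercept, proximal step for the rest
        intercept -= step * grad_intercept
        coefs = [soft_threshold(c - step * g, step * penalty)
                 for c, g in zip(coefs, grad)]
    return intercept, coefs

# Hypothetical toy data: column 0 is a strong predictor (think standardized
# population density), column 1 is a weak one. Counts are invented.
X = [[-1, -1], [-1, 1], [-1, -1], [-1, 1],
     [1, -1], [1, 1], [1, -1], [1, 1]]
y = [1, 2, 1, 2, 8, 9, 9, 10]

dense_intercept, dense_coefs = fit_lasso_negbin(X, y, penalty=0.0)
sparse_intercept, sparse_coefs = fit_lasso_negbin(X, y, penalty=0.6)

print("unpenalized coefficients:", dense_coefs)
print("LASSO coefficients:      ", sparse_coefs)
```

In this toy setting the unpenalized fit keeps both covariates, while the penalized fit retains only the strong one. That is the selection behavior that lets a LASSO fit reduce a long covariate list to a short one. In practice the penalty strength would typically be chosen by cross-validation, which this sketch omits.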
PMID:33636583 | DOI:10.1016/j.ehb.2021.100988