
Short-range forecasting of coronavirus disease 2019 (COVID-19) during early onset at county, health district, and state geographic levels: Comparative forecasting approach using seven forecasting methods

J Med Internet Res. 2021 Feb 22. doi: 10.2196/24925. Online ahead of print.

ABSTRACT

BACKGROUND: Forecasting methods rely on trends and averages of prior observations to forecast coronavirus disease 2019 (COVID-19) case counts. COVID-19 forecasts have received much media attention, and numerous platforms have been created to inform the public. However, forecasting effectiveness varies by geographic scope and is affected by changing assumptions in behaviors and preventative measures in response to the pandemic. Due to time requirements for developing a COVID-19 vaccine, evidence is needed to inform short-term forecasting method selection at county, health district, and state levels.

OBJECTIVE: COVID-19 forecasts keep the public informed and contribute to public policy. As such, proper understanding of forecasting purposes and outcomes is needed to advance knowledge of health statistics for policy makers and the public. Using publicly available real-time data provided online, we evaluate the performance of seven forecasting methods utilized to forecast cumulative COVID-19 case counts. Forecasts are evaluated based on how well they forecast one, three, and seven days forward when utilizing one, three, seven, or all prior days’ cumulative case counts during early onset. This study provides an objective evaluation of the forecasting methods to identify forecasting model assumptions that contribute to lower error in forecasting COVID-19 cumulative case growth. This information benefits professionals, decision makers, and the public relying on the data provided by short-term case count estimates at varied geographic levels.

METHODS: One-, three-, and seven-day forecasts are created at the county, health district, and state levels using: (1) a naïve approach; (2) Holt-Winters exponential smoothing (HW); (3) growth rate (Growth); (4) moving average (MA); (5) autoregressive (AR); (6) autoregressive moving average (ARMA); and (7) autoregressive integrated moving average (ARIMA). Forecasts rely on Virginia’s 3,463 historical county-level cumulative case counts from March 7 – April 22, 2020, as reported by The New York Times. Statistically significant results are identified using 95% confidence intervals of the Median Absolute Error (MdAE) and Median Absolute Percentage Error (MdAPE) metrics of the resulting 216,698 forecasts.
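
To make the lookback and horizon ideas concrete, here is a minimal Python sketch of three of the simpler methods (naïve, MA, and Growth). The abstract does not give the study's exact formulas, so the definitions below are textbook versions chosen for illustration, and the function names and the sample series are hypothetical.

```python
from statistics import mean

def naive_forecast(series, horizon=1):
    """Assumed definition: carry the last observed value forward."""
    return series[-1]

def moving_average_forecast(series, lookback, horizon=1):
    """Assumed definition: mean of the last `lookback` observations,
    used as a flat forecast for every horizon."""
    window = series[-lookback:]
    return mean(window)

def growth_forecast(series, lookback, horizon=1):
    """Assumed definition: compound the mean day-over-day ratio seen in
    the last `lookback` daily steps for `horizon` days."""
    window = series[-(lookback + 1):]
    ratios = [b / a for a, b in zip(window, window[1:]) if a > 0]
    if not ratios:
        # No usable growth information; fall back to the naive forecast.
        return series[-1]
    return series[-1] * mean(ratios) ** horizon

# Hypothetical cumulative case counts for one locality (not study data).
cases = [10, 20, 40, 80]
next_day_naive = naive_forecast(cases)
next_day_ma = moving_average_forecast(cases, lookback=3)
three_day_growth = growth_forecast(cases, lookback=2, horizon=3)
```

Note that under this flat-mean definition the MA forecast of a rising cumulative series sits below the latest count; the study may apply the average differently (for example, to daily changes), which the abstract does not specify.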

RESULTS: The next-day MA forecast with a three-day lookback obtained the lowest MdAE (0.67, 0.49-0.84, P < .001) and statistically significantly differs from 39 (66.1%) to 53 (89.8%) of alternatives at each geographic level at a significance level of 0.01. For short-range forecasting, methods assuming stationary means of prior days’ counts outperform methods assuming weakly stationary or non-stationary means. MdAPE results reveal statistically significant differences across geographic levels.
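
For readers unfamiliar with the two error metrics reported here, the following Python sketch shows one conventional way to compute them. It is an illustration under stated assumptions, not the authors' code: the function names and sample values are hypothetical, and skipping zero actual values in MdAPE is a choice made here to avoid division by zero.

```python
from statistics import median

def mdae(actual, forecast):
    """Median Absolute Error: median of |actual - forecast|."""
    return median(abs(a - f) for a, f in zip(actual, forecast))

def mdape(actual, forecast):
    """Median Absolute Percentage Error, in percent.
    Pairs with an actual value of zero are skipped (an assumption)."""
    return median(
        100 * abs(a - f) / abs(a)
        for a, f in zip(actual, forecast)
        if a != 0
    )

# Hypothetical observed and forecast cumulative counts (not study data).
observed = [100, 200, 400]
predicted = [90, 220, 400]
error_abs = mdae(observed, predicted)   # absolute errors: 10, 20, 0
error_pct = mdape(observed, predicted)  # percentage errors: 10, 10, 0
```

Because both metrics use the median rather than the mean, a few localities with very large misses do not dominate the summary, and MdAPE additionally scales each error by the size of the actual count, which matters when comparing counties with states.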

CONCLUSIONS: For short-range COVID-19 cumulative case count forecasting at the county, health district, and state levels during early onset: (1) MA is effective for forecasting cumulative case counts one, three, and seven days ahead; (2) exponential growth is not the best representation of case growth during early onset when the public is aware of the virus; and (3) geographic resolution is a factor in forecasting method selection. (This work received no external funding.)

PMID:33621186 | DOI:10.2196/24925

By Nevin Manimala
