Int J Biostat. 2026 Jun 1. doi: 10.1515/ijb-2025-0066. Online ahead of print.
ABSTRACT
In right-skewed count data, the mean is disproportionately affected by a long upper tail, whereas the median remains a more representative measure of central tendency. Discrete Weibull (DW) regression links covariates to a shifted median, which in turn determines the exact integer median; however, a single DW component can fit poorly when the observed count distribution has a markedly heavier upper tail than it can accommodate. We propose a contaminated DW (cDW) regression that augments the baseline DW distribution with a more dispersed secondary component within a finite mixture while retaining a single shifted-median link. This mixture accommodates extreme counts more effectively, thereby stabilizing the median-based regression coefficients. The model accommodates general lower truncation at an arbitrary threshold c, including c = 1 for strictly positive outcomes and c = 0 for nonnegative counts, and is estimated using a straightforward Bayesian Markov chain Monte Carlo algorithm implemented in JAGS; R code accompanies the paper. Applied to hospital length-of-stay data, the cDW regression reduces the influence of outliers and achieves better predictive performance than a single-component DW model, as demonstrated by leave-one-out cross-validation and a Kullback-Leibler influence diagnostic. Simulation experiments show that, under strongly heavy-tailed mixture settings, the truncated cDW (cTDW) model accurately recovers the regression coefficients and improves on the single-component truncated DW (TDW) model. Because the added tail component can increase probability mass at both extremes, we further recommend embedding the cDW in a hurdle framework when structural zeros are present: the zero probability is modeled separately, and the heavy-tail mixture is applied only to positive counts. The cDW regression model provides a robust, median-centered alternative for analyzing skewed, possibly truncated count outcomes.
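As an illustration of the shifted-median idea, the following minimal Python sketch uses the standard type I discrete Weibull distribution, with survival function P(Y >= y) = q^(y^beta), and builds a two-component mixture whose components share one shifted median. The parameterization is an assumption made here for illustration and is not taken from the paper: the function names, the mixing weight `delta`, and the inflation factor `eta` (which divides the shape parameter to give a heavier-tailed second component) are all hypothetical. Truncation, covariates, and the Bayesian estimation step are not covered.

```python
import math


def dw_q_from_median(mu, beta):
    """Return q such that the shifted median (log 2 / -log q)**(1/beta) equals mu."""
    return math.exp(-math.log(2.0) / mu ** beta)


def dw_cdf(y, q, beta):
    """CDF of the type I discrete Weibull: F(y) = 1 - q**((y + 1)**beta), y = 0, 1, ..."""
    if y < 0:
        return 0.0
    return 1.0 - q ** ((y + 1) ** beta)


def dw_pmf(y, q, beta):
    """PMF of the type I discrete Weibull: q**(y**beta) - q**((y + 1)**beta)."""
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def cdw_cdf(y, mu, beta, delta, eta):
    """CDF of a hypothetical contaminated DW mixture.

    Assumption for illustration only: both components share the shifted
    median mu; the second component uses shape beta / eta (eta > 1), which
    gives it a heavier upper tail, and enters with weight delta.
    """
    q_main = dw_q_from_median(mu, beta)
    q_tail = dw_q_from_median(mu, beta / eta)
    return ((1.0 - delta) * dw_cdf(y, q_main, beta)
            + delta * dw_cdf(y, q_tail, beta / eta))


def integer_median(cdf, *params):
    """Smallest nonnegative integer y with cdf(y, *params) >= 0.5."""
    y = 0
    while cdf(y, *params) < 0.5:
        y += 1
    return y


# Example values (arbitrary, chosen only to exercise the functions).
mu, beta, delta, eta = 4.3, 1.5, 0.1, 3.0
q_main = dw_q_from_median(mu, beta)

median_single = integer_median(dw_cdf, q_main, beta)
median_mixture = integer_median(cdw_cdf, mu, beta, delta, eta)

# Upper-tail probabilities P(Y > 30) under each model.
tail_single = 1.0 - dw_cdf(30, q_main, beta)
tail_mixture = 1.0 - cdw_cdf(30, mu, beta, delta, eta)
```

Under this assumed construction, each component's CDF first reaches 0.5 at the same integer, ceil(mu) - 1 for non-integer mu, so the mixture keeps that integer median while placing more probability in the upper tail than the single component does.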
PMID:42227217 | DOI:10.1515/ijb-2025-0066