Geographic disparities and methodological quality of type 2 diabetes prediction models: a systematic review and meta-analysis of 97 models

BMC Endocr Disord. 2026 May 6. doi: 10.1186/s12902-026-02301-2. Online ahead of print.

ABSTRACT

IMPORTANCE: Accurate risk prediction is essential for targeted prevention of type 2 diabetes mellitus (T2DM). However, the global applicability and methodological rigor of existing prediction models remain uncertain.

OBJECTIVE: To systematically review and meta-analyze the geographic distribution, methodological quality, and predictive performance of all published T2DM risk prediction models.

DATA SOURCES: PubMed, Embase, Web of Science, Cochrane Library, CNKI, WanFang, and VIP were searched from inception to December 2025 (eAppendix S1 in the Supplement).

STUDY SELECTION: Studies that developed or validated a multivariable prediction model for incident T2DM in general adult populations and reported at least one performance measure.

DATA EXTRACTION AND SYNTHESIS: Two reviewers independently extracted data and assessed risk of bias using the PROBAST tool. A random-effects meta-analysis was used to pool C-statistics. Heterogeneity was explored via subgroup analyses and meta-regression. The study followed TRIPOD-SRMA and PRISMA reporting guidelines.
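As an illustration of the pooling step, the following is a minimal sketch of a random-effects meta-analysis of C-statistics. The abstract does not say which estimator or transformation was used. This sketch assumes the DerSimonian-Laird between-study variance estimator and pools on the logit scale, a common choice for AUCs. The function name `pool_auc_random_effects` and its inputs are hypothetical and are not taken from the review.

```python
import math


def pool_auc_random_effects(aucs, ses):
    """Pool AUCs with a DerSimonian-Laird random-effects model (assumed estimator).

    aucs: per-model C-statistics, each strictly between 0 and 1
    ses:  standard errors of those C-statistics on the original scale
    """
    # Logit-transform each AUC; the delta method gives the variance on the logit scale.
    y = [math.log(a / (1 - a)) for a in aucs]
    v = [(s / (a * (1 - a))) ** 2 for a, s in zip(aucs, ses)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # DerSimonian-Laird between-study variance (tau^2) and I^2 heterogeneity (%).
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    # Back-transform the pooled estimate and its 95% CI to the AUC scale.
    return {
        "auc": inv_logit(mu),
        "ci_low": inv_logit(mu - 1.96 * se_mu),
        "ci_high": inv_logit(mu + 1.96 * se_mu),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical inputs for illustration only; these are not data from the review.
example = pool_auc_random_effects([0.70, 0.80, 0.90], [0.01, 0.01, 0.01])
```

With widely differing AUCs and small standard errors, as in the hypothetical example, I² is very high and the pooled value is better read as a descriptive summary than as a precise estimate.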

MAIN OUTCOMES AND MEASURES: The primary outcome was the geographic origin of models. Secondary outcomes included pooled measures of discrimination (C-statistic/AUC) stratified by region and an overall assessment of methodological quality (PROBAST).

RESULTS: A total of 65 studies comprising 97 distinct prediction models were included (eTable 1). Geographic distribution was highly skewed, with 70.1% of models developed in Asian populations (China: 47.4%; Japan: 13.4%; South Korea: 9.3%), while only 7.2% originated from the US and 4.1% from Europe. Logistic regression was used in 97.9% of models. External validation was performed for only 21 models (21.6%). According to PROBAST, 91.8% of models were at high risk of bias (eTable 2), primarily due to inadequate handling of missing data, lack of external validation, and poor calibration reporting. Meta-analysis revealed wide variation in discrimination by geographic region (eTable 7). US-based models achieved the highest pooled AUC (0.97; 95% CI, 0.94-0.99), but this finding is likely influenced by overfitting, small-sample bias, and publication bias (see Discussion). European models showed a pooled AUC of 0.84 (0.81-0.87), while Chinese models showed the lowest performance (AUC, 0.79; 0.76-0.82). Due to very high heterogeneity (I² > 80% in most regions), these pooled estimates should be interpreted as descriptive summaries rather than precise estimates of true regional performance. Performance was lowest in prediabetic cohorts (AUC, 0.72; 0.68-0.76); however, this finding is preliminary due to the limited number of models and high heterogeneity. Funnel plot asymmetry suggested potential publication bias (Egger’s test p=0.03). The most frequently included predictors were age (69.1%), body mass index (64.9%), family history of diabetes (44.3%), and waist circumference (39.2%) (eFigure 4 and eTable 3).

CONCLUSIONS AND RELEVANCE: T2DM prediction models exhibit striking geographic inequity and poor methodological quality, with over 90% at high risk of bias. The substantial variation in performance by region and the lack of external validation critically limit their global clinical utility. These findings underscore an urgent need for rigorous external validation in diverse populations and de novo model development in under-represented regions, guided by PROBAST and TRIPOD standards.

TRIAL REGISTRATION: Not applicable.

CLINICAL TRIAL NUMBER: Not applicable.

PMID:42087146 | DOI:10.1186/s12902-026-02301-2

By Nevin Manimala