
Forensic inference in Africa: Evaluating population structure, databases, and regional assignment accuracy

Forensic Sci Int Genet. 2026 Feb 4;84:103441. doi: 10.1016/j.fsigen.2026.103441. Online ahead of print.

ABSTRACT

This study reports novel allele frequencies for 21 autosomal short tandem repeat (aSTR) loci from 538 individuals, along with 11 triallelic profiles, representing seven Bantu-speaking groups in Southern Africa (Ndebele, Pedi, Phuthi, Tsonga, Sotho, Swati, and Xhosa). These data contributed to a comprehensive representation of the Southern Bantu (SB). The resulting SB reference database was evaluated for several forensic applications: extant diversity, population structure, adequacy of alternative reference databases, and continental biogeographical ancestry prediction. Different analytical methods, including summary statistics, multivariate analyses (Multidimensional Scaling, MDS; Discriminant Analysis of Principal Components, DAPC), and Bayesian clustering, detected continental structure, identifying four major clusters: Southern, Eastern, Western, and Horn of Africa. This observation motivated the evaluation of two practical applications of this information: one methodological (alternative reference frequency database) and one predictive (biogeographic assignment). The adequacy of alternative reference databases for representing SB populations (STRidER South Africa, STRidER Africa, African American, and global datasets) was assessed by comparing reciprocal allelic coverage and shifts in random match probabilities (RMPs). Of the databases tested, the STRidER Africa database provided the closest representation of the SB. Population-level analyses indicated the need for a stratification correction (θ = 0.005 or 0.01) for SB populations. Intracontinental biogeographic prediction was assessed using an XGBoost machine learning classification model across four major African regions. The model's predictive balanced accuracy ranged from 80% to 94% across African regions (94% for the Horn of Africa, 87% for Southern Africa, 84% for Western Africa, and 80% for Eastern Africa). The accuracy and limitations of this practice are discussed, along with its ethical implications. The assessment of reference databases can be extended to more general applications across Africa.
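
To illustrate how a stratification correction such as θ = 0.005 or 0.01 shifts a random match probability, the sketch below applies the widely used Balding-Nichols correction to per-locus genotype frequencies. It is a minimal illustration and not the authors' pipeline: the abstract does not say which formula or software was used, the function names are hypothetical, and the allele frequencies are made-up values, not data from the study.

```python
def match_probability(p, q=None, theta=0.0):
    """Per-locus genotype match probability with a Balding-Nichols theta correction.

    p, q  -- allele frequencies (hypothetical inputs); q=None means a homozygote.
    theta -- stratification (coancestry) correction; theta=0 gives p^2 or 2pq.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        # Homozygote: [2θ + (1-θ)p][3θ + (1-θ)p] / [(1+θ)(1+2θ)]
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    # Heterozygote: 2[θ + (1-θ)p][θ + (1-θ)q] / [(1+θ)(1+2θ)]
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


def profile_rmp(genotypes, theta=0.0):
    """Multiply per-locus match probabilities, assuming independent loci."""
    rmp = 1.0
    for p, q in genotypes:
        rmp *= match_probability(p, q, theta)
    return rmp


# Illustrative three-locus profile (made-up frequencies; None marks a homozygote).
example_profile = [(0.10, 0.20), (0.05, None), (0.15, 0.08)]

for theta in (0.0, 0.005, 0.01):
    print(f"theta={theta}: RMP={profile_rmp(example_profile, theta):.3e}")
```

For the rare alleles used here, a larger θ yields a larger, more conservative RMP, which is the kind of shift the database comparison measures.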

PMID:41702037 | DOI:10.1016/j.fsigen.2026.103441

By Nevin Manimala
