A Scalable Framework for Comprehensive Typing of Polymorphic Immune Genes from Long-Read Data

Adv Sci (Weinh). 2026 Feb 11:e21531. doi: 10.1002/advs.202521531. Online ahead of print.

ABSTRACT

Long-read sequencing promises to unravel the complexity of polymorphic immune genes including HLA, KIR, IG, and TCR, yet existing tools fall short in accuracy and scope. Here, we present SpecImmune, the first unified computational framework to simultaneously genotype these genes alongside the intricate CYP family from long-read data. Employing an iterative graph-based haplotype reconstruction algorithm, SpecImmune delivers precise diploid assemblies for each locus from diverse data types. Validated on 1019 samples from the 1kGP ONT cohort, 42 PacBio CLR and 9 PacBio HiFi samples from HGSVC, and 47 PacBio HiFi plus 37 ONT samples from HPRC, SpecImmune achieved 98% four-field HLA typing accuracy, surpassing HLA*LA by 11% and SpecHLA by 12%. It also delivers robust KIR and germline IG/TCR genotyping and supports multi-locus CYP allele detection, positioning it among the first integrated long-read solutions across immune gene families. Beyond superior performance, SpecImmune uncovers elevated germline IG/TCR heterozygosity in African populations ($p = 9.45 \times 10^{-86}$) and, through 1kGP analysis, suggests widespread cross-family co-evolution, clustering immune genes into two functionally distinct communities: the Integrated Immune-Metabolic Community and the Adaptive Presentation Community. Additionally, it enables allele-specific drug dosing recommendations and offers flexible customization for new loci, advancing immunology, precision medicine, and evolutionary genomics.
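To make the "four-field HLA typing accuracy" figure concrete, the following is a minimal sketch of how per-allele concordance at a chosen field resolution could be computed. It is not the scoring procedure used in the paper, whose exact definition the abstract does not give. The function names `truncate_allele` and `n_field_accuracy` and the sample genotypes are hypothetical and chosen only for illustration; the sketch assumes standard HLA nomenclature, in which an allele name such as `HLA-A*02:01:01:01` carries up to four colon-separated fields.

```python
from collections import Counter


def truncate_allele(allele: str, fields: int) -> str:
    """Keep the gene name and the first `fields` colon-separated fields."""
    gene, sep, spec = allele.partition("*")
    return gene + sep + ":".join(spec.split(":")[:fields])


def n_field_accuracy(calls: dict, truth: dict, fields: int = 4) -> float:
    """Fraction of truth alleles matched by a call at the given resolution.

    Alleles are compared as unordered pairs per sample (phase is ignored),
    which is an assumption of this sketch.
    """
    matched = 0
    total = 0
    for sample, true_alleles in truth.items():
        expected = Counter(truncate_allele(a, fields) for a in true_alleles)
        called = Counter(truncate_allele(a, fields) for a in calls.get(sample, ()))
        # Multiset intersection counts each truth allele at most once.
        matched += sum((expected & called).values())
        total += sum(expected.values())
    return matched / total if total else 0.0


# Hypothetical genotypes for two samples at a single locus (HLA-A).
truth = {
    "s1": ("HLA-A*02:01:01:01", "HLA-A*24:02:01:01"),
    "s2": ("HLA-A*01:01:01:01", "HLA-A*03:01:01:01"),
}
calls = {
    "s1": ("HLA-A*24:02:01:01", "HLA-A*02:01:01:02"),  # one fourth-field mismatch
    "s2": ("HLA-A*01:01:01:01", "HLA-A*03:01:01:01"),
}

print(n_field_accuracy(calls, truth, fields=4))  # 0.75
print(n_field_accuracy(calls, truth, fields=3))  # 1.0
```

In this toy input the single mismatch lies in the fourth field, so the call counts as correct at three-field resolution but not at four-field resolution, which is why four-field accuracy is the stricter measure.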

PMID:41669879 | DOI:10.1002/advs.202521531

By Nevin Manimala
