Mol Biol Evol. 2026 Apr 30:msag111. doi: 10.1093/molbev/msag111. Online ahead of print.
ABSTRACT
At the molecular level, selection pressures often act on protein structural features, yet most evolutionary analyses remain confined to linear sequences. Early structure-informed approaches improved interpretability by mapping single-site metrics onto protein structures, and later methods introduced 3D sliding windows to capture spatially clustered signals missed by linear window approaches. These frameworks, however, are restricted to predefined statistics and narrowly defined 3D window types, limiting the scope of questions that can be addressed. We developed an R package, evo3D, as a new framework for structure-informed evolutionary analysis that supports a wide range of downstream statistics and scales from simple to complex structures. evo3D extracts structure-informed multiple sequence alignment subsets (spatial haplotypes), making the structure-informed unit of analysis directly available to users. The framework supports fixed-count and fixed-distance spatial windows, introduces residue and codon analysis modes, and extends to multimers, interfaces, and multiple structural models through a single wrapper, run_evo3d(). We demonstrate evo3D’s utility by performing an epitope-level diversity scan of the Hepatitis C virus E1/E2 complex, identifying conserved spatial neighbourhoods missed by linear sliding windows, and by evaluating evo3D’s scalability on the octameric Chikungunya virus E1/E2 assembly. Importantly, evo3D formalises the core components of structure-informed analysis of molecular evolution and removes technical barriers to carrying it out. As a result, the framework streamlines the evaluation of evolutionary patterns directly within 3D structural contexts, and we anticipate its wide application in molecular evolution studies. The package is available at github.com/bbroyle/evo3D.
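ILLUSTRATIVE SKETCH
The two window types and the notion of a spatial haplotype can be illustrated with a minimal sketch. It is written in Python rather than R and is not evo3D's implementation or API: the function names, the toy coordinates (one representative point per residue, spaced 3.8 Å apart in a hairpin), and the toy alignment are all assumptions made for illustration. A fixed-count window takes the k residues nearest a focal residue, a fixed-distance window takes every residue within a given radius, and the spatial haplotype is the alignment restricted to those columns.

```python
import math

def fixed_count_window(coords, center, k):
    """Indices of the k residues nearest to `center` (the focal residue included)."""
    ranked = sorted(
        range(len(coords)),
        key=lambda i: (math.dist(coords[center], coords[i]), i),
    )
    return sorted(ranked[:k])

def fixed_distance_window(coords, center, radius):
    """Indices of all residues within `radius` of `center`."""
    return [
        i for i, xyz in enumerate(coords)
        if math.dist(coords[center], xyz) <= radius
    ]

def spatial_haplotypes(msa, window):
    """Restrict every aligned sequence to the columns in `window`."""
    return ["".join(seq[i] for i in window) for seq in msa]

# Hypothetical toy data: six residues folded into a hairpin, so that
# residue 5 sits next to residue 0 in space despite being far in sequence.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
# Hypothetical alignment; columns correspond one-to-one to the residues above.
msa = ["ACDEFG", "ACDEFG", "ACDQFH", "TCDEFG"]

count_window = fixed_count_window(coords, center=0, k=3)
distance_window = fixed_distance_window(coords, center=0, radius=6.0)
linear_window = [0, 1, 2]  # a conventional linear sliding window, for contrast

spatial = spatial_haplotypes(msa, count_window)
linear = spatial_haplotypes(msa, linear_window)

print("fixed-count window:", count_window)
print("fixed-distance window:", distance_window)
print("distinct spatial haplotypes:", len(set(spatial)))
print("distinct linear haplotypes:", len(set(linear)))
```

In this toy case the spatial window pulls in residue 5, which a linear window centred on the same position would not reach, so the two windows yield different haplotype sets. Any downstream statistic could then be computed on the extracted haplotypes; the count of distinct haplotypes is used here only because it is the simplest to show.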
PMID:42060840 | DOI:10.1093/molbev/msag111