Brief Bioinform. 2025 Aug 31;26(5):bbaf573. doi: 10.1093/bib/bbaf573.
ABSTRACT
Compared with traditional sequence-based methods, artificial intelligence (AI) approaches offer distinct advantages, such as significantly improved structural recognition efficiency and the ability to overcome inherent limitations of sequence alignment. Here, we introduce an AI-driven framework designed to discover synthetic binding protein (SBP)-like scaffolds from the entire known proteome. The framework integrates the deep learning-based FoldSeek with our in-house holistic protein attributes assessment (HP2A) algorithm and enables subsequent protein function annotation and evolutionary analysis. As a proof of concept, four representative SBPs (Affibody, Anticalin, DARPin, and Fynome) were used as queries to discover SBP-like scaffolds. The results demonstrate that some of the identified SBP-like proteins, despite their low sequence similarity (identity ≤ 0.3), exhibit significant structural resemblance to the templates (template modeling score, TM-score ≥ 0.5), highlighting the large sequence space available within a specific protein scaffold. Statistical analysis identifies key biophysical properties that contribute to privileged scaffold functionality. Additionally, evolutionary insights derived from potential SBP-like scaffolds provide valuable guidance for protein binder design, as validated through targeted sequence analysis and in silico site-directed mutagenesis. This work highlights the potential of our framework to facilitate the discovery of high-quality engineered protein scaffolds, paving the way for the development of novel SBPs.
PMID:41165486 | DOI:10.1093/bib/bbaf573
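
The selection criterion quoted in the abstract (sequence identity ≤ 0.3 together with TM-score ≥ 0.5) can be illustrated with a minimal sketch. This is not the authors' pipeline and does not reproduce FoldSeek or HP2A; the `Hit` record, the `select_sbp_like` function, and the example hits are hypothetical and serve only to show how the two thresholds combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One structural-search hit against an SBP template (hypothetical record)."""
    target_id: str
    seq_identity: float  # fraction of identical aligned residues, 0..1
    tm_score: float      # template modeling score, 0..1


def select_sbp_like(hits, max_identity=0.3, min_tm_score=0.5):
    """Keep hits that are sequence-divergent but structurally similar.

    The default thresholds are the ones stated in the abstract; everything
    else about this function is an illustrative assumption.
    """
    return [
        h for h in hits
        if h.seq_identity <= max_identity and h.tm_score >= min_tm_score
    ]


# Made-up example hits, not data from the paper.
hits = [
    Hit("hypothetical_A", 0.22, 0.61),  # low identity, similar fold: kept
    Hit("hypothetical_B", 0.45, 0.72),  # similar fold, identity too high
    Hit("hypothetical_C", 0.18, 0.41),  # low identity, fold too dissimilar
    Hit("hypothetical_D", 0.30, 0.50),  # exactly on both thresholds: kept
]

selected = select_sbp_like(hits)
print([h.target_id for h in selected])
```

Both comparisons are inclusive, matching the "≤" and "≥" in the abstract, so a hit that sits exactly on both thresholds is retained.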