Nevin Manimala Statistics

Subtype-MGTP: a cancer subtype identification framework based on Multi-Omics translation

Bioinformatics. 2024 Jun 10:btae360. doi: 10.1093/bioinformatics/btae360. Online ahead of print.


MOTIVATION: The identification of cancer subtypes plays a crucial role in cancer research and treatment. With the rapid development of high-throughput sequencing technologies, there has been an exponential accumulation of cancer multi-omics data. Integrating multi-omics data has emerged as a cost-effective and efficient strategy for cancer subtyping. While current methods primarily rely on genomics data, protein expression data offers a closer representation of phenotype. Therefore, integrating protein expression data holds promise for enhancing subtyping accuracy. However, the scarcity of protein expression data compared to genomics data presents a challenge in its direct incorporation into existing methods. Moreover, striking a balance between omics-specific learning and cross-omics learning remains a prevalent challenge in current multi-omics integration methods.

RESULTS: We introduce Subtype-MGTP, a novel cancer subtyping framework based on the translation of Multiple Genomics To Proteomics. Subtype-MGTP comprises two modules: a translation module, which leverages available protein data to translate multi-type genomics data into predicted protein expression data, and an improved deep subspace clustering module, which integrates contrastive learning to cluster the predicted protein data, yielding refined subtyping results. Extensive experiments conducted on benchmark datasets demonstrate that Subtype-MGTP outperforms nine state-of-the-art cancer subtyping methods. The interpretability of clustering results is further supported by the clinical and survival analysis. Subtype-MGTP also exhibits strong robustness against varying rates of missing protein data and demonstrates distinct advantages in integrating multi-omics data with imbalanced multi-omics data.

AVAILABILITY AND IMPLEMENTATION: The code and results are available at

PMID:38857453 | DOI:10.1093/bioinformatics/btae360

By Nevin Manimala

Portfolio Website for Nevin Manimala