
scTrans: Sparse attention powers fast and accurate cell type annotation in single-cell RNA-seq data

PLoS Comput Biol. 2025 Apr 4;21(4):e1012904. doi: 10.1371/journal.pcbi.1012904. eCollection 2025 Apr.

ABSTRACT

Cell type annotation is crucial in single-cell RNA sequencing data analysis because it enables significant biological discoveries and deepens our understanding of tissue biology. Given the high-dimensional and highly sparse nature of single-cell RNA sequencing data, most existing annotation tools focus on highly variable genes to reduce dimensionality and computational load. However, this approach inevitably results in information loss, potentially weakening the model’s generalization performance and adaptability to novel datasets. To mitigate this issue, we developed scTrans, a single-cell Transformer-based model, which employs sparse attention to utilize all non-zero genes, thereby effectively reducing the input data dimensionality while minimizing information loss. We validated the speed and accuracy of scTrans by performing cell type annotation on 31 different tissues within the Mouse Cell Atlas. Remarkably, even with datasets nearing a million cells, scTrans efficiently performs cell type annotation with limited computational resources. Furthermore, scTrans demonstrates strong generalization capabilities, accurately annotating cells in novel datasets and generating high-quality latent representations, which are essential for precise clustering and trajectory analysis.
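
The idea of feeding a model only the non-zero genes of each cell can be illustrated with a minimal sketch. This is not the scTrans implementation: the function name `nonzero_gene_tokens`, the padding value, and the toy matrix are assumptions made for illustration. It shows how a cell-by-gene matrix could be turned into short per-cell lists of (gene index, expression value) pairs, plus a mask that an attention layer could use to ignore padding.

```python
import numpy as np


def nonzero_gene_tokens(expr, pad_id=-1):
    """Keep only the non-zero genes of each cell (hypothetical helper).

    expr: array of shape (n_cells, n_genes) with expression values.
    Returns gene_ids, values, mask, each of shape (n_cells, max_nonzero),
    where max_nonzero is the largest number of non-zero genes in any cell.
    """
    expr = np.asarray(expr, dtype=float)
    n_cells = expr.shape[0]
    nnz_per_cell = (expr != 0).sum(axis=1)
    max_len = int(nnz_per_cell.max()) if n_cells else 0

    gene_ids = np.full((n_cells, max_len), pad_id, dtype=np.int64)
    values = np.zeros((n_cells, max_len), dtype=float)
    mask = np.zeros((n_cells, max_len), dtype=bool)  # True = real gene token

    for i in range(n_cells):
        idx = np.flatnonzero(expr[i])  # indices of expressed genes
        k = idx.size
        gene_ids[i, :k] = idx
        values[i, :k] = expr[i, idx]
        mask[i, :k] = True
    return gene_ids, values, mask


# Toy example: 3 cells x 6 genes, mostly zeros (assumed data).
expr = np.array(
    [
        [0, 3, 0, 0, 1, 0],
        [2, 0, 0, 0, 0, 0],
        [0, 0, 5, 1, 0, 4],
    ],
    dtype=float,
)
gene_ids, values, mask = nonzero_gene_tokens(expr)
```

In this toy case the input per cell shrinks from 6 genes to at most 3 tokens, and no expressed gene is discarded. With real single-cell data, where most entries are zero, the same step would shorten sequences far more, which is what keeps attention affordable without restricting the input to highly variable genes.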

PMID:40184563 | DOI:10.1371/journal.pcbi.1012904

By Nevin Manimala
