
Accurately Assigning Peptides to Spectra When Only a Subset of Peptides Are Relevant

J Proteome Res. 2021 Jul 8. doi: 10.1021/acs.jproteome.1c00483. Online ahead of print.

ABSTRACT

The standard proteomics database search strategy involves searching spectra against a peptide database and estimating the false discovery rate (FDR) of the resulting set of peptide-spectrum matches. One assumption of this protocol is that all the peptides in the database are relevant to the hypothesis being investigated. However, in settings where researchers are interested in only a subset of peptides, alternative search and FDR control strategies are needed. Recently, two methods were proposed to address this problem: subset-search and all-sub. We show that both methods fail to control the FDR. For subset-search, this failure is due to the presence of “neighbor” peptides, which are defined as irrelevant peptides whose precursor mass and fragmentation spectrum are similar to those of a relevant peptide. Not considering neighbors compromises the FDR estimate because a spectrum generated by an irrelevant peptide can incorrectly match well to a relevant peptide. Therefore, we have developed a new method, “subset-neighbor search” (SNS), that accounts for neighbor peptides. We show evidence that SNS controls the FDR when neighbors are present and that SNS outperforms group-FDR, the only other method that appears to control the FDR relative to a subset of relevant peptides.
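
To make the notion of a "neighbor" peptide concrete, the following is a minimal Python sketch of one way such a check could look. It is not the published SNS procedure: the function names, the 50 ppm precursor tolerance, the fragment bin width, and the 0.25 shared-fragment threshold are all assumptions chosen for illustration. It treats an irrelevant peptide as a neighbor of a relevant one when their monoisotopic masses are close and a sizeable fraction of the relevant peptide's singly charged b and y ions fall into the same m/z bins.

```python
# Illustrative sketch only; thresholds and names are hypothetical,
# not taken from the SNS paper.

# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def fragment_mzs(seq):
    """Singly charged b and y ion m/z values for an unmodified peptide."""
    total = sum(MONO[aa] for aa in seq)
    ions = []
    prefix = 0.0
    for aa in seq[:-1]:
        prefix += MONO[aa]
        ions.append(prefix + PROTON)                  # b ion
        ions.append(total - prefix + WATER + PROTON)  # complementary y ion
    return ions


def shared_fraction(relevant, irrelevant, bin_width=1.0005):
    """Fraction of the relevant peptide's fragment bins also hit by the other peptide."""
    rel_bins = {int(mz / bin_width) for mz in fragment_mzs(relevant)}
    irr_bins = {int(mz / bin_width) for mz in fragment_mzs(irrelevant)}
    return len(rel_bins & irr_bins) / len(rel_bins)


def is_neighbor(relevant, irrelevant, mass_tol_ppm=50.0, min_shared=0.25):
    """Hypothetical neighbor test: close precursor mass AND many shared fragments."""
    m_rel = peptide_mass(relevant)
    ppm = abs(m_rel - peptide_mass(irrelevant)) / m_rel * 1e6
    if ppm > mass_tol_ppm:
        return False
    return shared_fraction(relevant, irrelevant) >= min_shared


if __name__ == "__main__":
    # Leucine/isoleucine variants are isobaric and share every b/y ion.
    print(is_neighbor("PEPTIDLK", "PEPTIDIK"))
    # A peptide with a very different mass cannot be a neighbor under this test.
    print(is_neighbor("PEPTIDLK", "GGGGGGGG"))
```

Under these assumptions, a spectrum produced by an irrelevant peptide such as the isoleucine variant would score nearly as well against the relevant leucine variant, which is the situation the abstract describes as compromising the FDR estimate.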

PMID: 34236864 | DOI: 10.1021/acs.jproteome.1c00483

By Nevin Manimala
