J Chem Inf Model. 2022 Jul 29. doi: 10.1021/acs.jcim.2c00620. Online ahead of print.
Tandem mass spectrometry (MS/MS) is a primary tool for identifying small molecules and metabolites; the resulting spectra are most commonly identified by matching them against spectra in MS/MS reference libraries. The high variability in MS/MS spectrum acquisition techniques and parameters makes building standardized reference libraries a significant challenge. Here we present a method to improve the usefulness of existing MS/MS libraries by augmenting available experimental spectral data sets with statistically interpolated spectra at unreported collision energies. We find that highly accurate spectral approximations can be interpolated from as few as three experimental spectra and that the interpolated spectra are consistent with true spectra gathered on the same instrument as the experimental spectra. Supplementing existing spectral databases with interpolated spectra yields consistent improvements in identification accuracy across a range of instruments and precursor types. Applying this method yields significant improvements (∼10% more spectra correctly identified) on large data sets (2000-10,000 spectra), indicating that it is a quick and effective tool for improving spectral matching where available reference libraries are not yet sufficient. We also find improvements in matching spectra across instrument types (between an Agilent Q-TOF and an Orbitrap Elite), at high collision energies (50-90 eV), and with smaller data sets available through MassBank.
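The idea of filling in a spectrum at an unreported collision energy can be illustrated with a minimal sketch. The abstract does not state which statistical interpolation is used, so the example below assumes plain linear interpolation of each peak's intensity between the two nearest measured collision energies; this is a stand-in, not the authors' method. The function names (`interpolate_spectrum`, `cosine_similarity`), the dictionary layout, and all m/z and intensity values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left
from math import sqrt


def interpolate_spectrum(library, target_ev):
    """Estimate a spectrum at target_ev from measured spectra.

    library maps collision energy (eV) -> {m/z: intensity}.
    ASSUMPTION: linear interpolation between the two nearest measured
    energies; the source only says "statistically interpolated".
    """
    energies = sorted(library)
    if not energies[0] <= target_ev <= energies[-1]:
        raise ValueError("target energy lies outside the measured range")
    if target_ev in library:
        # A measured spectrum already exists at this energy.
        return dict(library[target_ev])

    # Locate the measured energies that bracket the target.
    upper_index = bisect_left(energies, target_ev)
    lo, hi = energies[upper_index - 1], energies[upper_index]
    weight = (target_ev - lo) / (hi - lo)

    # A peak absent from one of the two spectra counts as intensity 0.
    peaks = set(library[lo]) | set(library[hi])
    return {
        mz: (1 - weight) * library[lo].get(mz, 0.0)
        + weight * library[hi].get(mz, 0.0)
        for mz in peaks
    }


def cosine_similarity(a, b):
    """Cosine score between two spectra keyed by m/z (a common match score)."""
    dot = sum(intensity * b.get(mz, 0.0) for mz, intensity in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical library: three measured spectra for one compound.
library = {
    10: {105.0: 100.0, 77.0: 5.0},
    20: {105.0: 60.0, 77.0: 40.0},
    40: {105.0: 10.0, 77.0: 100.0, 51.0: 30.0},
}

estimated = interpolate_spectrum(library, 30)
print(sorted(estimated.items()))
# → [(51.0, 15.0), (77.0, 70.0), (105.0, 35.0)]

# Score a hypothetical query spectrum against the interpolated entry.
query = {105.0: 30.0, 77.0: 75.0, 51.0: 12.0}
print(round(cosine_similarity(query, estimated), 3))
```

In this sketch the interpolated entry at 30 eV would be added to the library alongside the three measured spectra, so that a query acquired near 30 eV has a closer reference to match against than either the 20 eV or the 40 eV spectrum alone.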