The National Institute of Standards and Technology is developing a mass spectral library of all identifiable components derived from the digestion of monoclonal antibodies, including peptides, glycans, and glycopeptides. This evaluated tandem mass spectral library aims to provide reference data for laboratories using mass spectrometry to discover and identify monoclonal antibody structures and modifications through the fragmentation of their ions generated by electrospray ionization. It is an extension of the NIST/EPA/NIH Mass Spectral Library and the NIST Peptide Mass Spectral Libraries. Using mass spectral libraries to identify these compounds is more sensitive and robust than interpreting the mass spectra by theoretical methods. These databases are available for testing and integration with existing instrumentation (peptides: http://peptide.nist.gov, glycans: http://chemdata.nist.gov/mass-spc/msms-search/).
Modern mass spectrometers used in proteomics and glycomics can profile hundreds, or even thousands, of molecules in a single experiment. Each of these compounds is isolated and fragmented to form a mass spectrum, so interpreting these mass spectra is a critical step in the experimental workflow. Because peptide and glycan mass spectra represent physical properties of these molecules, standardized interpretation of these mass spectra has the potential to improve the success rate of all discovery experiments in proteomics and glycomics.
Biological mass spectrometry is a critical tool in understanding monoclonal antibodies. NIST researchers are using their expertise in building mass spectral libraries for other small molecules to compile a comprehensive library of consensus mass spectra of peptides, glycans, glycopeptides, and other important compounds. Developing a standard method for interpreting these mass spectral data and a comprehensive library of high-quality tandem mass spectra is critical for establishing and advancing this technology.
While the mass spectrometers used to identify peptides in proteomics have improved greatly over the past ten years, computer algorithms for peptide identification have not. Traditionally, this process involves a step in which theoretical peptide fragmentation spectra are predicted from protein sequences. These spectra typically contain peaks at the correct m/z values but carry little or no information about relative intensities (i.e., peak heights) or about less common fragmentation products.
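The limitation described above can be illustrated with a minimal Python sketch of sequence-based prediction. It is not the method used by any particular search engine; the function name `theoretical_fragments` is hypothetical, and the sketch assumes singly charged b and y ions, monoisotopic residue masses, and no modifications. Note that the output holds m/z values only: nothing in the calculation says how intense each peak should be.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def theoretical_fragments(peptide):
    """Return {ion label: m/z} for singly charged b and y ions.

    Hypothetical helper for illustration: unmodified peptides only.
    """
    masses = [MONO[aa] for aa in peptide]
    n = len(masses)
    ions = {}

    # b ions: N-terminal fragments, sum of the first i residues plus a proton.
    running = 0.0
    for i in range(1, n):
        running += masses[i - 1]
        ions[f"b{i}"] = running + PROTON

    # y ions: C-terminal fragments, sum of the last i residues
    # plus water plus a proton.
    running = 0.0
    for i in range(1, n):
        running += masses[n - i]
        ions[f"y{i}"] = running + WATER + PROTON

    return ions


if __name__ == "__main__":
    for label, mz in theoretical_fragments("PEPTIDE").items():
        print(f"{label}: {mz:.4f}")
```

Every predicted peak here would be drawn at the same height, which is exactly the information gap that measured library spectra fill.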
Screenshot of the MS Search 2.0 software showing a library match of an unknown glycan with a glycan in the library.
Ion components of a tryptic digest of cetuximab.
Moreover, hundreds of glycans may be detected in a single experiment, but identifying these compounds depends on the availability of libraries of standard reference mass spectra. Mass spectral libraries are built from measured spectra of known compounds and enable the use of sensitive search algorithms. The use of these algorithms and libraries (1) will lead to a higher percentage of identified spectra at the same level of reliability and (2) will greatly increase the robustness of the glycan identification step.
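As a rough illustration of how a library search can use measured intensities, the sketch below scores a query spectrum against reference spectra with a plain cosine (normalized dot product) similarity. This is an assumption chosen for simplicity, not the scoring used by the NIST MS Search software; the function names, the 1 Da bin width, and the example peak lists are all hypothetical.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity between two binned spectra (0 = no overlap, 1 = identical)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0


def best_match(query, library, bin_width=1.0):
    """Return the name of the library entry with the highest cosine score."""
    return max(library, key=lambda name: cosine_score(query, library[name], bin_width))


if __name__ == "__main__":
    # Invented peak lists for illustration only; not real reference data.
    library = {
        "glycan_A": [(204.08, 100.0), (366.13, 50.0), (528.20, 20.0)],
        "glycan_B": [(163.06, 100.0), (325.11, 40.0), (487.17, 15.0)],
    }
    unknown = [(204.09, 90.0), (366.14, 55.0), (528.19, 25.0)]
    print(best_match(unknown, library))
```

Because the score depends on peak heights as well as peak positions, two compounds that share fragment m/z values but differ in fragmentation pattern can still be told apart, which is not possible with m/z-only theoretical spectra.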
The data for this project are both generated in-house at NIST and collected from many outside sources. NIST also has data exchange agreements with several international proteomics data repositories in order to share the most relevant data efficiently.
To date, the small molecule mass spectral library contains >120,000 spectra for >15,000 ions of >7,000 compounds of biological and environmental relevance (including metabolites, bioactive peptides, amino acids and small peptides, sugars and glycans, lipids and phospholipids, drugs, pesticides, surfactants, and various contaminants). Several of the peptide libraries, including human, yeast, and E. coli, provide significant coverage of the corresponding proteomes and are suitable for routine use.
The spectra in these libraries are experimentally validated, critically evaluated, and annotated in detail. Their use, in combination with or as an alternative to sequence-based identification methods, has been shown to double the number of peptide identifications for some data sets.
NIST Peptide Mass Spectral Libraries for Proteomics and Chemical Analysis
NIST/EPA/NIH Mass Spectral Library with Search Program (SRD 1A) – an interactive access to experimental data produced at NIST and by its customers organized by SRM.
NIST Chemistry Webbook
Electron Ionization Library Component of the NIST/EPA/NIH Mass Spectral Library and NIST GC Retention Index Database
Mass Spectrometry Tools