Tandem mass-spectrometry has become a method of choice for high-throughput, quantitative analysis in proteomics. Being located next to a world-leading mass-spectrometry facility, we have a keen interest in applying bioinformatics methods to proteomics data and developing new methods.
Since the link between peptides and the proteins they originate from is broken in shotgun proteomics, peptide identification relies on matching observed fragmentation spectra to the theoretical spectra of candidate peptides. We are developing a method for efficiently predicting the dominant peaks in the fragmentation spectra of the candidate peptides. The algorithm uses higher-order Hidden Markov Models to learn the relationship between the sequence of a parent ion and how it will most commonly fragment. By predicting which peaks are dominant, we aim to improve the accuracy with which peptides can be matched, thereby increasing both the number of proteins that can be identified and the sequence coverage.
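To make the matching step concrete, the following is a minimal sketch of the general idea, not the method under development in the group. It assumes unmodified peptides, singly charged b- and y-ions only, and a fixed absolute mass tolerance. The function names theoretical_fragments and count_matches and the default tolerance of 0.02 Da are hypothetical choices made for illustration. Every theoretical peak is treated as equally likely here, which is exactly the simplification that predicting dominant peaks aims to remove.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def theoretical_fragments(peptide):
    """Return (b_ions, y_ions): singly charged m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    # Each backbone cleavage site i splits the peptide into an N-terminal
    # b-ion (residues before i) and a C-terminal y-ion (residues from i on).
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[i:]) + WATER + PROTON)
    return b_ions, y_ions


def count_matches(observed_mz, peptide, tol=0.02):
    """Count theoretical peaks that have an observed peak within tol Da (assumed tolerance)."""
    b_ions, y_ions = theoretical_fragments(peptide)
    theoretical = b_ions + y_ions
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in theoretical
    )
```

A candidate peptide whose theoretical peaks are matched more often by the observed spectrum would rank higher under this naive count; a scoring scheme informed by predicted peak dominance would instead weight each theoretical peak by how likely it is to be observed.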
Analysis of in-house data
We collaborate with all groups in the Proteomics program to perform advanced analyses of the mass-spectrometry datasets that they produce. The projects are too numerous and change too frequently to list here, but most relate to the analysis of post-translational modifications, including their regulation, evolutionary conservation, and structural context. Besides numerous joint publications, these collaborations have led to several of the proteomics tools developed within the group.