Relative Quantification of Proteins and Post-Translational Modifications in Proteomic Experiments with Shared Peptides: A Weight-Based Approach

In proteomics research, mass spectrometry (MS) is widely used to measure changes in protein abundance and structure. Protein quantification, however, faces a critical challenge: many peptides are shared, that is, they occur in the sequences of more than one protein. Traditional methods typically quantify proteins from unique peptides alone and discard the information carried by shared peptides, which can lead to biased or inaccurate results. The problem is particularly complex when studying protein isoforms or post-translational modifications (PTMs), where shared peptides complicate quantitative analysis.
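As a small illustration of the distinction between unique and shared peptides, the following sketch groups peptides by the proteins they match. The peptide sequences and protein names are hypothetical placeholders, not data from the paper.

```python
from collections import defaultdict

# Hypothetical peptide-to-protein matches (illustrative names only).
matches = [
    ("PEPTIDEA", "P1"),
    ("PEPTIDEA", "P2"),   # PEPTIDEA occurs in two proteins
    ("PEPTIDEB", "P1"),
    ("PEPTIDEC", "P3"),
]

# Collect the set of proteins each peptide maps to.
proteins_by_peptide = defaultdict(set)
for peptide, protein in matches:
    proteins_by_peptide[peptide].add(protein)

# A peptide is unique if it maps to exactly one protein, shared otherwise.
unique = sorted(p for p, prots in proteins_by_peptide.items() if len(prots) == 1)
shared = sorted(p for p, prots in proteins_by_peptide.items() if len(prots) > 1)

print("unique:", unique)
print("shared:", shared)
```

A method that uses only unique peptides would quantify P1 from PEPTIDEB alone and would have no peptide at all for P2.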

To address this issue, researchers have proposed a new statistical method that uses the quantitative information from shared peptides to estimate protein abundance and PTM site occupancy more accurately. The method models the quantitative pattern of a shared peptide as a convex combination (a weighted average with non-negative weights that sum to one) of the abundances of the individual proteins or modification sites it matches. It estimates the abundance of each source together with the weights, thereby improving the precision of quantitative analysis.

Source of the Paper

The study was conducted by a team from multiple institutions, including the University of Wrocław (Poland), Hasselt University (Belgium), Northeastern University (USA), Genentech, and Pfizer. The authors include Mateusz Staniak, Ting Huang, and Amanda M. Figueroa-Navedo, among others, with Olga Vitek as the corresponding author. The paper, titled "Relative quantification of proteins and post-translational modifications in proteomic experiments with shared peptides: a weight-based approach", was published in Bioinformatics in 2025.

Research Process and Results

1. Research Design

The study proposes a new statistical model for simultaneously estimating the abundance of multiple proteins or PTM sites in the presence of shared peptides. The method is based on quantitative information from mass spectrometry experiments, particularly those using isobaric labeling techniques (e.g., Tandem Mass Tags, TMT). The research team developed an open-source R package, MSstatsWeightedSummary, to implement this method.

2. Model Construction

The core idea of the model is to represent the quantitative patterns of shared peptides as weighted combinations of the abundances of multiple proteins or PTM sites. Specifically, for each peptide, the model estimates its contribution weights to different proteins or PTM sites and calculates the abundance of each protein or PTM site based on these weights. The model is formulated as:

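\[ x_{cf} = \mu + \sum_{k \in V(f)} \text{weight}_{fk} \left( \text{protein}_k + \text{channel}_{kc} \right) + \text{feature}_f + \epsilon_{cf} \]

Here, \(x_{cf}\) represents the log2-intensity of peptide \(f\) in channel \(c\), \(\mu\) represents the overall mean abundance, \(V(f)\) is the set of proteins (or PTM sites) that peptide \(f\) matches, \(\text{weight}_{fk}\) is the contribution weight of protein \(k\) to peptide \(f\) (the weights of each peptide are non-negative and sum to one), \(\text{protein}_k\) represents the abundance of protein \(k\), \(\text{channel}_{kc}\) represents the effect of channel \(c\) on protein \(k\), \(\text{feature}_f\) represents the peptide-specific effect, and \(\epsilon_{cf}\) represents random error. For a unique peptide, \(V(f)\) contains a single protein and its weight equals one.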
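To make the formula concrete, the sketch below evaluates its noise-free part for one unique and one shared peptide. All numbers, the protein names, and the function `expected_log_intensity` are hypothetical and chosen only for illustration; this is not code from the authors' package.

```python
def expected_log_intensity(mu, weights, protein, channel, feature_effect, c):
    """Noise-free model value for one peptide in channel c.

    weights: dict protein -> weight for this peptide (its keys play the role of V(f))
    protein: dict protein -> protein-level abundance effect
    channel: dict protein -> list of channel effects for that protein
    """
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or abs(total - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    mix = sum(w * (protein[k] + channel[k][c]) for k, w in weights.items())
    return mu + mix + feature_effect

# Hypothetical parameter values for two proteins and two channels.
mu = 10.0
protein = {"P1": 1.0, "P2": -1.0}
channel = {"P1": [0.0, 2.0], "P2": [0.0, 0.0]}   # only P1 changes in channel 1

unique_profile = [
    expected_log_intensity(mu, {"P1": 1.0}, protein, channel, 0.0, c)
    for c in range(2)
]
shared_profile = [
    expected_log_intensity(mu, {"P1": 0.25, "P2": 0.75}, protein, channel, 0.5, c)
    for c in range(2)
]
print(unique_profile)  # [11.0, 13.0]
print(shared_profile)  # [10.0, 10.5]
```

The unique peptide of P1 rises by 2 between the channels, while the shared peptide rises by only 0.25 × 2 = 0.5, because three quarters of its signal is attributed to the unchanged protein P2.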

3. Optimization and Implementation

To estimate the model parameters, the research team employed an iterative optimization algorithm. First, initial protein abundances are estimated from unique peptides. Then, the weights of shared peptides and the protein abundances are updated in alternation until the weights converge. The method uses the Huber loss function, which penalizes large residuals linearly rather than quadratically, to limit the influence of outliers and keep the estimates robust.
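The alternating scheme can be sketched on a deliberately small case: two proteins, one summarized unique-peptide profile per protein, and one shared peptide. The sketch assumes all profiles are centered across channels, so that the overall mean, protein-level, and peptide-specific terms drop out and only the channel profiles remain. The function names, the ternary search, and the reweighted least-squares update are choices made for this illustration and are not taken from the paper or its R package.

```python
def huber(r, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)

def irls_weight(r, delta=1.0):
    """Observation weight that makes a least-squares step mimic the Huber loss."""
    a = abs(r)
    return 1.0 if a <= delta else delta / a

def fit_weight(x, p1, p2, iters=100):
    """Ternary search for w in [0, 1]; the Huber objective is convex in w."""
    def loss(w):
        return sum(huber(xc - (w * a + (1 - w) * b)) for xc, a, b in zip(x, p1, p2))
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if loss(m1) < loss(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def update_profiles(u1, u2, x, w, p1, p2, sweeps=20):
    """Per channel, refit both protein profiles from unique and shared data."""
    new1, new2 = list(p1), list(p2)
    for c in range(len(x)):
        a, b = p1[c], p2[c]
        for _ in range(sweeps):
            k1 = irls_weight(u1[c] - a)
            k2 = irls_weight(u2[c] - b)
            k3 = irls_weight(x[c] - (w * a + (1 - w) * b))
            # Weighted normal equations for the two unknowns a and b.
            a11 = k1 + k3 * w * w
            a12 = k3 * w * (1 - w)
            a22 = k2 + k3 * (1 - w) ** 2
            b1 = k1 * u1[c] + k3 * w * x[c]
            b2 = k2 * u2[c] + k3 * (1 - w) * x[c]
            det = a11 * a22 - a12 * a12   # positive because k1, k2 > 0
            a, b = (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det
        new1[c], new2[c] = a, b
    return new1, new2

def fit(u1, u2, x, tol=1e-8, max_iter=50):
    """Alternate weight and profile updates until the weight stops changing."""
    p1, p2 = list(u1), list(u2)   # initial profiles from unique peptides only
    w = 0.5
    for _ in range(max_iter):
        w_new = fit_weight(x, p1, p2)
        p1, p2 = update_profiles(u1, u2, x, w_new, p1, p2)
        if abs(w_new - w) < tol:
            return w_new, p1, p2
        w = w_new
    return w, p1, p2

# Hypothetical noise-free data: the shared peptide is 0.75 * P1 + 0.25 * P2.
u1 = [-1.0, 0.0, 1.0]
u2 = [1.0, -2.0, 1.0]
x = [-0.5, -0.5, 1.0]
w_hat, p1_hat, p2_hat = fit(u1, u2, x)
print(round(w_hat, 6))
```

On this noise-free input the estimated weight should come out very close to 0.75 and the profiles should stay at their unique-peptide values. With noisy data, the shared peptide would also pull the protein profiles, which is where the gain in precision for proteins with few unique peptides would come from.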

4. Experimental Results

The research team validated the effectiveness of the method using both simulated and experimental data. In simulated data, the method significantly improved the precision of log2-fold change estimation, especially when proteins had only a few unique peptides. In practical experiments, the method was successfully applied to various scenarios, including protein degradation studies, thermal proteome stability analysis, and PTM quantification, demonstrating its broad applicability in different biological studies.
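For readers unfamiliar with the quantity being evaluated, a log2-fold change is the difference between mean log2 abundances in two conditions. The sketch below uses hypothetical abundance values and a hypothetical helper name; it illustrates the definition only and does not reproduce the paper's simulations.

```python
from statistics import mean

def log2_fold_change(treated, control):
    """Difference of mean log2 abundances between two conditions."""
    return mean(treated) - mean(control)

# Hypothetical log2 protein abundances, three channels per condition.
control = [10.0, 10.5, 9.5]
treated = [11.0, 11.5, 10.5]

lfc = log2_fold_change(treated, control)
print(lfc)        # 1.0
print(2 ** lfc)   # 2.0, i.e. a two-fold increase on the original scale
```

More precise protein-level abundances in each channel translate directly into more precise estimates of this difference.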

4.1 Protein Degradation Study

In the protein degradation study, the research team analyzed the degradation kinetics of BET bromodomain proteins. By incorporating information from shared peptides, the method successfully distinguished the degradation rates of different proteins, validating its effectiveness in real-world applications.

4.2 Thermal Proteome Profiling

In thermal proteome profiling, the research team compared protein stability at different temperatures. By introducing quantitative information from shared peptides, the method improved the sensitivity of detecting changes in protein thermal stability, particularly for proteins with only a few unique peptides.

4.3 PTM Quantification

In PTM quantification, the research team studied changes in phosphorylation sites. By incorporating quantitative information from shared peptides, the method successfully distinguished the change patterns of different phosphorylation sites, improving the accuracy of PTM quantification.

5. Conclusion

The study proposes a weighted statistical method based on shared peptides, significantly improving the precision of protein and PTM quantification. By modeling the quantitative patterns of shared peptides, the method addresses the bias issue in traditional methods when shared peptides are present, providing a new tool for proteomics research.

Research Highlights

  1. Innovative Method: The study is the first to propose a weighted statistical model based on shared peptides, filling a gap in proteomics quantitative analysis.
  2. Broad Applicability: The method is not only applicable to protein quantification but also to PTM site quantification, with wide-ranging potential applications.
  3. Open-Source Tool: The research team developed the open-source R package MSstatsWeightedSummary, making it possible for other researchers to use and extend the method.

Research Significance

The study provides new insights and methods for quantitative analysis in proteomics, significantly improving the accuracy and reliability of quantification results, especially when dealing with shared peptides. The application of this method will contribute to a deeper understanding of protein function and regulatory mechanisms, advancing the use of proteomics in biomedical research.