High-Throughput Discovery of Inhibitory Protein Fragments with AlphaFold

High-Precision Prediction of Protein Fragment Inhibitory Activity: The Application of FragFold

Academic Background

Protein-protein interactions are central to cellular processes, and peptides or protein fragments can modulate protein function by binding to specific protein interfaces, in some cases acting as inhibitors. Recent high-throughput experimental techniques have made it possible to measure the inhibitory activity of protein fragments on a large scale in living cells. However, no corresponding computational method had been available to predict which protein fragments bind to a target protein and inhibit it, let alone to predict their binding modes. This gap motivated the development of new computational tools.

The introduction of AlphaFold revolutionized protein structure prediction, but its application in predicting the binding of protein fragments to full-length proteins was still limited. To fill this gap, Andrew Savinov and colleagues developed a computational method called FragFold, aiming to leverage AlphaFold’s high-throughput prediction capabilities to predict the binding modes and inhibitory activities of protein fragments on a large scale.

Paper Source

This paper was co-authored by Andrew Savinov, Sebastian Swanson, Amy E. Keating, and Gene-Wei Li from the Department of Biology, Department of Biological Engineering, and Koch Institute for Integrative Cancer Research at MIT. The study was published on February 3, 2025, in the Proceedings of the National Academy of Sciences (PNAS), titled “High-throughput discovery of inhibitory protein fragments with AlphaFold”.

Research Workflow

1. Development of the FragFold Method

At its core, FragFold runs AlphaFold2 with its monomer model weights through the ColabFold platform to predict, at high throughput, how protein fragments bind to full-length proteins. The researchers chose the monomer model because it was trained solely on single-chain data, which reduces the risk that predictions simply reproduce protein complexes memorized during training. To speed up multiple sequence alignment (MSA) generation, they built the MSA for each full-length protein once and trimmed it to the relevant columns to obtain the MSA for each fragment, rather than running a separate search for every fragment. These MSAs were then fed into AlphaFold2 to generate structural models of each fragment bound to its target protein.
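
The tiling and MSA-trimming idea can be illustrated with a minimal Python sketch. It is not FragFold's actual code: the function names `tile_fragments` and `trim_msa`, the window length, the step size, and the toy alignment are all assumptions made for illustration. The sketch also assumes a gap-padded alignment in which every row has the same length as the query, whereas real a3m files contain lowercase insertion characters that would need handling first.

```python
from typing import Iterator

GAP = "-"


def tile_fragments(sequence: str, length: int = 30, step: int = 1) -> Iterator[tuple[int, str]]:
    """Yield (start, fragment) for every window of `length` residues."""
    for start in range(0, len(sequence) - length + 1, step):
        yield start, sequence[start:start + length]


def trim_msa(msa: list[str], start: int, length: int) -> list[str]:
    """Cut a full-protein alignment down to one fragment's columns.

    The first row is assumed to be the query. Other rows that contain only
    gaps inside the window carry no information and are dropped.
    """
    window = [row[start:start + length] for row in msa]
    query, rest = window[0], window[1:]
    return [query] + [row for row in rest if set(row) != {GAP}]


# Toy alignment (made up): query first, then three homologs.
full_msa = [
    "MKTAYIAKQR",
    "MKSAYLAKQR",
    "----YIAKHR",
    "MKTA------",
]

# One trimmed MSA per fragment start position.
fragment_msas = {
    start: trim_msa(full_msa, start, length=4)
    for start, _ in tile_fragments(full_msa[0], length=4, step=3)
}
```

The point of the design is that the expensive homology search happens once per protein; each fragment then costs only a column slice. How the fragment and full-length MSAs are combined and passed to ColabFold is outside the scope of this sketch.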

FragFold’s innovation lies in looking beyond AlphaFold’s confidence metrics (such as pLDDT and ipTM) alone: it also counts the contacts between fragment and target in each generated structural model (n_contacts) and weights that count by the predicted interface TM-score (ipTM). Doing so allows FragFold to predict the binding modes of protein fragments more accurately.
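
A contact count weighted by ipTM could be computed along the following lines. This is a hedged sketch, not the published definition: it assumes the weighting is a simple product of n_contacts and ipTM, that a contact is any fragment-target atom pair closer than a cutoff, and that 4.0 Å is a reasonable cutoff. The function names and coordinates are hypothetical.

```python
import math

Coord = tuple[float, float, float]


def count_contacts(fragment: list[Coord], target: list[Coord], cutoff: float = 4.0) -> int:
    """Count fragment-target atom pairs closer than `cutoff` angstroms."""
    return sum(
        1
        for a in fragment
        for b in target
        if math.dist(a, b) < cutoff
    )


def weighted_contacts(n_contacts: int, iptm: float) -> float:
    """Scale the raw contact count by the model's ipTM (assumed: a product)."""
    return n_contacts * iptm


# Toy coordinates (made up): two fragment atoms, three target atoms.
fragment_atoms = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
target_atoms = [(3.0, 0.0, 0.0), (0.0, 3.5, 0.0), (20.0, 0.0, 0.0)]

n_contacts = count_contacts(fragment_atoms, target_atoms)
score = weighted_contacts(n_contacts, iptm=0.5)
```

Under these assumptions, a model with many contacts but a low ipTM is down-weighted, as is a confident model that barely touches the target.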

2. Application to Known Protein Interfaces

The researchers first applied FragFold to known protein-protein interaction interfaces, focusing on the FtsZ protein of E. coli. FtsZ is a tubulin-like cytoskeletal protein essential for cell division, and fragments from multiple regions of its polymerization interface have been shown experimentally to have inhibitory activity. FragFold predicted binding peaks that corresponded to these peaks of inhibitory activity, and the predicted binding modes agreed closely with the experimentally determined native binding modes. Specifically, at the four major inhibitory peaks (1, 1’, 2, 2’), the predicted binding modes adopted conformations similar to those in crystal structures, with RMSD (root mean square deviation) values all below 3 Å.
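
For readers unfamiliar with the metric, RMSD between a predicted and a reference binding mode can be computed as in the sketch below. It assumes the two structures have already been superimposed on the target protein and that fragment atoms are listed in matching order; the coordinates are made up, and the 3 Å comparison simply mirrors the threshold quoted above.

```python
import math

Coord = tuple[float, float, float]


def rmsd(predicted: list[Coord], reference: list[Coord]) -> float:
    """Root mean square deviation between matched atom coordinates."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Toy fragment (made up): every atom is displaced by exactly 1 angstrom.
predicted_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_atoms = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

deviation = rmsd(predicted_atoms, reference_atoms)
is_native_like = deviation < 3.0
```

Because the fragment is not re-superimposed on its own, this value reflects both the fragment's conformation and its placement on the target.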

3. Application to Diverse Proteins

To test how broadly FragFold applies, the researchers ran it on structurally and functionally diverse proteins, including 50S ribosomal subunit protein L7/L12, DNA gyrase subunit A (GyrA), and single-stranded DNA binding protein (SSB). FragFold was also accurate for these proteins, correctly predicting the binding modes of 87% of the inhibitory fragments located at known protein interfaces. For example, the dimerization domain fragment of L7/L12 was predicted to bind to L7/L12 with an RMSD of 1.2 Å and a similarity to the native binding mode of 93%.

4. Predicting Unknown Binding Modes

Another important application of FragFold is predicting unknown protein fragment binding modes. Researchers applied FragFold to the C-terminal intrinsically disordered tail of FtsZ, a region unresolved in crystal structures but known to interact with proteins such as FtsA, MinC, and ZipA during cell division. FragFold successfully predicted the binding modes of FtsZ C-terminal fragments to these proteins, aligning well with existing genetic and biochemical data. For instance, FragFold predicted that FtsZ C-terminal fragments could bind to the GTPase active site of FtsZ, preventing its binding to the T7 loop of adjacent FtsZ monomers, consistent with molecular dynamics simulations.

5. Deep Mutational Scanning Validation

To further validate the predicted binding modes of FragFold, researchers conducted deep mutational scanning (DMS) on multiple inhibitory fragments. By mutating each residue of each fragment and measuring changes in inhibitory activity in living cells, they found that key residues predicted by FragFold showed significant changes in inhibitory activity upon mutation. For example, mutations in certain residues of FtsZ fragments led to a substantial decrease in inhibitory activity, while mutations in other residues enhanced it. These experimental results further support the accuracy of FragFold’s predicted binding modes.
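
One simple way to summarize such a scan is to average the effect of all substitutions at each position and flag positions where mutation strongly reduces inhibition. The sketch below is illustrative only: the effect scores are made-up toy values (not data from the study), negative numbers are assumed to mean loss of inhibitory activity relative to the wild-type fragment, and the threshold, function names, and set of predicted contact positions are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_effect_by_position(effects: dict[tuple[int, str], float]) -> dict[int, float]:
    """Average the effect of all substitutions observed at each position."""
    by_position: dict[int, list[float]] = defaultdict(list)
    for (position, _mutant_residue), effect in effects.items():
        by_position[position].append(effect)
    return {pos: mean(values) for pos, values in sorted(by_position.items())}


def sensitive_positions(effects: dict[tuple[int, str], float], threshold: float = -0.5) -> list[int]:
    """Positions whose average mutational effect is at or below `threshold`."""
    return [
        pos
        for pos, avg in mean_effect_by_position(effects).items()
        if avg <= threshold
    ]


# Toy scan (made up): (position, mutant residue) -> change in inhibitory activity.
dms_effects = {
    (5, "A"): -1.0,
    (5, "D"): -0.8,
    (6, "A"): 0.1,
    (6, "K"): -0.1,
    (7, "A"): 0.6,
}

sensitive = sensitive_positions(dms_effects)

# Hypothetical positions that a structural model places at the interface.
predicted_contact_positions = {5, 7}
supported = predicted_contact_positions & set(sensitive)
```

Positions that are both predicted to contact the target and sensitive to mutation are the ones where the scan supports the structural model. A position like 7 in the toy data, where mutation increases activity, would call for a separate look.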

Main Results and Conclusions

FragFold successfully predicted the binding modes and inhibitory activities of multiple protein fragments, correctly predicting the binding modes of 87% of inhibitory fragments at known protein-protein interaction interfaces. FragFold could also predict previously unknown binding modes, such as those of the C-terminal intrinsically disordered tail of FtsZ with FtsA, MinC, and ZipA. These predictions not only agreed with existing genetic and biochemical data but also provided new molecular models for how these proteins are regulated.

Deep mutational scanning experiments further validated the accuracy of FragFold’s predictions, revealing the critical roles of key residues in the inhibitory function of protein fragments. These experimental results not only supported FragFold’s predictive capability but also provided essential experimental evidence for designing more efficient protein inhibitors.

Significance and Value of the Study

The development of FragFold provides a powerful computational tool for large-scale discovery of protein fragment inhibitors. By combining AlphaFold’s high-precision structure prediction capabilities with high-throughput experimental data, FragFold can accurately predict the binding modes and inhibitory activities of protein fragments. This method not only helps researchers better understand the functional mechanisms of protein fragments but also offers new insights for drug development. For example, by predicting and screening protein fragments with inhibitory activity, FragFold can accelerate the development of novel peptide-based drugs.

Furthermore, the automated prediction workflow of FragFold provides new tools for future proteomics research. By systematically scanning for functional protein fragments across the entire proteome, FragFold has the potential to reveal more protein interaction networks and provide new insights into the regulatory mechanisms of cellular processes.

Research Highlights

  1. High Accuracy: FragFold correctly predicts the binding modes of 87% of inhibitory fragments at known protein interfaces and can also predict previously unknown binding modes.
  2. Automated Workflow: FragFold uses an automated prediction workflow that scans for inhibitory fragments efficiently and has the potential to scale to the entire proteome.
  3. Experimental Validation: Deep mutational scanning experiments further validate FragFold’s predictions, revealing the functional roles of key residues.
  4. Wide Applications: FragFold can be applied not only to known protein interfaces but also to predict binding modes of intrinsically disordered protein regions, providing new tools for protein function studies.

The development of FragFold marks a significant advancement in the field of protein fragment prediction, offering broad prospects for future research and applications.