Biotherapeutics: what do we make next?
The question of what we should make next has challenged the world of drug discovery for decades.
For years, computational methods for small molecule drug design have offered numerous algorithms and methodologies to help generate new ideas and guide the iterative process of lead design and optimisation. For a particular drug target, these methods help to identify high-quality candidates that may eventually advance to clinical development with fewer experiments and less time in the lab. We have seen great advancements over the years and most recently, AI and machine learning algorithms have become popular in allowing researchers to rapidly explore more ideas in the chemical space and propose novel structures that a medicinal chemist may not have considered trying out when looking for new drugs.
Traditionally, computational design tools for biotherapeutics have required more expertise and been more application-specific than the tools available for small molecule therapeutics. For certain types of biological therapies, such as monoclonal antibodies, dedicated methods have existed for tasks such as affinity maturation, humanisation and immunogenicity prediction.
However, two recent AI methods, RFdiffusion and ProteinMPNN, address the question of which variation of a biotherapeutic to make and test next far more directly. These tools have the potential to change the way we design biotherapeutics by helping to identify novel candidates that computational and molecular biologists may not have considered.
Generating proteins with AI: RFdiffusion and ProteinMPNN
RFdiffusion is a cutting-edge generative AI algorithm that can ‘diffuse’ a collection of amino acids into a protein structure. The diffusion process starts with a random, noisy collection of atoms and, through a series of controlled refinements, the algorithm progressively reduces the noise and moves the structure closer to a biologically realistic and functional protein. One common analogy for the diffusion process is restoring a blurry photograph: iterative processing steps take an initial grainy image and refine its detail and clarity to produce a final clear picture.
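The idea of iterative refinement from noise can be illustrated with a toy sketch. This is not RFdiffusion's algorithm: in the hypothetical code below the `target` coordinates (an approximate alpha-helix trace) stand in for what a trained network would predict at each step, whereas a real model has no access to the answer in advance. All function names and parameters are illustrative assumptions.

```python
import math
import random


def helix_trace(n_residues):
    """Approximate alpha-helix C-alpha trace: 1.5 A rise, 100 degrees per residue."""
    points = []
    for i in range(n_residues):
        angle = math.radians(100.0 * i)
        points.append([2.3 * math.cos(angle), 2.3 * math.sin(angle), 1.5 * i])
    return points


def rmsd(coords, target):
    """Root-mean-square deviation between two equally sized point sets."""
    total = sum((a - b) ** 2 for p, q in zip(coords, target) for a, b in zip(p, q))
    return math.sqrt(total / len(target))


def toy_denoise(target, steps=50, seed=0):
    """Toy refinement loop: pull random 3D points towards `target` step by step.

    ASSUMPTION: `target` plays the role of a learned denoiser's prediction.
    """
    rng = random.Random(seed)
    # Start from pure noise: one random 3D point per residue.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in target]
    history = [rmsd(coords, target)]
    for step in range(steps):
        remaining = steps - step
        # Injected noise shrinks every step and is zero on the final step.
        noise_scale = 0.5 * (remaining - 1) / steps
        for point, goal in zip(coords, target):
            for axis in range(3):
                # Close a growing fraction of the gap to the predicted position.
                point[axis] += (goal[axis] - point[axis]) / remaining
                point[axis] += rng.gauss(0.0, noise_scale)
        history.append(rmsd(coords, target))
    return coords, history


if __name__ == "__main__":
    _, history = toy_denoise(helix_trace(20))
    print(f"RMSD at start: {history[0]:.2f}, at end: {history[-1]:.6f}")
```

The deviation from the target falls as the loop proceeds, which mirrors the photo analogy: each pass removes a little more noise than it adds.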
RFdiffusion can be utilised to overcome several different biotherapeutic design challenges, such as engineering a biologic that can bind to a viral protein to neutralise the virus, or to generate enzyme therapeutics that may break down a specific substrate to treat metabolic disorders. Beyond biotherapeutics, the algorithm also has potential to help design proteins for industrial and biotechnological applications such as making enzymes that catalyse specific chemical reactions or proteins that suit very specific conditions including low or high temperature or pH.
ProteinMPNN, on the other hand, is a state-of-the-art neural network that can predict one or more probable protein sequences given a protein structure. Published results show success in one of the most critical aspects of protein sequence design: generating sequences that fold into a stable protein or peptide with a propensity to crystallise, which facilitates the structure determination of these proteins.
One of the strengths of ProteinMPNN is its ability to generate multiple sequence variants. This is invaluable because different variants provide more options to test when identifying the candidates with the best performance in terms of efficacy, safety and manufacturability. Just as significantly, these variants provide alternative leads when candidates encounter unforeseen issues during protein expression and optimisation, or developability challenges such as solubility and immunogenicity.
ProteinMPNN can be used in conjunction with RFdiffusion to generate new protein designs such as new enzymes or antibodies that can be further evaluated for desired properties such as stability, activity, affinity and specificity. Together, they significantly expand the biological space that can be explored in silico before biologists need to commit to expensive and time-consuming physical experimentation, opening up exciting avenues for more intelligent, model- and data-driven workflows driving innovation in biotherapeutic design.
Generating proteins with RFdiffusion and ProteinMPNN
Solutions are being made available to provide researchers easy access to RFdiffusion workflows. For example, BIOVIA offers a protocol that provides access to motif scaffolding, enabling users to start with a specific part of an existing protein (the motif) and design a complete new protein scaffold that incorporates this motif. This approach allows precise control over the functional regions of the protein, as well as control over the protein scaffold design, via different model weights that suit particular proteins and complexes.
BIOVIA also offers another protocol which allows users access to not only ProteinMPNN, where they can easily define sequence residues for design, but also LigandMPNN and SolubleMPNN models. LigandMPNN can consider protein, small-molecule, nucleic acid and metal ion ligands as additional context for designing sequences, while SolubleMPNN can be used when protein solubility is part of the design criteria of researchers. Ultimately, users can determine the degree of sequence diversity and confidence desired, and have the ability to control the bias of particular amino acids.
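One common way such controls work in sequence samplers is temperature-scaled sampling with a per-amino-acid bias added to the model's scores. The sketch below illustrates that general idea only; it is an assumption about the mechanism, not the BIOVIA protocol or the ProteinMPNN code, and the scores from the hypothetical `example_logits` helper are made up rather than taken from a real model.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def example_logits(length):
    """Made-up per-position scores: position p favours AMINO_ACIDS[p % 20]."""
    return [
        [5.0 if i == p % len(AMINO_ACIDS) else 0.0 for i in range(len(AMINO_ACIDS))]
        for p in range(length)
    ]


def sample_sequence(logits, temperature=0.1, bias=None, rng=None):
    """Sample one sequence from per-position scores.

    Lower `temperature` gives more confident, less diverse sequences.
    `bias` maps an amino acid letter to a value added to its score at
    every position (negative values discourage that residue).
    """
    rng = rng or random.Random(0)
    bias = bias or {}
    sequence = []
    for position_logits in logits:
        scores = [
            (position_logits[i] + bias.get(aa, 0.0)) / temperature
            for i, aa in enumerate(AMINO_ACIDS)
        ]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(score - top) for score in scores]
        sequence.append(rng.choices(AMINO_ACIDS, weights=weights, k=1)[0])
    return "".join(sequence)


if __name__ == "__main__":
    logits = example_logits(12)
    print("confident:", sample_sequence(logits, temperature=0.05))
    print("diverse:  ", sample_sequence(logits, temperature=2.0))
    print("no Cys:   ", sample_sequence(logits, temperature=2.0, bias={"C": -1e9}))
```

Raising the temperature trades confidence for diversity, while a strongly negative bias effectively excludes a residue such as cysteine from every designed position.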
Such tools are examples of the ever-growing arsenal of powerful AI methods that help molecular modellers and biologists answer the question of ‘what to make and test next’ and accelerate the rational design of biologics. In combination with existing physics-based methods, they allow researchers to rapidly explore many more possibilities in silico before arriving at the final handful of candidates with the potential to become a successful commercial biotherapeutic, or a biologic for use in the agriculture, food and beverage, or environmental industries.