The impact of generative AI on protein design for drug discovery and beyond


Artificial Intelligence (AI) is at the heart of the latest global transformation, not least within the biopharmaceutical industry.

AI is already rapidly redefining how we discover and design biologics, and as CTO at Cytiva, I’ve had the privilege of witnessing firsthand how these technologies are reshaping this landscape. I’m excited by the prospect of translating these tools to manufacturing. What we will see is not just incremental progress, but a foundational shift toward more predictive and scalable manufacturing that delivers breakthrough therapies to patients.

Taking structure into new horizons: AI’s role in drug discovery

Recent breakthroughs in AI have dramatically accelerated drug discovery, particularly in the realm of protein engineering. Tools like AlphaFold have revolutionised our ability to predict protein structures with remarkable accuracy. These models allow researchers to understand how proteins fold and interact – insights that were previously locked behind years of experimental work.

But the real power of AI lies not just in prediction, but in exploration. Generative AI (GenAI) is opening vast new opportunity spaces, enabling scientists to design novel proteins that may never have been discovered through empirical methods alone. By generating thousands of potential candidates in silico, researchers can evaluate a broader spectrum of possibilities without the time and cost constraints of traditional wet lab approaches.

This capability is especially critical as therapeutic modalities diversify. Whether it’s monoclonal antibodies, bispecifics, or entirely new protein scaffolds, AI allows us to move beyond known paradigms and into the “white space” of molecular design. It’s not just about speeding up discovery – it’s about expanding the boundaries of what’s possible.

Beyond discovery: First steps into bioprocessing

While AI’s impact on drug discovery is well recognised, its influence on bioprocessing is just beginning to unfold. The same insights that help us design better proteins are now being applied to improve how those proteins are produced at scale.

Understanding biophysical characteristics at a molecular level can allow us to predict manufacturability – how easily a protein can be expressed, purified, and formulated. This is a critical step in biologics development, where the transfer from lab to production environment can derail even the most promising candidates. By extending AI-enabled discovery tools to take manufacturability into account, we can optimise expression systems, reduce downstream risks, and accelerate timelines.

Tackling biological uncertainty

One of the most persistent challenges in bioprocessing is the inherent unpredictability of biological systems. Cells are complex, dynamic entities, and their behaviour can vary significantly depending on processing conditions and genetic modifications. This variability introduces uncertainty into development and manufacturing, requiring extensive experimentation during process development to ensure control and reproducibility.

Mechanistic models have long been used in the process industries to predict the performance of operations such as chromatography and filtration. But the complexity of biological systems hinders the application of such mechanistic methods. The current solution to building an understanding of these “black boxes” is to use statistical models built from a Design of Experiments (DoE) approach. However, the design space that can be explored with such methods is limited, because the controlled, linear experimentation involved demands a high level of wet lab resources. Testing is therefore restricted to the highest contributors to the outcome, typically focusing on the edges of the potential operating space. The resulting models use simple “curve fitting” to predict relationships, so they can miss more complicated multifactor interactions and cannot predict performance outside the initial design frame. They are therefore oversimplified models that do not give a true approximation of the complexity of the dynamic biological systems they represent. AI can be the game-changer here.
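To make the limitation concrete, here is a minimal sketch in Python of a two-factor, two-level DoE with the kind of simple fitted model described above. Everything in it is an illustrative assumption rather than a real process: the function names, the coded factor levels, and the hypothetical `true_process` response, which is given curvature that the corner-point runs cannot see.

```python
from itertools import product


def full_factorial(n_factors):
    """All 2**n_factors runs of a two-level design in coded units (-1, +1)."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]


def fit_two_factor_model(design, responses):
    """Fit y = b0 + b1*x1 + b2*x2 + b12*x1*x2 to a 2x2 full factorial.

    The coded columns of a full factorial are orthogonal, so each
    least-squares coefficient is simply the mean of (column * response).
    """
    n = len(design)
    columns = {
        "b0": [1 for _ in design],
        "b1": [x1 for x1, _ in design],
        "b2": [x2 for _, x2 in design],
        "b12": [x1 * x2 for x1, x2 in design],
    }
    return {
        name: sum(c * y for c, y in zip(col, responses)) / n
        for name, col in columns.items()
    }


def predict(coef, x1, x2):
    """Evaluate the fitted model at a point given in coded units."""
    return coef["b0"] + coef["b1"] * x1 + coef["b2"] * x2 + coef["b12"] * x1 * x2


def true_process(x1, x2):
    """Hypothetical response with curvature in x1 (assumed for illustration)."""
    return 10.0 + 2.0 * x1 + 1.0 * x2 + 0.5 * x1 * x2 - 3.0 * x1 ** 2


design = full_factorial(2)  # the four corner runs
responses = [true_process(x1, x2) for x1, x2 in design]
coef = fit_two_factor_model(design, responses)

# Compare the fitted model with the "true" process at a corner,
# at the centre of the design, and outside the design frame.
for point in [(1, 1), (0, 0), (2, 0)]:
    print(point, predict(coef, *point), true_process(*point))
```

In this toy case the fitted model reproduces all four corner runs exactly, yet it misses the response at the centre of the design and diverges further once it is asked to predict outside the tested range, because the curvature term is invisible to runs placed only at the edges.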

The vision: Fully in-silico process development and true digital twins

Imagine a future where clinical development – including CMC – is conducted primarily in silico. Where AI models predict not only protein binding, but also metabolic pathways, cell population dynamics, and purification outcomes. Where digital twins control everything from development to large-scale process control.

In the future, it will be fascinating to see innovations such as Graph Neural Networks applied to understanding the relationships in these biological manufacturing processes, which have a high number of interacting factors. Loaded with a sufficient corpus of data for training, testing, and validation, such models could produce a level of predictability not possible today. Could this mean we can remove the need for process development? Then, if such models could be combined with sensor technologies and AI-driven analytics, we might move from digital shadows towards true digital twins that provide real-time control of critical quality attributes (CQAs). The future would see processes run at their true global optimum without deviation.

This vision is closer than many realise. Alphabet’s DeepMind has already announced its ambition to build a virtual cell, with Nobel laureate Demis Hassabis suggesting it could become reality within five years. We are seeing this race start to heat up, with biotech startup Tahoe Therapeutics raising $30 million to build AI models of cells.

To reach this future, we must overcome several foundational challenges. Data quality remains a critical issue – AI models are only as good as the data they’re trained on. Model interpretability is another hurdle, especially in regulated environments where transparency is essential. And integration into existing workflows requires thoughtful change management and cross-functional collaboration.

Yet, the potential rewards are immense. As Rishi Bommasani and his co-authors wrote in their paper ‘On the Opportunities and Risks of Foundation Models’ (arXiv:2108.07258), “The transformative potential of AI hinges on foundation models – large-scale models trained on broad data that can be adapted to a wide range of downstream tasks. These models mark a shift in the AI paradigm, emphasising the importance of scale and pre-training in achieving performance across diverse applications.”

Solving the “low n” problem in bioprocessing

One of the biggest barriers to applying AI in bioprocessing is data scarcity. While each manufacturing run generates vast amounts of data, the number of runs – especially during development – is relatively small. This low “n” problem limits the statistical power of traditional models and makes it difficult to draw reliable conclusions.

AI can help in two key ways. First, it can contextualise and integrate disparate data sources, breaking down silos across development, manufacturing, and quality control. Second, it can enable the creation of foundation models trained on aggregated historical data, which can then be fine-tuned for specific applications.

This approach mirrors the success of large language models (LLMs), which are trained on massive datasets and then adapted to individual use cases. In biomanufacturing, where proprietary data and privacy concerns are paramount, federated learning offers a potential solution. In this model, data remains local, but contributes to a shared model – allowing the industry to collaborate without compromising confidentiality.

Federated learning could enable the development of universal bioprocess models that can be customised to individual facilities, products, and modalities. This would dramatically accelerate development timelines and improve scalability across the industry.
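The idea that data stays local while still contributing to a shared model can be sketched with federated averaging, one common way of implementing federated learning. The Python below is a toy illustration under stated assumptions, not a production recipe: the sites, their handful of (x, y) runs, the one-variable linear model, and all function names are hypothetical.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run a few gradient-descent steps on one site's private data.

    Model: y = w * x + b with a mean squared-error loss. Only the
    updated weights leave the site; the raw (x, y) pairs never do.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(site_weights, site_sizes):
    """Combine site models, weighting each by its number of runs."""
    total = sum(site_sizes)
    w = sum(wi * n for (wi, _), n in zip(site_weights, site_sizes)) / total
    b = sum(bi * n for (_, bi), n in zip(site_weights, site_sizes)) / total
    return (w, b)


def train_federated(sites, rounds=100):
    """Alternate local training at every site with central averaging."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


# Hypothetical data: three sites with only a few runs each ("low n"),
# all generated from the same underlying relationship y = 3x + 1.
sites = [
    [(0.0, 1.0), (1.0, 4.0)],
    [(2.0, 7.0), (3.0, 10.0), (4.0, 13.0)],
    [(0.5, 2.5), (1.5, 5.5)],
]
w, b = train_federated(sites)
print(round(w, 3), round(b, 3))
```

Each site here holds too few runs to say much on its own, but the shared model draws on all seven without any site exposing its raw data. Real deployments would also need safeguards such as secure aggregation, since model updates can themselves leak information.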

A call to action: Embracing AI in bioprocessing

As we stand at the intersection of biology and computation, the opportunity before us is clear. AI is not just a tool – it’s a strategic enabler of the next generation of biopharmaceutical innovation. Realising its full potential requires a shift in mindset.

Bioprocessing must adopt these tools wholeheartedly to meet the demands of speed, flexibility, and personalisation. Whether it’s achieving the bold goal of a fivefold reduction in the current price of global monoclonal antibody (mAb) use, or enabling the fastest-ever clinical development, AI will be central to the solution.

As the biotherapeutics market diversifies – with modalities like mRNA, CAR-T, and personalised vaccines – there will not be one process of the future, but many. AI will be the common thread that enables agility, scalability, and precision across this complex landscape.

By combining empirical research with in-silico innovation, we can build a future where groundbreaking treatments are developed with unprecedented speed and accuracy, tailored to each patient’s unique physiological profile. It’s a future where patients benefit from faster access to life-changing therapies, and where science and technology work hand-in-hand to solve the world’s most pressing health challenges.

About the author

Beate Mueller-Tiemann

Dr Beate Mueller-Tiemann’s career has spanned the entire pharma value chain, from early target discovery, lead generation, lead optimisation, and cell line development to chemistry, manufacturing and controls (CMC), production, and commercial manufacturing, for both biologics and synthetic molecule therapeutics. Mueller-Tiemann joined Cytiva in early 2023, having held senior leadership positions at Sanofi and Bayer. The company’s chief technology officer, she is a former athlete who believes in setting the pace and measuring performance. She finds inspiration in her diverse teams, where she sees the power of differing perspectives, experiences, opinions, and ideas being unleashed. Mueller-Tiemann earned her PhD in Molecular Oncology at the Leibniz Institute of Virology in Hamburg, Germany. She holds Master’s degrees in Biology and Biochemistry from the University of Bielefeld, Germany, and the University of Montpellier in France. She is an elected member of the Board of DECHEMA, the German Society for Chemical Engineering and Biotechnology. She is also a member of the board of trustees of the Fraunhofer Institute for Microengineering and Microsystems IMM in Mainz. Mueller-Tiemann supports female talent through mentoring programmes and is a member of FiDAR, an association that works towards increasing the representation of women on German corporate boards.
