As pharma deploys AI across the value chain, data remains a challenge

(R to L) EY's Ricardo Vilanova, Lila Biosciences' Molly Gibson, Merck's Iya Khalil, Insitro's Mary Rozenman, and Gallop Oncology's Luba Greenwood at BIO 2025.
When Google shook the foundations of the biotech world with AlphaFold in 2021, it seemed like an AI revolution in drug development was just around the corner. But the rich data set Google had to work with around protein folding has turned out to be an exception, not a rule.
At BIO 2025 in Boston today, pharma and biotech leaders discussed the state of AI across the drug development life cycle and the slowdowns and challenges the field now faces.
“Why were proteins where we first saw the big breakthroughs with AlphaFold? Because we had decades of data lying there to apply AI to,” Molly Gibson, co-founder and president of Future Science at Lila Biosciences, said on a panel at BIO’s AI Summit. “What we're seeing now is a moment in time where we're hitting a data wall. The places where we have the data, where we've curated the data, we're seeing great performance. The places where we don't have the data, things like clinical trials, things like human trials, things where we have to actually get into the real world, are places where we either have to figure out how to get the data or we have to innovate in new ways.”
That challenge of finding or building good, usable data sets is also one of the biggest differentiators in an increasingly crowded biotech AI market, presenters agreed.
“In terms of the nature of the AI, ultimately it comes down to the data that the model is trained on. Where is that data coming from? What's the quality of that data? What's the performance of the model? How do you define performance?” said Mike O’Brien, partner at Valspring Capital. “There're lots and lots of large players in the AI space, but if you can differentiate yourself by being very niche-y and having the data to back it up, that's going to be very appealing to an investor and, frankly, big players that are in your space will look at you more as a large investment or an acquisition versus a competitor that's just going to take you out tomorrow.”
Health investor and Gallop Oncology CEO Luba Greenwood suggested that this dearth of data is also an opportunity – if not a duty – for big pharma.
“I think this is where pharma can lead the way,” she said. “What are you guys cooking with that clinical data?”
AI across the value chain
As AI becomes more prominent across all industries, pharma and biotech companies are looking to use it across the organisation to maximise its efficiency gains.
“If you look at the entire R&D value chain, there're just tonnes of steps of data managing, protocol generation, searching for information, all of that leading up to an IND filing, and then when you're past IND filing and into clinical trial setup, tonnes of looking at large amounts of data,” said Vega Shah, PMM lead for healthcare and life sciences at NVIDIA. “We're definitely seeing LLMs being helpful there.”
Iya Khalil, VP and head of data, AI, and genome science at Merck & Co, observed that using AI in many different areas necessitates having many different AI teams with specific skills.
“Folks that need to focus on AI for target ID might be a different team than you're going to need to bring in for the AI for chemical and molecular design, and a different team than the one that's going to focus on clinical trial optimisation,” she said. “And that's how you have to think about it. Design your organisation to have AI embedded in every part of the relationship so that end-to-end you're getting those speed-ups. We're seeing that projects are going from what might take months in terms of molecular design to weeks, and that's a big deal.”
However, having many discrete AI tools can also be a problem if they can’t be harmonised into one workflow.
“I think we see a lot of one-off AI tools, but then chaining them together becomes really difficult in practice. Making them into an operational AI system is still a challenge we're facing every day,” said Yue Webster, VP of model-driven drug discovery at Lilly. “So at Lilly we're tackling this from the other way. How can you optimise AI solutions for the whole life cycle of a molecule, not just binding affinity? Investors are really looking for operational transformative companies disguised as AI companies.”
A shift in collaboration
As AI becomes more and more important to pharma companies, it’s driving a change in how they work with their technology partners.
“I see a big fundamental shift in how we see collaboration,” said Webster. “Instead of seeing an AI company as a service provider, we see them as a close collaborator. We want them to be integrated into our drug discovery team […] That leads to a frustration. There're still a lot of companies trying to sell a Swiss army knife. I'm looking for a company to solve a really narrow problem really well, rather than trying to solve everything.”
For biotech, there’s an opportunity to be a closer partner as well – but once again, the key is data.
“Biotechs and start-ups are really the birthplace of innovation,” said Ashoka Madduri, the head of scientific strategy and intelligence at Sanofi's mRNA Center of Excellence. “I think big pharma is moving beyond just licensing deals to more integrated, long-term collaborations, to be at the forefront of innovation, to really be a partner to bring these technologies to patients. For drug discovery companies the key is really to have that comparative data.”
The tortoise, not the hare
While much of the AI conversation centres on time savings and efficiency, it’s important not to become too focused on speed.
“For those of you who have worked with machine learning, when you work with these computational tools on biologic or chemical data, the thing is AI is lazy. It will give you an answer. It knows you want an answer and it will give it to you. Our job is to have processes in place to make sure those answers are real,” said Insitro CFO Mary Rozenman.
“You need to continue to be diligent, you need to continue to be rigorous, because these are still therapeutic programmes at the end of the day. It's an error we've been making as an industry, over-focusing on speed. This idea that AI should make things faster, well let's just run as fast as we can. No – let's advance the highest-quality programmes that we can and that's ultimately what is going to be the step function, the step change.”