AI doesn’t have to be 100% reliable to be useful

EVERSANA SVP of Innovation Abid Rahman speaks with Sanofi's Monique Levy on stage at Frontiers Health 2025 in Berlin. Photo courtesy Frontiers Health.

Generative AI models aren’t perfect. Despite constant improvement, they still hallucinate and make mistakes. And for a high-stakes, risk-averse field like healthcare, that’s a scary thought. But it doesn’t have to be.

At a packed session at Frontiers Health, a panel of AI experts talked about how the right strategy can leverage AI’s strengths while mitigating its weaknesses.

Still in early days

Panellists agreed that while AI can get things wrong, it’s not as devastating a problem as some make it out to be, especially given where we are in the life cycle of this new technology.

“There's so much jumpiness around ‘It's failed.’ All these reports that are coming out, 76% of things didn't show an ROI,” said Monique Levy, Sanofi’s head of North America specialty care strategy and operations. “We're not even embryonic, but there's such a trigger reaction to, 'Look, it hallucinated. I told you it was bad.' I think being able to have a continuum of what to expect, what I need to know, what the curve and the journey will look like - there's a lot of just common sense stuff, which in the hype moment, we forget about.”

"We can look to the past for some reassurance", said Greg Ruslik, chief data and AI officer at Stellarus.

“We had the same conversations when Google came out,” he said. “How do we use it? How do we train on it? How do we know it's the right data? How do we trust it? Wikipedia. I mean, do you remember AltaVista?”

In some ways, AI is a victim of its own success, according to James Kugler, CEO of EMD Digital at Merck KGaA.

“When GPT-5 came out and everyone was up in arms at the accuracy, the accuracy had actually improved tremendously,” he said. “Part of the issue is that 50% error is better than 1% error, because if you stop checking the work because it's right more often, then you're more surprised when it's wrong. … So there are these weird tensions that form. Until you're at zero errors, we're not going to fully be super comfortable with what's there. And you’ve got to think about how pharma deals with it. Pharma's error tolerance is very small.”

New technologies, old strategies

The problem of dealing with AI hallucinations isn’t as new as it seems, Ryslik said, so it doesn't need all new solutions.

“Data validation isn't a new thing,” he said. “Machine learning existed more than three years ago. And I hate to tell everyone, but statistics has existed for a while, too. And a lot of people have given this thought. So best practices that were true 30 years ago are still to some extent true today. Proper holdouts, proper cross validation, God forbid, talking to a statistician. … None of that has changed. One of the big things that has changed is the size of the data. The models have gone from 20 parameters, 30 parameters to 10 billion parameters, which requires a larger scope. But that doesn't change the fundamentals of it.”
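
For readers who want to see what those fundamentals look like in practice, the sketch below shows a holdout split and k-fold cross-validation. It is an illustration only, written in Python with scikit-learn on a stock dataset; none of these specific choices were named by the panel.

    # Minimal sketch of holdout validation and k-fold cross-validation.
    # Assumes Python with scikit-learn installed; the dataset and model
    # are placeholders, not anything discussed on stage.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    # Proper holdout: keep a test set the model never sees during training.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

    # Proper cross-validation: estimate performance using the training data only.
    cv_scores = cross_val_score(model, X_train, y_train, cv=5)
    print("5-fold CV accuracy:", cv_scores.mean())

    # Final check against the untouched holdout set.
    model.fit(X_train, y_train)
    print("Holdout accuracy:", model.score(X_test, y_test))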

In general, processes that have served industry well in the past will continue to do so.

“We know how to do this,” said Levy. “There's an established product strategy framing that you use in software. It's a well-understood discipline, and you just have to really keep pushing business leaders … and then you have to have partnerships across different parts of the business to drive that, to push on each other. What's the right tech? What's the right question? And then drive through a typical product development process.”

Training the human in the loop

Part of having the right processes in place is training workforces to use the tools in a way that ensures errors are caught.

“It's also just having a trained workforce knowing what to do, and also where it generally works, and where regular machine learning works, and where it doesn't. And like any tool, it has its flaws,” said Ryslik. But, he added, this is a temporary problem – the next generation will grow up with an intuitive sense of when to trust these tools and when and how to verify the information.

“Ultimately, these are all tools, AI systems, whether it's a large language model or any of the AI systems, they're all tools,” said Abid Rahman, EVERSANA’s SVP of innovation. “It's a matter of processes and whether the people who use it actually are robust in using these systems.”

Fit-for-purpose tools

Beyond people and processes, innovators can also be smart about which use cases they apply AI to and which AI systems they leverage.

“I don't believe any large AI system will ever be able to deliver a medical device or an AI model that is medically certified,” said Bart De Witte, founder of Isaree. “We need to start looking beyond that at how we can start building these smaller agents, specialised agents that are very accurate in their single task. And now we can start orchestrating, which is a very different approach.”
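
As a rough illustration of the orchestration pattern De Witte describes, the sketch below routes tasks to small, single-purpose "agents". The agents here are hypothetical placeholder functions written in Python for this article, not components of any real or certified medical system.

    # Minimal sketch of an orchestrator dispatching to specialised agents.
    # Both agents are invented placeholders standing in for narrow,
    # separately validated models.
    from typing import Callable, Dict

    def summarise_note(text: str) -> str:
        # Placeholder: a real agent would run a small summarisation model.
        return "summary: " + text[:50]

    def extract_dosage(text: str) -> str:
        # Placeholder: a real agent would extract structured dosage fields.
        return "dosage fields extracted from note"

    # The orchestrator maps each task type to its single-purpose agent.
    AGENTS: Dict[str, Callable[[str], str]] = {
        "summarise": summarise_note,
        "extract_dosage": extract_dosage,
    }

    def orchestrate(task: str, text: str) -> str:
        if task not in AGENTS:
            raise ValueError(f"No specialised agent registered for task: {task}")
        return AGENTS[task](text)

    print(orchestrate("summarise", "Patient reports mild headache after dose change."))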

Additionally, industry can identify areas where accuracy is more or less essential.

“In some cases, you don't need to have 100% accuracy,” De Witte said. “If it's about filling a document that takes a doctor three hours, then we can automate 80% of it as part of a utility move.”

Ultimately, as the models get better and better, inaccuracy will become a smaller problem. But even if it never goes away entirely, it doesn’t need to hold the industry back from exploring AI’s full potential.