AI, big data and real world evidence – the challenges and opportunities


Real-world evidence (RWE) is emerging as an important area of research to reveal, in real time, new insights and innovations in medicine.

A recent pharmaphorum webinar, held in partnership with Savana and BREATHE – the Health Data Research Hub for Respiratory Health, heard that RWE is also accelerating the development of new and innovative therapies and treatments to improve patient outcomes.

But, until recently, an important data source has been locked up in the electronic health record (EHR), often in the form of unstructured or ‘free’ text entered by healthcare professionals.

This had previously been difficult to analyse and interpret for the purpose of health research. Now, artificial intelligence (AI) techniques such as natural language processing (NLP) and machine learning are helping clinicians unlock this valuable, unstructured information.

The webinar explored the potential of the technology to process the vast amount of data languishing untapped in EHRs, while ensuring that patient data privacy and safety are maintained at all times.

Opportunities for real-world evidence research are growing as the technology matures. A pioneering new partnership between Savana, which is using its AI-driven clinical research methodology, and BREATHE aims to drive innovative research using this new way of working.

The benefits of RWE

One of the main reasons to use RWE is to counter a common criticism of the large phase 3 trials that for so long have been the ‘gold standard’ when developing drugs and gaining authorisations from regulators.

Patients in the trials are carefully selected to make sure their disease is a good fit, but it is often said that these trials don’t effectively capture the kind of outcomes that will be seen in the real world once a drug is launched in the general population.

Professor Jenni Quint, deputy director of BREATHE, outlined that much of the RWE in the UK comes from the electronic health record, particularly in primary care, where for some time every single encounter between primary care professionals and their patients has been recorded and stored.

“They are updated regularly and they cover longer periods of time. So not only are you able to get real-time updates in terms of what happens within clinical practice, you can also look at things longitudinally,” she said.

“Health data is essential for helping us deliver better patient outcomes and in particular with BREATHE in mind, we are really interested in improving the lives of people with underlying respiratory conditions.”

Great data, but difficult to find, access and use

While the UK has some of the best data sets in the world thanks to this electronic health record, Quint said there are challenges using it for research purposes, in particular where vital information is contained in an unstructured form.

“Data can be difficult to find and difficult to access and equally very difficult to use,” she told the webinar.

BREATHE and Savana are working together to overcome this with a new research methodology that uses de-identified patient data to provide insight into respiratory diseases using cutting-edge techniques.

The partnership draws on Savana’s AI technology and clinical research methodology, which can analyse both the structured data and the unstructured free text held in EHRs.

Combined with natural language processing (NLP), an AI-based method for analysing text, this can then be used to monitor disease progression and outcomes in patients hospitalised with COVID-19.

Dr Ignacio Medrano, chief medical officer and founder at Savana, said: “We are now able to teach a computer how to read records and turn free text into a database of clinical variables.”

Data collected like this is also acceptable to regulators, pointed out Medrano, meaning that it can be used during the clinical development process.

It can overcome language barriers as a translation system has been built into the technology and it can be used at scale to pull together large amounts of data.
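As a rough illustration of what turning free text into a database of clinical variables can mean, the toy Python sketch below pulls a few variables out of an invented clinical note. It is not Savana's method: the vocabulary, the negation cues, the variable names and the function `extract_variables` are all assumptions made for this example, and production NLP systems rely on trained language models, clinical terminologies and far more careful handling of context.

```python
import re

# Hypothetical vocabulary (an assumption for this sketch): phrase -> variable.
CONDITION_TERMS = {
    "copd": "copd",
    "chronic obstructive pulmonary disease": "copd",
    "asthma": "asthma",
    "pneumonia": "pneumonia",
}

# Hypothetical negation cues; real systems use much richer context rules.
NEGATION = re.compile(r"\b(?:no|not|denies|without)\b")

# Oxygen saturation written as, for example, "SpO2 91%" or "sats 94 %".
SPO2 = re.compile(r"\b(?:spo2|sats|saturations?)\D{0,15}(\d{2,3})\s*%")


def extract_variables(note: str) -> dict:
    """Turn one free-text note into a flat record of clinical variables.

    Each condition is True (mentioned), False (only mentioned as negated)
    or None (not mentioned at all).
    """
    record = {name: None for name in sorted(set(CONDITION_TERMS.values()))}
    record["spo2_percent"] = None

    # Crude sentence split so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for term, name in CONDITION_TERMS.items():
            match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
            if match is None:
                continue
            negated = NEGATION.search(sentence[:match.start()]) is not None
            if not negated:
                record[name] = True      # a positive mention always wins
            elif record[name] is None:
                record[name] = False     # negated, and no positive mention yet
        spo2 = SPO2.search(sentence)
        if spo2:
            record["spo2_percent"] = int(spo2.group(1))
    return record


# Invented example note, not real patient data.
example_note = "Known COPD, on inhalers. No pneumonia on chest X-ray. SpO2 91% on room air."
print(extract_variables(example_note))
```

Each note becomes one row of named variables, which is the shape a research database needs. The hard part in practice is everything this sketch leaves out: abbreviations, misspellings, family history, uncertainty and different languages.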

Using real-world data in real life

Quint outlined how BREATHE is working with Savana’s AI technology to analyse both structured and unstructured data from hospital records in the BigCOVIData study.

“The BigCOVIData study will allow us to predict COVID-19 progression and outcomes, helping hospitals to prepare for and monitor future waves of the virus,” according to Quint.

She went on to describe another study, EAVE II, which has drawn on BREATHE’s expertise, linking patient data across the entire Scottish population to provide the first real-world evidence of the success of the COVID-19 vaccination programme in cutting risk of hospital admission.

Prospective RWE studies in people who are at risk of developing a condition allow researchers to gauge the impact of a therapy at a population level, and they are becoming increasingly common.

These studies let researchers follow the development of disease in a group of patients that is more representative of the general population than in a typical clinical trial, giving a more nuanced view of a therapy’s benefits and drawbacks.

Patient-friendly studies

Professor Nawar Bakerly, a consultant respiratory physician at Salford Royal NHS Foundation Trust, referenced the ground-breaking Salford Lung Study, one of the first large-scale RWE-based studies, to describe the challenges and opportunities of collecting data in these sorts of studies in the NHS.

The study measured the impact of next-generation inhaled lung drugs on COPD, and he said it had a lower drop-out rate than other similar studies. Just 7% of patients left the study, compared with up to 44% in another COPD project, thanks to a study design that relied on evidence collected from electronic records rather than requiring patients to attend hospitals or clinics for tests.

“It’s less cumbersome on patients - we don’t have to bring patients for these frequent visits that they have to attend in the usual way with randomised controlled trials.”

Bakerly also noted that the separate OpenSAFELY initiative had been used to interrogate NHS data to gain insights into COVID-19-related deaths during the pandemic.

While the focus of the webinar was on respiratory diseases, RWE can be used in other illnesses too, with potential for the approach to improve care in areas such as rare diseases, according to the panellists.

The challenge with rare diseases is to link up data from patients scattered across the globe, something that was extremely difficult without AI techniques, they said in a concluding question and answer session.

Savana’s Ignacio Medrano said that the company is already active in this area and has been able to conduct ‘deep screening’ for patients who are undiagnosed.

The system can put a ‘red flag’ on the patients who could benefit from a test, although the system is not intended to make a direct diagnosis.

“We are doing that with a good number of rare diseases already both in respiratory diseases and neurology with different life sciences companies in different countries.”

Wearables, smartphones and apps could also be used to gather RWE in studies involving round-the-clock monitoring of patients.

BREATHE’s Jenni Quint said that the power lies not just in the information from wearables but in the ability to link it with other data.

This idea of mixing information from other sources, such as pollution data, could give insights into causes of disease and potential therapies.

“Every single piece of data adds to the bigger picture,” Quint concluded.

About the webinar panel

Dr Ignacio H Medrano, CMO and founder, Savana, is a consultant neurologist with training in healthcare management and experience in clinical research strategies, formerly responsible for more than 500 researchers. A Singularity University graduate, he is also a founder at Mendelian in the UK, which is utilising AI in the diagnosis of rare diseases. Ignacio is in demand as an international speaker at digital health, clinical research, science and technology events and congresses.

Professor Jenni Quint, deputy director, BREATHE – the Health Data Research Hub for Respiratory Health, is currently a Professor of Respiratory Epidemiology at the National Heart and Lung Institute (NHLI), Imperial College London and an Honorary Consultant. She also leads a clinical epidemiology research group, partners with the Royal College of Physicians and is Analysis Lead for the National Asthma and COPD Audit Programme.

Professor Nawar Bakerly MD, FRCP is a consultant respiratory physician at Salford Royal NHS Foundation Trust and visiting professor at Manchester Metropolitan University. He is also clinical director for the Department of Respiratory Medicine (Pulmonology) and Chief Clinical Information Officer (CCIO) for Salford Care Organisation, as well as being its lead for integrated COPD services. He has collaborated as lead investigator or co-investigator in clinical trials, including the Salford Lung Studies for asthma and COPD, and also works on various clinical advisory groups for NICE and the NIHR in the UK.

About Savana


Founded in 2014, Savana is an international medical company that has developed a scientific methodology that applies Artificial Intelligence (AI) to unlock the clinical value embedded within the free text of Electronic Health Records (EHRs). With the largest AI-enabled, multi-language, multi-centre research network in the world, Savana generates customised descriptive and predictive Deep Real World Evidence research studies. Savana is built following the highest privacy-by-design standards and with medical reliability engineered by doctors for doctors. Savana constitutes a clinical research ecosystem that aims to advance personalised and precision medicine worldwide.


About BREATHE

BREATHE facilitates the safe and responsible use of respiratory health data at scale, sparking research and innovation for the benefit of UK patients. BREATHE is a collaboration with patients and publics, universities, third sector organisations and industry from across the UK and globally. The Hub is led by The University of Edinburgh, Imperial College London, University of Leicester, Nottingham University Hospitals NHS Trust, Queen Mary University of London and Swansea University. Launched in 2019, BREATHE is one of seven Health Data Research Hubs across the UK. Coordinated by Health Data Research UK, the Hubs are part of a four-year £37.5 million investment from UK Research and Innovation's Industrial Strategy Challenge Fund.

25 May, 2021