Can Big Data really transform healthcare?

Andrew McConaghie

Like most sectors, pharma periodically discovers new buzzwords and hot topics which become inescapable for 2-3 years, and begin to sound like the answer to every question.

At the moment, that buzzword is ‘big data’ – but while the hype can sometimes prove overwhelming, the future implications of big data are undoubtedly revolutionary.

Big data refers to the explosion of data generated by the new digital world, where everything we do leaves some kind of electronic record, thereby opening up the potential for big data analysis – which in turn could help us improve efficiency in all walks of life.

The potential for technology and big data to improve healthcare is enormous – from understanding the origins of disease, to better diagnosis, helping patients monitor their own condition, and analysing how effective healthcare interventions are.

Perhaps the most remarkable harbinger of the big data future is IBM’s supercomputer Watson. Last year, IBM announced that Watson had ‘learned’ all current medical knowledge relating to lung, prostate and breast cancers; this was achieved by assimilating knowledge from 600,000 pieces of medical evidence, two million pages from medical journals and 1.5 million patient records. Calling on this huge bank of data, IBM reported that Watson achieved a successful diagnosis rate of 90% in lung cancer – far outstripping the 50% average rate for human doctors.

At last week’s Pharma Summit organised by The Economist, almost every speaker touched on the subject of big data, and news of the latest developments showed that the big data revolution is already gathering pace.

Data to prove effectiveness and cut costs

It is easy to argue that ‘big data’ has been present in healthcare and the pharmaceutical industry for some time, and that in fact other walks of life are just catching up. The industry’s clinical trials generate huge amounts of patient data, which are now being submitted electronically to regulators and health technology assessment (HTA) bodies such as the UK’s NICE.

As the ability of regulators and HTA bodies to analyse and compare clinical trial data has grown, they have demanded more and more evidence of the value of drugs from pharma. One of the most talked about developments in this field is ‘real world evidence’ – the idea that healthcare systems want to see proof that drugs work in real populations, not just in the industry’s randomised controlled trials (RCTs).

Now new technologies and approaches to R&D are making this possible.

Head of Pfizer’s new Global Innovative Pharma division Geno Germano said the industry now had to put forward a much stronger message about the value of its medicines, and said he believed real world data capture could tell us where the ‘value pockets’ are.

“Big data can help us personalise medicine, and also glean new insights from big numbers of patients in the real world like never before,” he said.

Some companies are already making progress on integrating real world studies into their development programmes. GSK has just launched a real world study in Salford in the UK, studying 4,000 patients with COPD and 5,000 patients with asthma in the year-long study of its new respiratory drug.

GSK says it is the first time a large, prospective, real-world study has been performed on a pre-licence medicine across a large population in one geographical setting, and the company hopes it will generate useful data for healthcare regulators and payers.

Geno Germano also indicated that the opening up of data in pre-clinical research, and the more open sharing of data between academia and the industry, could help cut development times.

He called for an ‘audacious’ goal of cutting R&D costs by 50% in three to five years, and shaving 12-18 months off the average drug development time. Many other speakers agreed that big data could indeed help cut costs, for example by identifying earlier when to cancel unpromising molecules before they reach phase III trials.

Precision Medicine

Sanofi’s head of R&D Elias Zerhouni outlined what he calls ‘Precision Medicine’ which can be summed up in four P’s: Personalised, Predictive, Participatory, and Pre-emptive.

Zerhouni said the pharma industry had gone down the wrong route in previous decades, thinking it could understand human biology by using animal models, and cocooning its scientists away in isolation instead of sharing data between organisations. He said researchers had badly underestimated how complex human biology is, and were only now starting to unlock its secrets.

He said there needed to be a new R&D paradigm, and that it would involve mobile health technology (‘mhealth’), wearables and biomarkers.

Zerhouni said these technologies would help collect data to help prove the value of medicines in a far more precise way.

“We cannot currently measure the impact of our drug on the degree of tremor in Parkinson’s disease – but the technology to do this does already exist,” he said.

“We should be monitoring the course of a disease as frequently as possible, and would be able to assess after five years if the disease has changed; diseases are dynamic, there is nothing to say that what worked at point X will work at point Y.”

Clinical trials remain

The huge potential of big data and real world studies raises the question of whether the randomised controlled trial will remain the gold standard for drug approval.

Opinion was divided at the conference – some, like EFPIA’s Richard Bergström, said that the place of RCTs at the top of the clinical evidence hierarchy was now being challenged. Martin Coulter, chief executive of PatientsLikeMe, which collects patient-reported data, was, not surprisingly, even more evangelical about non-RCT data. However, most of the other speakers agreed that the clinical trial would remain at the core of pharma’s evidence base.

Michael Simpson, chief executive of US-based healthcare analytics firm Caradigm, said: “I think there will always be clinical trials. The thing about real world data today is that it is ‘dirty’.” He added that there were ‘tens of thousands’ of different ways of coding diseases in healthcare IT systems, that this was a major barrier to managing and understanding healthcare data, and that coding would have to be standardised on a global basis over time.

He concluded: “Healthcare data is the most complex data in the world, bar none, period.”

Nevertheless, the advent of big data will mean that real world data will play an increasingly important part in determining safety, efficacy and value to patients, and is likely to run in parallel or cross over with RCTs.

The patient perspective

Social networks and the rise of patient power are now influencing healthcare enormously. In the near future, patient experience and opinions will be fed into the mix of healthcare big data, and could directly influence how healthcare is delivered.

PatientsLikeMe is one of the most notable examples of this, and has created communities of patients with the same condition, who have generated their own pools of patient experience online. Once again, the significance of this data is sometimes overstated, but it is undoubtedly a powerful addition to the mix of data influencing research and healthcare policy.

Data is now also being gathered from patients in healthcare systems to provide accurate feedback on performance. In England, Patient Reported Outcome Measures (PROMS) have been introduced in the last few years to track how patients who have had surgery rate their own health and the effectiveness of the operation. Healthcare systems around the world are slowly converting to electronic patient records, which will have a major impact on how a disease is monitored at an individual level and at the population level.

These initiatives, along with ‘TripAdvisor’-type reviews, will slowly but surely make healthcare systems more transparent and accountable to patients.

Big Data, big burden of proof

Bruno Strigini, President, MSD, Europe and Canada, reinforced the need for pharma to be willing to explore new approaches based on health outcomes.

He highlighted how regulatory requirements have grown over time, and predicted that HTA bodies in Europe would converge, just as EU regulators had in the form of the EMA. He said this would force pharma to change its approach.

“Real world evidence – it’s the new reality, we need to partner more with the new stakeholders if we want to bring the right products to patients,” said Strigini, adding that it was pharma’s priority to share data with stakeholders to ensure a mutually successful future.

For pharma, there are clear dangers as well as possibilities in this future scenario of a more precise understanding of diseases and medicines: big data can show the effects of a drug to be higher than hoped, but equally it could be lower – making it easier for healthcare payers to refuse reimbursement.

It was therefore reassuring for the industry audience to hear Sir Andrew Dillon, the chief executive of NICE, acknowledge these problems.

“We need to find a different way of managing entry of new drugs which restructures risk,” he said. “It is neither fair for the healthcare systems to take on all that risk, nor is it fair for the industry to take it on.”

When asked whether he believed cutting the cost of medicines would be the main way health systems balanced their books, he said “the problem is too big” to rest on one area of expenditure. In common with Sanofi’s Elias Zerhouni speaking earlier in the day, Dillon pointed out that the vast majority of healthcare costs were ‘locked up’ in the healthcare professionals and staff needed to run the systems. He said it was the deployment of these human resources which could yield the biggest savings.

Responding to questions about pricing and access, Dillon called on pharma to work with the NHS and other healthcare systems to track what happens to a drug inside a healthcare system once it is approved, to establish if there is appropriate uptake and use.

He said this was quite possible: the NHS was now good at coding healthcare interventions for internal financial and billing purposes, and he saw no reason why the same couldn’t be done for uptake and clinical use.

However, top of Dillon’s ‘wish list’ in NICE’s relations with the industry was full data transparency – pharma is not currently obliged to submit all data to NICE, and this has caused major disagreement over the years.

Indeed, the whole issue of data transparency in pharma clinical trials was skirted around by most speakers, despite being directly relevant to the question of big data.

Data hugging and bad data

NICE is one of the many signatories to the AllTrials campaign for full pharma data disclosure, which was launched last year, and to which GSK and Janssen have now signed up. The AllTrials movement looks likely to win over the industry sooner or later – even if some come kicking and screaming.

Several speakers were on hand to testify that it is not just pharma which remains secretive and unnecessarily possessive about its data. John Parkinson, director of the UK’s renowned patient records database the CPRD, said ‘data hugging’ was a noticeable problem in academia as well as pharma.

The issue of data quality was once again raised – the old adage of ‘junk in, junk out’ summing up the problem that good decisions cannot be made based on flawed or incomplete data. One recalls, in particular, IBM’s supercomputer Watson, which can only base its diagnoses on the evidence given to it.

Richard Price is chief executive of a new venture called, which highlights the problem of poor levels of ‘reproducibility’ in academic research papers published in peer review journals. He pointed to reviews which found that the overwhelming majority of papers in top journals were not reproducible – suggesting that either the explanation of the science was incomplete, or that the science itself was flawed. Price said that most published papers did not supply the full study data set, mirroring the problems seen in pharma’s ‘data hugging’ habits.

The implication of this is serious, as pharma increasingly looks to alliances with academic centres to identify the most promising areas for new research.


It is clear that big data will play an increasingly big role in shaping healthcare – however, this is not a future where decision-making is always made easier or quicker by technology. On the contrary, the explosion of data is making life more complex, at least for the time being, and collecting, sorting and storing it will be a time-consuming and costly task in itself. Subsequent analysis and decisions will undoubtedly present new scientific, ethical and financial dilemmas as well as opportunities for major progress.


About the author:

Andrew McConaghie is an experienced journalist and pharmaphorum’s new Managing Editor, Feature Media. He has been writing about the pharmaceutical industry and NHS since 1999 and will be writing regular exclusive news and insights from the sector for pharmaphorum.
