BioGPT: A game-changer for pharma and healthcare


ChatGPT has already made waves and has been used to write code, poems, songs, recipes, and more. Language models built on transformer architectures, such as GPT (Generative Pre-trained Transformer), can analyse large, complex datasets and generate human-like responses to questions.

Microsoft recently released a new AI language model, BioGPT, that is specifically designed for the life sciences industry. The model was pre-trained on a large corpus of biomedical literature, roughly 15 million PubMed abstracts (Luo, R, et al., 2023), which makes it a valuable tool for scientists across the life sciences.

Why is it so important?

Compared to GPT models that are trained on more general text data, BioGPT has a deeper understanding of the language used in biomedical research and can generate more accurate and relevant outputs for biomedical tasks, such as drug discovery, disease classification, and clinical decision support. BioGPT is also able to capture the nuances, subtleties, and syntax of the biomedical language, such as differentiating between drug names, gene names, and protein names, which is essential for many biomedical applications.

For example, here is the output of a general-purpose GPT model and of BioGPT for the same scientific query:

[Figure: the output of both models for a scientific query]

Source: (Luo, R, et al., 2023)

In the past few years, knowledge graphs (KGs), which connect concepts from diverse sources into a multidimensional network, have been used extensively to understand biomedical concepts. However, building large KGs is difficult because of data integration and scalability challenges. And while KGs provide a highly structured way to look at data, they offer limited contextual understanding. A language model such as BioGPT can address some of these limitations and provide an easier way to understand biomedical contexts.

What applications can it power? 

  1. Drug discovery: BioGPT outperformed previous language models on benchmarks for extracting relationships between entities (drugs, diseases, and proteins) from text (Luo, R, et al., 2023). It can help automate the analysis of the ever-expanding body of scientific literature to understand disease mechanisms better and identify potential drug targets.
  2. Precision medicine: Precision medicine involves tailoring medical treatments to the specific needs of individual patients based on their genetic makeup, lifestyle, and environmental factors. BioGPT can help researchers identify genetic mutations, disease pathways, and other relevant information from large datasets, enabling the development of personalised treatment plans for patients.
  3. Improving drug safety: BioGPT outperformed other models at extracting drug-drug interactions from the literature, and can help clinicians anticipate the potential side effects of drug combinations and improve drug safety.
  4. Clinical trial design and analysis: BioGPT can be used to extract and analyse data from clinical trials, helping researchers design more effective trials, and analyse trial results more accurately.
  5. Competitor analysis: BioGPT can be used to analyse scientific literature and patent databases to identify potential competitors and assess the competitive landscape.
  6. Scientific communication: BioGPT can be used to generate summaries of scientific literature and other sources of information, making it easier for business development professionals to quickly understand and communicate key insights.
  7. Disease diagnosis and management: BioGPT can be used to analyse patient data, medical records, and scientific literature to help diagnose and manage diseases more effectively.
  8. Medical writing, education, and knowledge sharing: BioGPT can be used to develop educational materials and assist healthcare professionals in keeping up to date with the latest research and clinical findings.

There are many additional potential use cases, ranging from virtual healthcare assistants to medical recordkeeping, translation, and remote monitoring.
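To make the relationship-extraction use case more concrete, here is a minimal sketch of how sentences generated by a model such as BioGPT could be turned into knowledge-graph edges. It assumes, purely for illustration, that the model emits sentences of the form "the interaction between X and Y is Z". That template, the helper names (Triple, parse_triple, build_edges), and the entity names are hypothetical and do not describe BioGPT's actual interface.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One knowledge-graph edge: head entity, relation label, tail entity."""
    head: str
    relation: str
    tail: str


# Hypothetical output template, assumed for illustration only.
PATTERN = re.compile(
    r"the interaction between (?P<head>.+?) and (?P<tail>.+?) "
    r"is (?P<relation>.+?)\.?$",
    re.IGNORECASE,
)


def parse_triple(sentence: str) -> Optional[Triple]:
    """Return a Triple if the sentence follows the assumed template, else None."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Triple(match["head"], match["relation"], match["tail"])


def build_edges(sentences: Iterable[str]) -> List[Triple]:
    """Parse generated sentences into a de-duplicated, sorted list of edges."""
    edges = set()
    for sentence in sentences:
        triple = parse_triple(sentence)
        if triple is not None:
            edges.add(triple)
    return sorted(edges)


if __name__ == "__main__":
    # Placeholder sentences standing in for model output.
    generated = [
        "the interaction between DrugA and ProteinX is inhibitor.",
        "the interaction between DrugA and ProteinX is inhibitor.",
        "This sentence does not follow the template.",
    ]
    for edge in build_edges(generated):
        print(edge)
```

In practice, the extracted edges would still need to be reviewed and linked to standard identifiers before being merged into a curated KG.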

What are the limitations?

While BioGPT is designed to process biomedical literature, it shares some of the limitations of other generative language models, including ChatGPT and AI more generally.

Moreover, there are concerns that such models can generate inaccurate information without any supporting evidence, potentially leading to the spread of misinformation. Furthermore, as BioGPT is trained on existing medical research, it may inherit any biases present in that literature, potentially perpetuating these biases in its recommendations and predictions. 

The future is bright

BioGPT has vast potential to integrate with current tools and to enhance the capabilities of existing software platforms. Within two months of its launch, several companies had already started to integrate BioGPT into their tools. For example, Insilico Medicine recently announced the use of BioGPT in its Pharma.AI platform for drug discovery.

We can expect future bots and search engines that help scientists understand research day to day through simple questions and answers, instead of presenting a list of articles to read. BioGPT could be integrated into electronic medical records (EMRs) to assist in diagnosis and treatment, and into laboratory information management systems (LIMS) to assist in data analysis and interpretation.

While BioGPT is in an early phase of development and has some limitations, it has the potential to unlock a new era of discovery in life sciences research, where the boundaries of what we can understand and achieve in fields such as genomics, proteomics, and drug development will be pushed to unprecedented levels.

About the author

Amandeep Singh is a senior consultant at MP Advisors, a biopharma-only strategy and financial advisory firm. He brings years of experience in growth strategy, AI adoption strategy, and deal advisory, helping biopharma companies to expand through organic and inorganic opportunities across many therapeutic areas. He also has deep expertise in guiding PharmaTech start-ups in AI for drug discovery, NLP, and digital health on product and business strategy.

31 March, 2023