DeepMind applies its AI expertise to genetic diseases
Google’s DeepMind unit has used its artificial intelligence expertise to catalogue millions of ‘missense’ mutations in human DNA that may be related to the development of human diseases.
The work relies on AlphaMissense, a new tool developed by DeepMind researchers, which they say has classified around 89% of all 71 million possible missense mutations, deeming them either likely or unlikely to be linked to health conditions.
Missense variants are single-letter changes in DNA that lead to a different amino acid being incorporated into a protein chain, which can affect the protein’s structure and function.
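As an illustration of what counts as a missense change, the short Python sketch below swaps one DNA letter in a codon and checks whether the encoded amino acid changes. It is a toy example rather than anything from AlphaMissense: the function name and the four-entry codon table are assumptions made for illustration, although the codon assignments themselves follow the standard genetic code. The GAG to GTG change shown is the well-known substitution behind sickle cell disease.

```python
# Partial codon table (DNA coding strand, standard genetic code).
# Only the codons needed for this illustration are included.
CODON_TO_AMINO_ACID = {
    "GAG": "Glu",   # glutamic acid
    "GAA": "Glu",   # glutamic acid (same amino acid, different codon)
    "GTG": "Val",   # valine
    "TAG": "Stop",  # stop codon
}


def classify_substitution(codon, position, new_base):
    """Hypothetical helper: apply a single-letter change and label its effect."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TO_AMINO_ACID[codon]
    after = CODON_TO_AMINO_ACID[mutated]
    if after == before:
        kind = "synonymous"  # same amino acid, protein unchanged
    elif after == "Stop":
        kind = "nonsense"    # protein is cut short
    else:
        kind = "missense"    # a different amino acid is incorporated
    return mutated, kind


# Sickle cell disease: the middle letter of GAG changes from A to T.
print(classify_substitution("GAG", 1, "T"))
# A change in the third letter that leaves the amino acid unchanged.
print(classify_substitution("GAG", 2, "A"))
```

Only the first of these two changes is a missense variant. Deciding whether such a variant is actually harmful is the much harder question that AlphaMissense sets out to answer.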
While many are harmless, some are known to be involved in diseases, including cystic fibrosis, sickle cell disease, and cancer, according to a blog post on DeepMind’s website. So far, scientists have characterised the roles of only around 0.1% of missense mutations, and just 6% have been studied.
“With millions of possible mutations and limited experimental data, it’s largely still a mystery which ones could give rise to disease,” write DeepMind’s research scientists Žiga Avsec and Jun Cheng.
“Experiments to uncover disease-causing mutations are expensive and laborious – every protein is unique, and each experiment has to be designed separately, which can take months,” they add. “By using AI predictions, researchers can get a preview of results for thousands of proteins at a time, which can help to prioritise resources and accelerate more complex studies.”
Using the AlphaMissense AI, the team has estimated that 57% of missense mutations cause no harm, while around a third (32%) may be harmful, and the impact of the remaining 11% is uncertain.
The AlphaMissense catalogue could accelerate research into molecular biology and form the basis of research into new diagnostics and treatments, according to its developers. It builds on DeepMind’s work on AlphaFold, whose database of predicted three-dimensional protein structures, including those of the human proteome, was released in 2021.
The average person carries more than 9,000 missense variants, and classifying them is an important first step in understanding what can give rise to disease. The next step will be to understand exactly how harmful missense mutations cause disease in the hope of finding ways to fix the problems they cause.
Like AlphaFold, the new catalogue has been made freely available to the research community, along with gene-level pathogenicity predictions and a dataset of all 216 million possible single amino acid substitutions across 19,233 proteins found in humans. DeepMind has also open-sourced the model code for AlphaMissense.
A paper on the project has just been published in the journal Science.