AI predicts liver cancer risk with "high accuracy"
A machine learning (ML) tool that analyses electronic health record data, test results, and patient demographics can help clinicians identify people at high risk of hepatocellular carcinoma (HCC), the most common form of liver cancer.
That is the conclusion of a study published in the journal Cancer Discovery, which drew on data from the 500,000-person UK Biobank. The dataset includes 538 cases of HCC, more than two-thirds (69%) of which occurred in patients with no risk factors such as liver cirrhosis, viral hepatitis, or other chronic liver diseases.
The researchers trained their models on 80% of the data from the UK Biobank and performed an initial validation on the remaining 20%, with an additional validation carried out against the 400,000-person All of Us registry in the US, which includes 445 HCC cases.
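For readers curious what an 80/20 split looks like in practice, here is a minimal Python sketch on synthetic labels. It is an illustration only, not the study's code: the function name `stratified_split`, the fixed seed, the toy cohort, and the choice to split cases and non-cases separately (stratification, which keeps rare outcomes represented in both sets) are all assumptions, since the article does not say how the researchers partitioned the data.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, val_idx), splitting each label group separately.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(set(labels)):
        # Collect the positions of every participant with this label.
        idx = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idx)
        # Send the first 80% of each group to training, the rest to validation.
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        val_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(val_idx)


# Toy cohort (synthetic): 1,000 participants, 10 of them cases (label 1).
labels = [1] * 10 + [0] * 990
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

With an outcome as rare as HCC, a purely random split could by chance leave very few cases in the smaller set, which is why the sketch splits the two groups separately.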
The overall aim of the study is to improve on the current approach to identifying people at risk of HCC, which focuses mainly on a narrow, high-risk population, using imaging and blood-based cancer screening, and can miss many at-risk individuals, according to co-lead investigator Carolin Schneider of RWTH Aachen University in Germany.
"Screening is typically recommended for patients with confirmed liver cirrhosis or severe liver disease, since many cases of HCC occur in these patients, but there are many individuals with undiagnosed cirrhosis or other risk factors who might also benefit," she said.
One version (Model C) of their PRE-Screen-HCC algorithm looked at a broad range of data – including demographics, lifestyle, health records, and blood tests – and was found to stratify individual risk of developing HCC on a population scale "with high accuracy," according to the study authors.
Interestingly, adding genomics and/or metabolomics data, which could be challenging to collect at a population level, did not substantially increase its performance.
"This showed that we can predict HCC risk using simple, readily available data without the need for complex and expensive genetic sequencing," said Schneider, adding that this feature increases the model's potential for widespread use, particularly in resource-limited settings.
HCC is the fifth most common malignancy and the third leading cause of cancer-associated death worldwide, and its incidence is rising, driven by increasing rates of liver disease, which makes it a major public health concern.
Although trained predominantly on data from white participants in the UK Biobank, PRE-Screen-HCC maintained its performance when evaluated specifically in the non-white subgroup of the more ethnically diverse All of Us cohort, the researchers said, a result that bodes well for its potential across different populations.
First author Jan Clusmann, from the Technical University in Dresden in Germany, said: "With so many factors impacting risk, there is an urgent need for effective tools to help clinicians identify high-risk patients. Machine learning tools that can simultaneously work with different types of clinical data could be particularly useful for this major clinical challenge."
