Platform tests bias in commercial algorithms developed for NHS use


UK researchers have developed a platform that can test whether commercial AI algorithms developed for NHS applications are fit for purpose and free of bias.

The platform was initially put to work assessing whether eight AI systems can detect diabetic eye disease by identifying signs of blood vessel damage at the back of the eye "in a fair, equitable, transparent, and trustworthy way", drawing on 1.2 million retinal images from over 200,000 screening visits in the North-East London Diabetic Eye Screening Programme.

The result? The team concluded that all eight passed the test. The algorithms' performance was compared against grading of the same images by up to three humans, who followed the standard protocol used by the NHS, with the analysis carried out by independent researchers in a trusted research environment.

The study, published in The Lancet Digital Health, found that accuracy across the eight AI algorithms in identifying diabetic eye disease potentially in need of clinical intervention ranged from 83.7% to 98.7%. A prior study of manual grading of images by humans found accuracy rates of 75% to 98%.

The platform aims to counter the risk that companies allow biases to creep into the supporting data they submit when seeking NHS backing for their AIs, and it may now be applied to other AI applications.

Prof Alicja Rudnicka at City St George's, University of London, who co-led the study with Adnan Tufail at Moorfields Eye Hospital NHS Foundation Trust, said their approach "delivers the world's first fair, equitable, and transparent evaluation of AI systems to detect sight-threatening diabetic eye disease."

Software as a medical device (SaMD) technology has rarely been assessed for algorithmic fairness on a large scale, particularly across different populations and ethnicities, according to the team.

"This depth of AI scrutiny is far higher than that ever given to human performance," added Rudnicka. "We've shown that these AI systems are safe for use in the NHS by using enormous data sets and, most importantly, showing that they work well across different ethnicities and age groups."

In 2021, health technology assessment (HTA) agency NICE published advice that said three AIs for detecting diabetic retinopathy could be trialled by the NHS, but stopped short of recommending them for routine commissioning.

With more than 4 million people in the UK living with diabetes and needing regular eye checks, the study points to a way to benchmark AI systems for detecting sight-threatening diabetic eye disease ahead of a potential mass rollout, said Tufail, a consultant ophthalmic surgeon at Moorfields.

"The approach we have developed paves the way for safer, smarter AI adoption across many healthcare applications," he added.

The team suggest new applications of the approach could include the assessment of AIs for chronic conditions such as cancer and heart disease.

Image by Paul Diaconu from Pixabay