Controversial healthcare app maker Babylon Health has criticised the doctor who first raised concerns about the safety of their AI chatbot.
Babylon Health’s chatbot is available in the company’s GP at Hand app, a digital healthcare solution championed by health secretary Matt Hancock that was also integrated into Samsung Health since last year.
The chatbot aims to reduce the burden on GPs and A&E departments by automating the triage process to determine whether someone can treat themselves at home, should book an online or in-person GP appointment, or go straight to a hospital.
A Twitter user under the pseudonym of Dr Murphy first reached out to us back in 2018 alleging that Babylon Health’s chatbot was giving unsafe advice. Dr Murphy recently unveiled himself as Dr David Watkins and went public with his findings at The Royal Society of Medicine’s “Recent developments in AI and digital health 2020“ event in addition to appearing on a BBC Newsnight report.
Over the past couple of years, Dr Watkins has provided many examples of the chatbot giving dangerous advice. In one example, an obese 48-year-old heavy smoker patient who presented himself with chest pains was suggested to book a consultation “in the next few hours”. Anyone with any common sense would have told you to dial an emergency number straight away.
This particular issue has since been rectified but Dr Watkins has highlighted many further examples over the years which show, very clearly, there are serious safety issues.
In a press release (PDF) on Monday, Babylon Health calls Dr Watkins a “troll” who has “targeted members of our staff, partners, clients, regulators and journalists and tweeted defamatory content about us”.
According to the release, Dr Watkins has conducted 2,400 tests of the chatbot in a bid to discredit the service while raising “fewer than 100 test results which he considered concerning”.
Babylon Health claims that in just 20 cases did Dr Watkins find genuine errors while others were “misrepresentations” or “mistakes,” according to Babylon’s own “panel of senior clinicians” who remain unnamed.
Speaking to TechCrunch, Dr Watkins called Babylon’s claims “utterly nonsense” and questions where the startup got its figures from as “there are certainly not 2,400 completed triage assessments”.
Dr Watkins estimates he has conducted between 800 and 900 full triages, some of which were repeat tests to see whether Babylon Health had fixed the issues he previously highlighted.
The doctor acknowledges Babylon Health’s chatbot has improved and has issues around the rate of around one in three instances. In 2018, when Dr Watkins first reached out to us and other outlets, he says this rate was “one in one”.
While it’s one account versus the other, the evidence shows that Babylon Health’s chatbot has issued dangerous advice on a number of occasions. Dr Watkins has dedicated many hours to highlighting these issues to Babylon Health in order to improve patient safety.
Rather than welcome his efforts and work with Dr Watkins to improve their service, it seems Babylon Health has decided to go on the offensive and “try and discredit someone raising patient safety concerns”.
In their press release, Babylon accuses Watkins of posting “over 6,000” misleading attacks but without giving details of where. Dr Watkins primarily uses Twitter to post his findings. His account, as of writing, has tweeted a total of 3,925 times and not just about Babylon’s service.
This isn’t the first time Babylon Health’s figures have come into question. Back in June 2018, Babylon Health held an event where it boasted its AI beat trainee GPs at the MRCGP exam used for testing their ability to diagnose medical problems. The average pass mark is 72 percent. “How did Babylon Health do?” said Dr Mobasher Butt at the event, a director at Babylon Health. “It got 82 percent.”
Given the number of dangerous suggestions to trivial ailments the chatbot has given, especially at the time, it’s hard to imagine the claim that it beats trainee GPs as being correct. Intriguingly, the video of the event has since been deleted from Babylon Health’s YouTube account and the company removed all links to coverage of it from the “Babylon in the news” part of its website.
When asked why it deleted the content, Babylon Health said in a statement: “As a fast-paced and dynamic health-tech company, Babylon is constantly refreshing the website with new information about our products and services. As such, older content is often removed to make way for the new.”
AI solutions like those offered by Babylon Health will help to reduce the demand on health services and ensure people have access to the right information and care whenever and wherever they need it. However, patient safety must come first.
Mistakes are less forgivable in healthcare due to the risk of potentially fatal or lifechanging consequences. The usual “move fast and break things” ethos in tech can’t apply here.
There’s a general acceptance that rarely is a new technology going to be without its problems, but people want to see that best efforts are being made to limit and address those issues. Instead of welcoming those pointing out issues with their service before it leads to a serious incident, it seems Babylon Health would rather blame everyone else for its faults.
Interested in hearing industry leaders discuss subjects like this? Attend the co-located 5G Expo, IoT Tech Expo, Blockchain Expo, AI & Big Data Expo, and Cyber Security & Cloud Expo World Series with upcoming events in Silicon Valley, London, and Amsterdam.