Algorithmic de-skilling of clinical decision-makers

What will we do when we don’t drive most of the time but have a car that hands control to us during an extreme event?

Agrawal, A., Gans, J. & Goldfarb, A. (2018). Prediction Machines: The Simple Economics of Artificial Intelligence.

Before I get to the take-home message, I need to set this up a bit. The way that machine intelligence currently works is that you train an algorithm to recognise patterns in large data sets, often with the help of people who annotate the data in advance. This is known as supervised learning. Sometimes the algorithm is given no annotated examples at all (i.e. no supervision); instead, it acts, its output is scored against some criterion, and it adjusts its behaviour to improve that score. This is known as reinforcement learning.

In both cases, the algorithm isn’t trained in the wild but is rather developed within a constrained environment that simulates something of interest in the real world. For example, an algorithm may be trained to deal with uncertainty by playing StarCraft, which mimics the imperfect information state of real-world decision-making. This kind of probabilistic thinking defines many professional decision-making contexts where we have to make a choice but may only be 70% confident that we’re making the right one.

Eventually, you need to take the algorithm out of the simulated training environment and run it in the real world because this is the only way to find out if it will do what you want it to. In the context of self-driving cars, this represents a high-stakes tradeoff between the benefits of early implementation (more real-world data gathering, more accurate predictions, better autonomous driving capability), and the risks of making the wrong decision (people might die).

Even in a scenario where the algorithm has been trained to very high levels in simulation and then introduced at precisely the right time so as to maximise the learning potential while also minimising risk, it will still hardly ever have been exposed to rare events. We will be in the situation where cars will have autonomy in almost all driving contexts, except those where there is a real risk of someone being hurt or killed. At that moment, because of the limitations of its training, it will hand control of the vehicle back to the driver. And there is the problem. How long will it take for drivers to lose the skills that are necessary for them to make the right choice in that rare event?

Which brings me to my point. Will we see the same loss of skills in the clinical context? Over time, algorithms will take over more and more of our clinical decision-making in much the same way that they’ll take over the responsibilities of a driver. And in almost all situations they’ll make more accurate predictions than a person. However, in some rare cases, the confidence level of the prediction will drop enough to lead to control being handed back to the clinician. Unfortunately, at this point, the clinician likely hasn’t been involved in clinical decision-making for an extended period and so, just when human judgement is determined to be most important, it may also be at its most limited.

How will clinicians maintain their clinical decision-making skills at the levels required to take over in rare events, when they are no longer involved in the day-to-day decision-making that hones that same skill?
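To make the handoff concrete, here is a minimal sketch of how a system might route cases between the algorithm and the clinician based on prediction confidence. Everything in it is an assumption for illustration: the `Prediction` type, the `route` and `handoff_rate` functions, the example cases and the 0.70 threshold are all hypothetical, and real systems will decide when to defer in more sophisticated ways.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical threshold: below this confidence the system defers to the clinician.
HANDOFF_THRESHOLD = 0.70


@dataclass
class Prediction:
    case_id: str
    diagnosis: str
    confidence: float  # the model's estimated probability, between 0 and 1


def route(prediction: Prediction, threshold: float = HANDOFF_THRESHOLD) -> str:
    """Return who makes the final call for this case."""
    if prediction.confidence >= threshold:
        return "algorithm"
    return "clinician"


def handoff_rate(predictions: list[Prediction], threshold: float = HANDOFF_THRESHOLD) -> float:
    """Fraction of cases that are handed back to the clinician."""
    if not predictions:
        return 0.0
    handed_back = sum(1 for p in predictions if route(p, threshold) == "clinician")
    return handed_back / len(predictions)


# Made-up cases: three routine ones and one the model is unsure about.
cases = [
    Prediction("a", "fracture", 0.98),
    Prediction("b", "no finding", 0.95),
    Prediction("c", "fracture", 0.91),
    Prediction("d", "unclear mass", 0.55),
]

print([route(p) for p in cases])
print(handoff_rate(cases))
```

The better the algorithm gets, the smaller the handoff rate becomes, and the fewer cases the clinician sees. The cases that do come back are, by construction, the ones the algorithm found hardest.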

18 March 2019 Update: The Digital Doctor: Will surgeons lose their skills in the age of automation? AI Med.

Defensive Diagnostics: the legal implications of AI in radiology

Doctors are human. And humans make mistakes. And while scientific advancements have dramatically improved our ability to detect and treat illness, they have also engendered a perception of precision, exactness and infallibility. When patient expectations collide with human error, malpractice lawsuits are born. And it’s a very expensive problem.

Source: Defensive Diagnostics: the legal implications of AI in radiology

There are a few things to note in this article. The first, and most obvious, is that we hold AI-based expert systems (i.e. algorithmic diagnosis and prediction) to a much higher standard than we do human experts such as physicians. It seems strange that we accept the fallibility of human beings but expect nothing less than perfection from AI-based systems. [1]

Medical errors are more frequent than anyone cares to admit. In radiology, the retrospective error rate is approximately 30% across all specialities, with real-time error rates in daily practice averaging between 3% and 5%.

The second takeaway is that one of the most significant areas of influence for AI in clinical settings may not be in the primary diagnosis but rather in the follow-up analysis that highlights potential mistakes the clinician may have made. These applications of AI for secondary diagnostic review will be cheap and won’t add to the workload of healthcare professionals. They will simply review the clinician’s conclusion and flag those cases that may benefit from additional testing. Of course, this will probably be driven by patient litigation.
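As a rough illustration of what secondary review could look like, the sketch below flags only those cases where a model confidently disagrees with the clinician’s conclusion. The `Report` structure, the `flag_for_review` function, the 0.9 confidence cut-off and the sample data are all assumptions made for the example, not a description of any real product.

```python
from typing import NamedTuple


class Report(NamedTuple):
    case_id: str
    clinician_finding: str
    model_finding: str
    model_confidence: float  # between 0 and 1


def flag_for_review(reports, min_confidence=0.9):
    """Return the IDs of cases where the model confidently disagrees with the clinician."""
    return [
        r.case_id
        for r in reports
        if r.model_finding != r.clinician_finding
        and r.model_confidence >= min_confidence
    ]


# Made-up reports: agreement, confident disagreement, and unsure disagreement.
reports = [
    Report("r1", "normal", "normal", 0.97),
    Report("r2", "normal", "nodule", 0.93),
    Report("r3", "fracture", "normal", 0.60),
]

print(flag_for_review(reports))
```

Only the second report is flagged. The clinician’s primary read is left untouched, and the low-confidence disagreement is ignored so that it doesn’t generate extra work.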

[1] Incidentally, the same principle seems to be true for self-driving cars; we expect nothing but a perfect safety record from autonomous vehicles but are quite happy with the status quo for human drivers (1.2 million traffic-related deaths in a single year). Where is the moral panic around the mass slaughter of human beings by human drivers? Even an algorithm that is only slightly safer than a human being behind the wheel of a car would result in thousands fewer deaths per year. And yet it feels like we’re going to delay the introduction of autonomous cars until they meet some perfect standard. To me at least, that seems morally wrong.
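A back-of-the-envelope check on the "thousands fewer deaths" claim, using the 1.2 million figure above. It assumes, unrealistically, that every journey is made by the safer system and that deaths fall in direct proportion to the safety improvement. The `deaths_avoided` function and the percentages tried are hypothetical.

```python
# Figure quoted in the text: traffic-related deaths in a single year.
ANNUAL_DEATHS = 1_200_000


def deaths_avoided(percent_safer: int, annual_deaths: int = ANNUAL_DEATHS) -> int:
    """Deaths avoided per year if driving as a whole were percent_safer% less deadly."""
    return annual_deaths * percent_safer // 100


for pct in (1, 5, 10):
    print(f"{pct}% safer: {deaths_avoided(pct):,} fewer deaths per year")
```

Under those assumptions, even a 1% improvement corresponds to 12,000 fewer deaths per year.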