…the only really useful value of artificial intelligence in chest radiography is, at best, to provide triage support — tell us what is normal and what is not, and highlight where it could possibly be abnormal. Just don’t try and claim that AI can definitively tell us what the abnormality is, because it can’t do so any more accurately than we can because the data is dirty because we made it thus.
This is a generally good article on the challenges of using poorly annotated medical data to train machine learning algorithms. However, there are three relevant points that the author doesn’t address at all:
- He assumes that algorithms will only be trained on chest images that have been annotated by human beings. They won’t. In fact, I can’t see why anyone would do this, for exactly the reasons he states. What is more likely is that AI will look across a wide range of clinical data points and use those other signals in combination with the CXR to determine a diagnosis. So, if the (actual) diagnosis is a cardiac issue, you’d expect the image to correlate with cardiac markers and assign less weight to infection markers. Likewise, if the diagnosis were pneumonia, you’d see changes in infection markers but little weight assigned to cardiac information. In other words, the analysis of CXRs won’t be informed by human-annotated reports; it’ll happen through correlation with all the other clinical information gathered from the patient.
- He starts out by presenting a really detailed argument explaining the incredibly low inter-rater reliability, inaccuracy and weak validity of human judges (in this case, radiologists) when it comes to analysing chest X-rays. But he ends by saying that we should leave the interpretation to them anyway, rather than to algorithms.
- He is a radiologist, which should at least give one pause, considering that his final recommendation is to leave things to the radiologists.
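The multi-signal idea in the first point can be sketched with a toy weighted model. Everything here — feature names, weights, bias terms, patient values — is invented for illustration and not drawn from any real system:

```python
# Hypothetical sketch: combine a CXR image score with other clinical markers,
# letting learned per-diagnosis weights decide how much each signal contributes.
# A cardiac diagnosis leans on a cardiac marker (troponin); pneumonia leans on
# an infection marker (CRP). All numbers here are made up.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

WEIGHTS = {
    "cardiac":   {"cxr_score": 1.0, "troponin": 2.5, "crp": 0.2, "bias": -2.0},
    "pneumonia": {"cxr_score": 1.0, "troponin": 0.1, "crp": 2.5, "bias": -2.0},
}

def diagnose(features: dict) -> dict:
    """Return a probability-like score per candidate diagnosis."""
    scores = {}
    for dx, w in WEIGHTS.items():
        z = w["bias"] + sum(w[k] * features.get(k, 0.0) for k in features)
        scores[dx] = sigmoid(z)
    return scores

# A patient whose CXR is ambiguous but whose infection markers are raised:
patient = {"cxr_score": 0.5, "troponin": 0.1, "crp": 1.8}
scores = diagnose(patient)
print(scores)  # pneumonia scores higher than cardiac
```

The point of the sketch is only that the image score is one input among several; the surrounding clinical data does much of the work of disambiguating it, with no human-annotated report in the loop.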
These points aside, the author makes an excellent case for why we need to make sure that medical data are clean and annotated with machine-readable tags. Well worth a read.
The conversation about unconscious bias in artificial intelligence often focuses on algorithms that unintentionally cause disproportionate harm to entire swaths of society…But the problem could run much deeper than that. Society should be on guard for another twist: the possibility that nefarious actors could seek to attack artificial intelligence systems by deliberately introducing bias into them, smuggled inside the data that helps those systems learn.
Source: Yeung, D. (2018). When AI Misjudgment Is Not an Accident.
I’m not sure how this might apply to clinical practice but, given our propensity for automation bias, it seems that this is the kind of thing that we need to be aware of. It’s not just that algorithms will make mistakes but that people may intentionally set them up to do so by introducing biased data into the training dataset. Instead of hacking into databases to steal data, we may start seeing database hacks that insert new data into them, with the intention of changing our behaviour.
What this suggests is that bias is a systemic challenge—one requiring holistic solutions. Proposed fixes to unintentional bias in artificial intelligence seek to advance workforce diversity, expand access to diversified training data, and build in algorithmic transparency (the ability to see how algorithms produce results).
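To make the poisoning idea concrete, here is a toy sketch (mine, not the article’s) of how a handful of deliberately mislabelled training points can flip a simple model’s decision on a borderline case. The data and labels are entirely made up:

```python
# Toy data-poisoning demo: a nearest-centroid classifier over 2-D points.
# Flipping the labels of a few points near the boundary shifts a class
# centroid enough to change the decision for a borderline input.

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def classify(x, data):
    """Assign x to the class whose centroid is nearest (squared distance)."""
    cents = {
        label: centroid([p for p, l in data if l == label])
        for label in {l for _, l in data}
    }
    return min(cents, key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, cents[lab])))

clean = [((0.0, 0.0), "benign"), ((1.0, 0.0), "benign"),
         ((4.0, 0.0), "malign"), ((5.0, 0.0), "malign")]
# An attacker inserts benign-looking points mislabelled as malign:
poisoned = clean + [((2.4, 0.0), "malign"), ((2.2, 0.0), "malign")]

borderline = (2.0, 0.0)
print(classify(borderline, clean))     # benign
print(classify(borderline, poisoned))  # malign
```

Nothing about the model changed; only the training data did — which is exactly why this kind of attack is hard to spot after the fact.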
“We essentially gathered hateful tweets and used language processing to find the other terms that were associated with such messages… We learned these terms and used them as the bridge to new terms—as long as we have those words, we have a link to anything they can come up with.” This defeats attempts to conceal racist slurs using codes by targeting the language that makes up the cultural matrix from which the hate emerges, instead of just seeking out keywords. Even if the specific slurs used by racists change in order to escape automated comment moderation, the other terms they use to identify themselves and their communities likely won’t.
Source: Pearson, J. (2017). AI Can Now Identify Racist Code Words on Social Media.
There are a few things worth noting:
- The developers of this algorithm used Tweets to identify the hateful language, which says something about the general quality of discourse on Twitter.
- The algorithm isn’t simply substituting one set of keywords with another; it identifies the context of the sentence in order to determine if the sentiment is hateful. The specific words almost don’t matter. This is a significant step in natural language processing.
- The post appeared in 2017 so it’s a year old and I haven’t looked to see what (if any) progress has been made since then.
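As a rough sketch of the co-occurrence idea — not the researchers’ actual method, which used more sophisticated language processing — you could surface candidate code words by counting which terms keep appearing alongside a seed list of known slurs. The messages and terms below are invented placeholders:

```python
# Seed-based term expansion: terms that repeatedly co-occur with known
# hateful terms become candidates themselves, so the vocabulary can drift
# without escaping detection.
from collections import Counter

def expand_terms(messages, seeds, min_count=2):
    """Return non-seed terms co-occurring with seed terms >= min_count times."""
    co = Counter()
    for msg in messages:
        words = set(msg.lower().split())
        if words & seeds:              # message contains at least one seed term
            for w in words - seeds:
                co[w] += 1
    return {w for w, c in co.items() if c >= min_count}

messages = [
    "slur1 codeword1 rally tonight",
    "codeword1 slur2 meet up",
    "nice weather today",
]
seeds = {"slur1", "slur2"}
print(expand_terms(messages, seeds))  # {'codeword1'}
```

Even this crude version captures the key property described in the article: the specific slurs can change, but the community vocabulary around them gives the game away.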
This is the point at which the risk from medical AI becomes much greater. Our inability to explain exactly how AI systems reach certain decisions is well-documented. And, as we’ve seen with self-driving car crashes, when humans take our hands off the wheel, there’s always a chance that a computer will make a fatal error in judgment.
Source: Vincent, J. (2018). DeepMind’s AI can detect over 50 eye diseases as accurately as a doctor.
This is just lazy. “When humans take our hands off the wheel…”? OK, then who is responsible for all the death and suffering that happens when our hands are on the wheel? Thinking about this for three seconds should make it clear that human beings are responsible for almost all human deaths. Getting to the point where we take our hands off the wheel (and off the scalpel, the prescription chart, and the stock exchange) could be the safest thing we ever do.
Also, DeepMind has moved on from only being able to diagnose diabetic retinopathy, to accurately identifying 50 different conditions of the eye. Tomorrow, it’ll be more.