…the only really useful value of artificial intelligence in chest radiography is, at best, to provide triage support — tell us what is normal and what is not, and highlight where it could possibly be abnormal. Just don’t try and claim that AI can definitively tell us what the abnormality
is, because it can’t do so any more accurately than we can because the data is dirty because we made it thus.
This is a generally good article on the challenges of using poorly annotated medical data to train machine learning algorithms. However, there are three points that I think are relevant, which the author doesn’t address at all:
- He assumes that algorithms will only be trained using chest images that have been annotated by human beings. They won’t. In fact, I can’t see why anyone would do this anyway, for exactly the reasons he states. What is more likely is that AI will look across a wide range of clinical data points and use them in association with the CXR to determine a diagnosis. So, if the (actual) diagnosis is a cardiac issue, you’d expect the algorithm to correlate the image with cardiac markers and assign less weight to infection markers. Likewise, if the diagnosis is pneumonia, you’d expect changes in infection markers to carry most of the weight, with little assigned to cardiac information. In other words, the analysis of CXRs won’t be informed by human-annotated reports; it’ll happen through correlation with all the other clinical information gathered from the patient.
- He starts out by presenting a really detailed argument explaining the incredibly low inter-rater reliability, inaccuracy and weak validity of human judges (in this case, radiologists) when it comes to analysing chest X-rays, but then ends by saying that we should leave the interpretation to them anyway, rather than to algorithms.
- He is a radiologist, which should at least give one pause, considering that the final recommendation is to leave things to the radiologists.
These points aside, the author makes an excellent case for why we need to make sure that medical data are clean and annotated with machine-readable tags. Well worth a read.