Research project exploring clinicians’ perspectives on the introduction of ML into clinical practice

I recently received ethics clearance to begin an explorative study looking at how physiotherapists think about the introduction of machine learning into clinical practice. The study will use an international survey and a series of interviews to gather data on clinicians’ perspectives on questions like the following:

  • What aspects of clinical practice are vulnerable to automation?
  • How do we think about trust when it comes to AI-based clinical decision support?
  • What is the role of the clinician in guiding the development of AI in clinical practice?

I’m busy finalising the questionnaire and hope to have the survey up and running in a couple of weeks, with more focused interviews following. If these kinds of questions interest you and you’d like to have a say in answering them, keep an eye out for a call to respond.

Here is the study abstract (contact me if you’d like more detailed information):

Background: Artificial intelligence (AI) is a branch of computer science that aims to embed intelligent behaviour into software in order to achieve certain objectives. Increasingly, AI is being integrated into a variety of healthcare and clinical applications and there is significant research and funding being directed at improving the performance of these systems in clinical practice. Clinicians in the near future will find themselves working with information networks on a scale well beyond the capacity of human beings to grasp, thereby necessitating the use of intelligent machines to analyse and interpret the complex interactions of data, patients and clinical decision-making.

Aim: In order to ensure that we successfully integrate machine intelligence with the essential human characteristics of empathic, caring and creative clinical practice, we need to first understand how clinicians perceive the introduction of AI into professional practice.

Methods: This study will make use of an explorative design to gather qualitative data via an online survey and a series of interviews. The survey questionnaire will be self-administered and piloted to assess validity and identify ambiguous items, and the interview guide will be informed by the study aim. The population for both the survey and the interviews will consist of physiotherapy clinicians from around the world. As this is an explorative study with a convenience sample, no a priori sample size will be calculated.

Article published – An introduction to machine learning for clinicians

It’s a nice coincidence that my article on machine learning for clinicians has been published at around the same time that my poster on a similar topic was presented at WCPT. I’m quite happy with this paper and think it offers a useful, clinically focused overview of machine learning that will help clinicians make sense of what is at times a confusing topic. The mainstream media (and, to be honest, many academics) conflate a wide variety of terms when they talk about artificial intelligence, and this paper goes some way towards providing background information for anyone interested in how this will affect clinical work. You can download the preprint here.


Abstract

The technology at the heart of the most innovative progress in health care artificial intelligence (AI) is in a sub-domain called machine learning (ML), which describes the use of software algorithms to identify patterns in very large data sets. ML has driven much of the progress of health care AI over the past five years, demonstrating impressive results in clinical decision support, patient monitoring and coaching, surgical assistance, patient care, and systems management. Clinicians in the near future will find themselves working with information networks on a scale well beyond the capacity of human beings to grasp, thereby necessitating the use of intelligent machines to analyze and interpret the complex interactions between data, patients, and clinical decision-makers. However, as this technology becomes more powerful it also becomes less transparent, and algorithmic decisions are therefore increasingly opaque. This is problematic because computers will increasingly be asked for answers to clinical questions that have no single right answer, are open-ended, subjective, and value-laden. As ML continues to make important contributions in a variety of clinical domains, clinicians will need to have a deeper understanding of the design, implementation, and evaluation of ML to ensure that current health care is not overly influenced by the agenda of technology entrepreneurs and venture capitalists. The aim of this article is to provide a non-technical introduction to the concept of ML in the context of health care, the challenges that arise, and the resulting implications for clinicians.

WCPT poster: Introduction to machine learning in healthcare

It’s a bit content-heavy and not as graphic-y as I’d like, but c’est la vie.

I’m quite proud of what I think is an innovation in poster design: the addition of a tl;dr column before the findings. In other words, if you only have 30 seconds to look at the poster, that’s the bit you want to focus on. Related to this, I’ve also moved the Background, Methods and Conclusion sections to the bottom and made them smaller, so as to emphasise the Findings, which are placed first.

My full-size poster on machine learning in healthcare for the 2019 WCPT conference in Geneva.

Reference list (download this list as a Word document)

  1. Yang, C. C., & Veltri, P. (2015). Intelligent healthcare informatics in big data era. Artificial Intelligence in Medicine, 65(2), 75–77. https://doi.org/10.1016/j.artmed.2015.08.002
  2. Qayyum, A., Anwar, S. M., Awais, M., & Majid, M. (2017). Medical image retrieval using deep convolutional neural network. Neurocomputing, 266, 8–20. https://doi.org/10.1016/j.neucom.2017.05.025
  3. Li, Z., Zhang, X., Müller, H., & Zhang, S. (2018). Large-scale retrieval for medical image analytics: A comprehensive review. Medical Image Analysis, 43, 66–84. https://doi.org/10.1016/j.media.2017.09.007
  4. Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., & Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118. https://doi.org/10.1038/nature21056
  5. Pratt, H., Coenen, F., Broadbent, D. M., Harding, S. P., & Zheng, Y. (2016). Convolutional Neural Networks for Diabetic Retinopathy. Procedia Computer Science, 90, 200–205. https://doi.org/10.1016/j.procs.2016.07.014
  6. Ramzan, M., Shafique, A., Kashif, M., & Umer, M. (2017). Gait Identification using Neural Network. International Journal of Advanced Computer Science and Applications, 8(9). https://doi.org/10.14569/IJACSA.2017.080909
  7. Kidziński, Ł., Delp, S., & Schwartz, M. (2019). Automatic real-time gait event detection in children using deep neural networks. PLOS ONE, 14(1), e0211466. https://doi.org/10.1371/journal.pone.0211466
  8. Horst, F., Lapuschkin, S., Samek, W., Müller, K.-R., & Schöllhorn, W. I. (2019). Explaining the Unique Nature of Individual Gait Patterns with Deep Learning. Scientific Reports, 9(1), 2391. https://doi.org/10.1038/s41598-019-38748-8
  9. Cai, T., Giannopoulos, A. A., Yu, S., Kelil, T., Ripley, B., Kumamaru, K. K., … Mitsouras, D. (2016). Natural Language Processing Technologies in Radiology Research and Clinical Applications. RadioGraphics, 36(1), 176–191. https://doi.org/10.1148/rg.2016150080
  10. Jackson, R. G., Patel, R., Jayatilleke, N., Kolliakou, A., Ball, M., Gorrell, G., … Stewart, R. (2017). Natural language processing to extract symptoms of severe mental illness from clinical text: The Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project. BMJ Open, 7(1), e012012. https://doi.org/10.1136/bmjopen-2016-012012
  11. Kreimeyer, K., Foster, M., Pandey, A., Arya, N., Halford, G., Jones, S. F., … Botsis, T. (2017). Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. Journal of Biomedical Informatics, 73, 14–29. https://doi.org/10.1016/j.jbi.2017.07.012
  12. Montenegro, J. L. Z., Da Costa, C. A., & Righi, R. da R. (2019). Survey of Conversational Agents in Health. Expert Systems with Applications. https://doi.org/10.1016/j.eswa.2019.03.054
  13. Carrell, D. S., Schoen, R. E., Leffler, D. A., Morris, M., Rose, S., Baer, A., … Mehrotra, A. (2017). Challenges in adapting existing clinical natural language processing systems to multiple, diverse health care settings. Journal of the American Medical Informatics Association, 24(5), 986–991. https://doi.org/10.1093/jamia/ocx039
  14. Oña, E. D., Cano-de la Cuerda, R., Sánchez-Herrera, P., Balaguer, C., & Jardón, A. (2018). A Review of Robotics in Neurorehabilitation: Towards an Automated Process for Upper Limb. Journal of Healthcare Engineering, 2018, 1–19. https://doi.org/10.1155/2018/9758939
  15. Krebs, H. I., & Volpe, B. T. (2015). Robotics: A Rehabilitation Modality. Current Physical Medicine and Rehabilitation Reports, 3(4), 243–247. https://doi.org/10.1007/s40141-015-0101-6
  16. Leng, M., Liu, P., Zhang, P., Hu, M., Zhou, H., Li, G., … Chen, L. (2019). Pet robot intervention for people with dementia: A systematic review and meta-analysis of randomized controlled trials. Psychiatry Research, 271, 516–525. https://doi.org/10.1016/j.psychres.2018.12.032
  17. Piatt, J., Nagata, S., Šabanović, S., Cheng, W.-L., Bennett, C., Lee, H. R., & Hakken, D. (2017). Companionship with a robot? Therapists’ perspectives on socially assistive robots as therapeutic interventions in community mental health for older adults. American Journal of Recreation Therapy, 15(4), 29–39. https://doi.org/10.5055/ajrt.2016.0117
  18. Troccaz, J., Dagnino, G., & Yang, G.-Z. (2019). Frontiers of Medical Robotics: From Concept to Systems to Clinical Translation. Annual Review of Biomedical Engineering, 21(1). https://doi.org/10.1146/annurev-bioeng-060418-052502
  19. Riek, L. D. (2017). Healthcare Robotics. ArXiv:1704.03931 [Cs]. Retrieved from http://arxiv.org/abs/1704.03931
  20. Kappassov, Z., Corrales, J.-A., & Perdereau, V. (2015). Tactile sensing in dexterous robot hands — Review. Robotics and Autonomous Systems, 74, 195–220. https://doi.org/10.1016/j.robot.2015.07.015
  21. Choi, C., Schwarting, W., DelPreto, J., & Rus, D. (2018). Learning Object Grasping for Soft Robot Hands. IEEE Robotics and Automation Letters, 3(3), 2370–2377. https://doi.org/10.1109/LRA.2018.2810544
  22. Shortliffe, E., & Sepulveda, M. (2018). Clinical Decision Support in the Era of Artificial Intelligence. Journal of the American Medical Association.
  23. Attema, T., Mancini, E., Spini, G., Abspoel, M., de Gier, J., Fehr, S., … Sloot, P. M. A. (n.d.). A new approach to privacy-preserving clinical decision support systems.
  24. Castaneda, C., Nalley, K., Mannion, C., Bhattacharyya, P., Blake, P., Pecora, A., … Suh, K. S. (2015). Clinical decision support systems for improving diagnostic accuracy and achieving precision medicine. Journal of Clinical Bioinformatics, 5(1). https://doi.org/10.1186/s13336-015-0019-3
  25. Gianfrancesco, M. A., Tamang, S., Yazdany, J., & Schmajuk, G. (2018). Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data. JAMA Internal Medicine, 178(11), 1544. https://doi.org/10.1001/jamainternmed.2018.3763
  26. Kliegr, T., Bahník, Š., & Fürnkranz, J. (2018). A review of possible effects of cognitive biases on interpretation of rule-based machine learning models. ArXiv:1804.02969 [Cs, Stat]. Retrieved from http://arxiv.org/abs/1804.02969
  27. Weng, S. F., Reps, J., Kai, J., Garibaldi, J. M., & Qureshi, N. (2017). Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLOS ONE, 12(4), e0174944. https://doi.org/10.1371/journal.pone.0174944
  28. Suresh, H., Hunt, N., Johnson, A., Celi, L. A., Szolovits, P., & Ghassemi, M. (2017). Clinical Intervention Prediction and Understanding using Deep Networks. ArXiv:1705.08498 [Cs]. Retrieved from http://arxiv.org/abs/1705.08498
  29. Vayena, E., Blasimme, A., & Cohen, I. G. (2018). Machine learning in medicine: Addressing ethical challenges. PLOS Medicine, 15(11), e1002689. https://doi.org/10.1371/journal.pmed.1002689
  30. Verghese, A., Shah, N. H., & Harrington, R. A. (2018). What This Computer Needs Is a Physician: Humanism and Artificial Intelligence. JAMA, 319(1), 19. https://doi.org/10.1001/jama.2017.19198

Comment: In competition, people get discouraged by competent robots

After each round, participants filled out a questionnaire rating the robot’s competence, their own competence and the robot’s likability. The researchers found that as the robot performed better, people rated its competence higher, its likability lower and their own competence lower.

Lefkowitz, M. (2019). In competition, people get discouraged by competent robots. Cornell Chronicle.

This is worth noting since it seems increasingly likely that we’ll soon be working not only with more competent robots, but also with more competent software. There are already concerns around how clinicians will respond to the recommendations of clinical decision-support systems, especially when those systems make suggestions that are at odds with the clinician’s intuition.

Paradoxically, the effect may be even worse with expert clinicians, who may not always be able to explain their decision-making. Novices, who use more analytical frameworks (or even basic algorithms like IF this, THEN that), may find it easier to modify their decisions because their reasoning is more “visible” (System 2). Experts, who rely more on subconscious pattern recognition (System 1), may be less able to identify where in their reasoning process they fell victim to confounders like confirmation or availability bias, and so may be less likely to modify their decisions.

It seems really clear that we need to start thinking about how we’re going to prepare current and future clinicians for the arrival of intelligent agents in the clinical context. If we start disregarding the recommendations of clinical decision support systems, not because they produce errors in judgement but because we simply don’t like them, then there’s a strong case to be made that it is the human that we cannot trust.


Contrast this with automation bias, which is the tendency to give more credence to decisions made by machines because of a misplaced notion that algorithms are simply more trustworthy than people.

Comment: Artificial intelligence turns brain activity into speech

People who have lost the ability to speak after a stroke or disease can use their eyes or make other small movements to control a cursor or select on-screen letters. (Cosmologist Stephen Hawking tensed his cheek to trigger a switch mounted on his glasses.) But if a brain-computer interface could re-create their speech directly, they might regain much more: control over tone and inflection, for example, or the ability to interject in a fast-moving conversation.

Servick, K. (2019). Artificial intelligence turns brain activity into speech. Science.

To be clear, this research doesn’t describe the artificial recreation of imagined speech, i.e. the internal speech that each of us hears as part of the personal monologue of our own subjective experience. Rather, it maps the electrical activity in the areas of the brain that are responsible for the articulation of speech as the participant reads or listens to sounds being played back to them. Nonetheless, it’s an important step for patients who have suffered damage to those areas of the brain responsible for speaking.

I also couldn’t help but get excited about the following: when electrical signals from the brain are converted into digital information (as they would have to be here, in order to do the analysis and speech synthesis), why not also transmit that digital information over wifi? If it’s possible for me to understand you “thinking about saying words”, instead of using your muscles of articulation to actually say them, how long will it be before you can send those words to me over a wireless connection?

Giving algorithms a sense of uncertainty could make them more ethical

The algorithm could handle this uncertainty by computing multiple solutions and then giving humans a menu of options with their associated trade-offs. Say the AI system was meant to help make medical decisions. Instead of recommending one treatment over another, it could present three possible options: one for maximizing patient life span, another for minimizing patient suffering, and a third for minimizing cost. “Have the system be explicitly unsure and hand the dilemma back to the humans.”

Hao, K. (2019). Giving algorithms a sense of uncertainty could make them more ethical. MIT Technology Review.

I think about clinical reasoning like this: it’s what we call the kind of probabilistic thinking where we take a bunch of (sometimes contradictory) data and try to make a decision that can have varying levels of confidence. For example, “If A, then probably D. But if A and B, then unlikely to be D. If C, then definitely not D”. Algorithms (and novice clinicians) are quite poor at this kind of reasoning, which is why algorithms have traditionally not been used for clinical decision-making and ethical reasoning (and why novice clinicians tend not to handle clinical uncertainty very well). But if it turns out that machine learning algorithms are able to manage conditions of uncertainty and provide a range of options that humans can act on, given a wide variety of preferences and contexts, then machines will be one step closer to doing our reasoning for us.
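To make this concrete, here’s a minimal sketch of the “menu of options” idea from the article, written in Python with entirely made-up options and numbers. Rather than collapsing everything into a single recommendation, the system discards only the options that are worse on every objective, and hands the remaining trade-offs back to the human.

```python
# A minimal sketch of the "menu of options" idea. All options, objectives
# and numbers are hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class Option:
    name: str
    life_span_gain: float  # expected years gained (illustrative)
    suffering: float       # 0 (none) to 1 (severe), illustrative
    cost: float            # relative cost, illustrative

options = [
    Option("Treatment A", life_span_gain=4.0, suffering=0.7, cost=0.9),
    Option("Treatment B", life_span_gain=2.5, suffering=0.2, cost=0.5),
    Option("Treatment C", life_span_gain=1.0, suffering=0.1, cost=0.2),
]

def dominated(a: Option, b: Option) -> bool:
    """True if b is at least as good as a on every objective, and not identical."""
    return (b.life_span_gain >= a.life_span_gain
            and b.suffering <= a.suffering
            and b.cost <= a.cost
            and (b.life_span_gain, b.suffering, b.cost)
                != (a.life_span_gain, a.suffering, a.cost))

# Keep only the non-dominated options: the "explicitly unsure" menu.
menu = [o for o in options if not any(dominated(o, other) for other in options)]
for o in menu:
    print(f"{o.name}: +{o.life_span_gain} years, suffering {o.suffering}, cost {o.cost}")
```

With these numbers no option dominates another, so all three come back with their trade-offs attached, which is exactly the point: the dilemma is handed back to the humans.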

Comment: Separating the Art of Medicine from Artificial Intelligence

…the only really useful value of artificial intelligence in chest radiography is, at best, to provide triage support — tell us what is normal and what is not, and highlight where it could possibly be abnormal. Just don’t try and claim that AI can definitively tell us what the abnormality is, because it can’t do so any more accurately than we can because the data is dirty because we made it thus.

This is a generally good article on the challenges of using poorly annotated medical data to train machine learning algorithms. However, there are three relevant points that the author doesn’t address at all:

  1. He assumes that algorithms will only be trained using chest images that have been annotated by human beings. They won’t. In fact, I can’t see why anyone would do this, for exactly the reasons he states. What is more likely is that AI will look across a wide range of clinical data points and use these in association with the CXR to determine a diagnosis. So, if the (actual) diagnosis is a cardiac issue, you’d expect the image to correlate with cardiac markers, with less weight assigned to infection markers. Likewise, if the diagnosis was pneumonia, you’d see changes in infection markers and little weight assigned to cardiac information. In other words, the analysis of CXRs won’t be informed by human-annotated reports; it’ll happen through correlation with all the other clinical information gathered from the patient (a toy sketch of this idea follows the list).
  2. He starts out by presenting a really detailed argument explaining the incredibly low inter-rater reliability, inaccuracy and weak validity of human judges (in this case, radiologists) when it comes to analysing chest X-rays, but then ends by saying that we should leave the interpretation to them anyway, rather than algorithms.
  3. He is a radiologist, which should at least give one pause when considering that his final recommendation is to leave things to the radiologists.
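To illustrate the first point, here’s a toy sketch (Python with scikit-learn, entirely fabricated data) of what correlating an image with other clinical data points might look like: image-derived features and lab markers are concatenated into a single feature vector per patient, and the learned weights show which blocks of information the model leans on. None of this comes from the article; the marker names and numbers are invented for illustration.

```python
# A speculative sketch of multimodal learning: image features are combined
# with other clinical data points instead of human-annotated reports.
# All data below are synthetic and illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 500
image_features = rng.normal(size=(n, 8))     # stand-in for CXR embeddings
cardiac_markers = rng.normal(size=(n, 2))    # e.g. troponin, BNP (invented)
infection_markers = rng.normal(size=(n, 2))  # e.g. CRP, white cell count (invented)

# Fabricated target: "cardiac" cases co-vary with the cardiac markers.
is_cardiac = (cardiac_markers[:, 0] - infection_markers[:, 0] > 0).astype(int)

# Concatenate all modalities into one feature vector per patient.
X = np.hstack([image_features, cardiac_markers, infection_markers])
model = LogisticRegression().fit(X, is_cardiac)

# Mean absolute weight per block (image, cardiac, infection) shows which
# modalities carry the signal for this fabricated diagnosis.
blocks = np.split(np.abs(model.coef_[0]), [8, 10])
print([float(b.mean()) for b in blocks])
```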

These points aside, the author makes an excellent case for why we need to make sure that medical data are clean and annotated with machine-readable tags. Well worth a read.

Algorithmic de-skilling of clinical decision-makers

What will we do when we don’t drive most of the time but have a car that hands control to us during an extreme event?

Agrawal, A., Gans, J. & Goldfarb, A. (2018). Prediction Machines: The Simple Economics of Artificial Intelligence.

Before I get to the take-home message, I need to set this up a bit. The way that machine intelligence currently works is that you train an algorithm to recognise patterns in large data sets, often with the help of people who annotate the data in advance. This is known as supervised learning. Sometimes the algorithm is given data sets with no annotations at all (i.e. no supervision) and left to find structure on its own, which is known as unsupervised learning. And sometimes it learns by acting and having its outputs scored against some criterion of success, which is known as reinforcement learning.
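For readers who prefer code, here’s a toy contrast between the first two regimes, using scikit-learn on synthetic data (reinforcement learning needs an interactive environment, so I’ve left it out):

```python
# Supervised vs. unsupervised learning on the same synthetic data set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Supervised: the algorithm sees human-provided annotations (y) and
# learns a mapping from features to labels.
clf = LogisticRegression().fit(X, y)
print("supervised accuracy on the training data:", clf.score(X, y))

# Unsupervised: the same data without labels; the algorithm has to find
# structure (here, two clusters) on its own.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", [int((clusters == c).sum()) for c in (0, 1)])
```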

In each case, the algorithm isn’t trained in the wild but is rather developed within a constrained environment that simulates something of interest in the real world. For example, an algorithm may be trained to deal with uncertainty by playing StarCraft, which mimics the imperfect-information state of real-world decision-making. This kind of probabilistic thinking defines many professional decision-making contexts, where we have to make a choice but may only be 70% confident that we’re making the right choice.

Eventually, you need to take the algorithm out of the simulated training environment and run it in the real world because this is the only way to find out if it will do what you want it to. In the context of self-driving cars, this represents a high-stakes tradeoff between the benefits of early implementation (more real-world data gathering, more accurate predictions, better autonomous driving capability), and the risks of making the wrong decision (people might die).

Even in a scenario where the algorithm has been trained to very high levels in simulation and then introduced at precisely the right time so as to maximise the learning potential while also minimising risk, it will still hardly ever have been exposed to rare events. We will be in the situation where cars will have autonomy in almost all driving contexts, except those where there is a real risk of someone being hurt or killed. At that moment, because of the limitations of its training, it will hand control of the vehicle back to the driver. And there is the problem. How long will it take for drivers to lose the skills that are necessary for them to make the right choice in that rare event?

Which brings me to my point. Will we see the same loss of skills in the clinical context? Over time, algorithms will take over more and more of our clinical decision-making, in much the same way that they’ll take over the responsibilities of a driver. And in almost all situations they’ll make more accurate predictions than a person. However, in some rare cases, the confidence level of the prediction will drop enough for control to be handed back to the clinician. Unfortunately, at this point the clinician likely hasn’t been involved in clinical decision-making for an extended period and so, just when human judgement is determined to be most important, it may also be at its most limited.
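As a minimal sketch of that handoff, imagine a classifier that is allowed to decide only when its top-class probability clears a confidence threshold, and otherwise defers to the clinician. The model, data and threshold below are all illustrative; a real threshold would need clinical validation.

```python
# Confidence-based handoff: the algorithm decides routine cases and
# defers the uncertain ones. Everything here is illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

THRESHOLD = 0.90  # illustrative; not a clinically validated value

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def decide(features):
    proba = model.predict_proba(features.reshape(1, -1))[0]
    if proba.max() >= THRESHOLD:
        return f"algorithm decides: class {proba.argmax()}"
    # The rare, uncertain case: control is handed back to the human,
    # exactly the moment when the clinician may be least practised.
    return "low confidence: defer to the clinician"

print(decide(np.array([2.0, 0.0, 0.0])))  # clear-cut case
print(decide(np.array([0.0, 0.0, 0.0])))  # ambiguous case near the boundary
```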

How will clinicians maintain their clinical decision-making skills at the levels required to take over in rare events, when they are no longer involved in the day-to-day decision-making that hones that same skill?


18 March 2019 Update: The Digital Doctor: Will surgeons lose their skills in the age of automation? AI Med.

Questions for Artificial Intelligence in Health Care

Artificial intelligence (AI) is gaining high visibility in the realm of health care innovation. Broadly defined, AI is a field of computer science that aims to mimic human intelligence with computer systems. This mimicry is accomplished through iterative, complex pattern matching, generally at a speed and scale that exceed human capability. Proponents suggest, often enthusiastically, that AI will revolutionize health care for patients and populations. However, key questions must be answered to translate its promise into action.

Maddox, T. M., Rumsfeld, J. S., & Payne, P. R. (2018). Questions for Artificial Intelligence in Health Care. JAMA. Published online December 10, 2018. doi:10.1001/jama.2018.1893.

The questions and follow-up responses presented in the article are useful, highlighting the nuance that is often ignored in mainstream pieces that tend to focus on the extreme potential of the technology (i.e. what this might one day be like) rather than the more subtle implications that we need to consider today. The following text is verbatim from the article:

  1. What are the right tasks for AI in healthcare? AI is best used when the primary task is identifying clinically useful patterns in large, high-dimensional data sets.
  2. What are the right data for AI? AI is most likely to succeed when used with high-quality data sources on which to “learn” and classify data in relation to outcomes. However, most clinical data, whether from electronic health records (EHRs) or medical billing claims, remain ill-defined and largely insufficient for effective exploitation by AI techniques.
  3. What is the right evidence standard for AI? Innovations in medications and medical devices are required to undergo extensive evaluation, often including randomized clinical trials and postmarketing surveillance, to validate clinical effectiveness and safety. If AI is to directly influence and improve clinical care delivery, then an analogous evidence standard is needed to demonstrate improved outcomes and a lack of unintended consequences.
  4. What are the right approaches for integrating AI into clinical care? Even after the correct tasks, data, and evidence for AI are addressed, realization of its potential will not occur without effective integration into clinical care. To do so requires that clinicians develop a facility with interpreting and integrating AI-supported insights in their clinical care.

Split learning for health: Distributed deep learning without sharing raw patient data

Can health entities collaboratively train deep learning models without sharing sensitive raw data? This paper proposes several configurations of a distributed deep learning method called SplitNN to facilitate such collaborations. SplitNN does not share raw data or model details with collaborating institutions. The proposed configurations of splitNN cater to practical settings of i) entities holding different modalities of patient data, ii) centralized and local health entities collaborating on multiple tasks…

Source: [1812.00564] Split learning for health: Distributed deep learning without sharing raw patient data

The paper describes how the design and training of an algorithm can be shared across different organisations without any of them having access to the others’ raw data or model details.

This has important implications for the development of AI-based health applications, in that hospitals and other service providers need not share raw patient data with companies like Google/DeepMind. Health organisations could do the basic algorithm design in-house with the smaller, local data sets and then send the algorithm to organisations that have the massive data sets necessary for refining the algorithm, all without exposing the initial data and protecting patient privacy.
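For the technically inclined, here’s a minimal PyTorch sketch of the general split learning idea (this is not the paper’s code, and for simplicity a single optimiser updates both halves, where a real deployment would keep each party’s optimiser on its own side). The hospital holds the first layers of the network along with its raw data; only the intermediate “smashed” activations cross the boundary to the collaborating server, and gradients flow back over the same cut.

```python
# A toy split learning step: the raw data never leaves the "hospital" half.
# Layer sizes and data are arbitrary placeholders.
import torch
import torch.nn as nn

hospital_net = nn.Sequential(nn.Linear(20, 16), nn.ReLU())  # stays on-site
server_net = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))

params = list(hospital_net.parameters()) + list(server_net.parameters())
opt = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 20)         # raw patient data: never transmitted
y = torch.randint(0, 2, (32,))  # labels (label-sharing is one of several configurations)

opt.zero_grad()
smashed = hospital_net(x)   # only this intermediate activation is sent to the server
out = server_net(smashed)   # the server completes the forward pass
loss = loss_fn(out, y)
loss.backward()             # gradients flow back across the same cut
opt.step()
print("one split training step, loss:", float(loss))
```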