Comment: Training a single AI model can emit as much carbon as five cars in their lifetimes

The results underscore another growing problem in AI, too: the sheer intensity of resources now required to produce paper-worthy results has made it increasingly challenging for people working in academia to continue contributing to research. “This trend toward training huge models on tons of data is not feasible for academics…because we don’t have the computational resources. So there’s an issue of equitable access between researchers in academia versus researchers in industry.”

Hao, K. (2019). Training a single AI model can emit as much carbon as five cars in their lifetimes. MIT Technology Review.

The article focuses on the scale of the financial and environmental cost of training natural language processing (NLP) models, comparing the carbon emissions of various AI models to those of a car over its lifetime. To be honest, this isn’t something I’ve given much thought to, but seeing it visualised really drives the point home.

As much as this is a cause for concern, I’m less worried about it in the long term, for the following reason. As the authors in the article state, the code and models for AI and NLP are currently really inefficient; they don’t need to be neat, and compute is relatively easy to come by (if you’re Google or Facebook). I think that the models will get more efficient, as is evident from the fact that new computer vision algorithms can reach the same outcomes with datasets that are orders of magnitude smaller than was previously possible.

For me though, the quote that I’ve pulled from the article to start this post is more compelling. If the costs of training NLP models are so high, it seems likely that companies like Google, Facebook and Amazon will be the only ones who can do the high-end research necessary to drive the field forward. Academics at universities have an incentive to create more efficient models, which they publish, and which companies can then take advantage of while also having access to far greater computational resources.

From where I’m standing, this makes it seem that private companies will always be at the forefront of AI development, which makes me less optimistic than if it were driven by academics. Maybe I’m just being naive (and probably also biased), but this seems less than ideal.

You can find the full paper here on arXiv.

An introduction to artificial intelligence in clinical practice and education

Two weeks ago I presented some of my thoughts on the implications of AI and machine learning in clinical practice and health professions education at the 2018 SAAHE conference. Here are the slides I used (20 slides for 20 seconds each) with a very brief description of each slide. This presentation is based on a paper I submitted to OpenPhysio, called: “Artificial intelligence in clinical practice: Implications for physiotherapy education“.


The graph shows how traffic to a variety of news websites changed after Facebook made a change to their Newsfeed algorithm, highlighting the influence that algorithms have on the information presented to us, and how we no longer make real choices about what to read. When algorithms are responsible for filtering what we see, they have power over what we learn about the world.


The graph shows the near flat line of social development and population growth until the invention of the steam engine. Before that, all of the Big Ideas we came up with had relatively little impact on our physical well-being. If your grandfather spent his life pushing a plough, there was an excellent chance that you’d spend your life pushing one too. But once we figured out how to augment our physical abilities with machines, we saw significant advances in society and industry and an associated improvement in everyone’s quality of life.


The emergence of artificial intelligence in the form of narrowly constrained machine learning algorithms has demonstrated the potential for important advances in cognitive augmentation. Basically, we are starting to really figure out how to use computers to enhance our intelligence. However, we must remember that we’ve been augmenting our cognitive ability for a long time, from exporting our memories onto external devices, to performing advanced computation beyond the capacity of our brains.


The enthusiasm with which modern AI is being embraced is not new. The research and engineering aspects of artificial intelligence have been around since the 1950s, while fictional AI has an even longer history. The field has been through a series of highs and lows (the lows are known as AI winters). The developments during these cycles were fuelled by what has become known as Good Old-Fashioned AI: early attempts to explicitly design decision-making into algorithms by hard-coding all possible variations of the interactions in a closed environment. Understandably, these systems were brittle and unable to adapt to even small changes in context. This is one of the reasons that previous iterations of AI had little impact in the real world.


There are three main reasons why it’s different this time. The first is the emergence of cheap but powerful hardware (mainly central and graphics processing units), which has seen computational power growing by a factor of 10 every 4 years. The second is the exponential growth of data; massive datasets are an important reason that modern AI approaches have been so successful. The graph in the middle column shows data growth in zettabytes (10 to the power of 21 bytes). At this rate of growth we’ll run out of metric prefixes in a few years (yotta is the only prefix after zetta). The third is the emergence of vastly improved machine learning algorithms that are able to learn without being explicitly told what to learn. In the example here, the algorithm has coloured in the line drawings to create a pretty good photorealistic image, without being taught any of the concepts involved, i.e. human, face, colour, drawing, photo, etc.


We’re increasingly seeing evidence that in some very narrow domains of practice (e.g. reasoning and information recall), machine learning algorithms can outdiagnose experienced clinicians. It turns out that computers are really good at classifying patterns of variables that are present in very large datasets. And diagnosis is just a classification problem. For example, algorithms are very easily able to find sets of related signs and symptoms and put them into a box that we call “TB”. And increasingly, they are able to do this classification better than the best of us.
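To make the “diagnosis is just a classification problem” point concrete, here is a minimal sketch (not from the presentation) using scikit-learn. The signs, symptoms, labels and data are entirely made up for illustration and have no clinical validity; the point is only that a classifier learns to put patterns of variables “into a box”.

```python
# Toy illustration of diagnosis framed as classification: each row is a
# patient described by binary signs/symptoms, the label is an invented
# diagnosis. Data and feature names are fabricated for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical features: [persistent_cough, night_sweats, weight_loss, fever]
X_train = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
# Invented labels: "TB" vs "other"
y_train = ["TB", "other", "other", "TB", "other", "other"]

model = DecisionTreeClassifier().fit(X_train, y_train)

# A new (made-up) presentation gets classified into one of the "boxes"
new_patient = [[1, 1, 1, 0]]
print(model.predict(new_patient))        # e.g. ['TB']
print(model.predict_proba(new_patient))  # class probabilities
```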


It is estimated that up to 60% of a doctor’s time is spent capturing information in the medical record. Natural language processing algorithms are able to “listen” to the ambient conversation between a doctor and patient, record the audio and transcribe it (translating it in the process if necessary). The system then performs semantic analysis of the text (not just keyword analysis) to extract meaningful information, which it can use to populate an electronic health record. While the technology is at a very early stage and not yet safe for real-world application, it’s important to remember that this is the worst it’s ever going to be. Even if we reach some kind of technological dead end with respect to machine learning and from now on we only increase efficiency, we are still looking at a transformational technology.
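As a rough outline of that pipeline (and nothing more), the sketch below strings the steps together in Python. The functions transcribe_audio and extract_clinical_entities are hypothetical placeholders for a speech-to-text service and a clinical NLP model, which is where all the real difficulty lives.

```python
# Outline of the ambient-documentation pipeline described above.
# transcribe_audio() and extract_clinical_entities() are hypothetical
# placeholders; real systems are far more involved than this sketch.

def transcribe_audio(audio_path: str) -> str:
    """Placeholder: return the transcribed doctor-patient conversation."""
    raise NotImplementedError("plug in a speech-to-text service here")

def extract_clinical_entities(transcript: str) -> dict:
    """Placeholder: semantic analysis that pulls out structured findings."""
    raise NotImplementedError("plug in a clinical NLP model here")

def update_ehr(record: dict, findings: dict) -> dict:
    """Merge extracted findings into a (toy) electronic health record."""
    record.setdefault("encounters", []).append(findings)
    return record

def document_consultation(audio_path: str, record: dict) -> dict:
    transcript = transcribe_audio(audio_path)         # "listen" and transcribe
    findings = extract_clinical_entities(transcript)  # semantic analysis
    return update_ehr(record, findings)               # populate the EHR
```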


An algorithm recently passed the Chinese national medical exam, qualifying (in theory) as a physician. While we can argue that practising as a physician is more than writing a text-based exam, it’s hard not to acknowledge the fact that – at the very least – machines are becoming more capable in the domains of knowledge and reasoning that characterise much of clinical practice. Again, this is the worst that this technology is ever going to be.


This graph shows the number of AI applications under development in a variety of disciplines, including medicine (on the far right). The green segment shows the number of applications where AI is outperforming human beings. Orange segments show the number of applications that are performing relatively well, with blue highlighting areas that need work. There are two other points worth noting: medical AI is the area of research that is clearly showing the most significant advances (maybe because it’s the area where companies can make the most money); and all the way at the far left of the graph is education, suggesting that it may be some time before algorithms show the same progress in teaching.


Contrary to what we see in the mainstream media, AI is not a monolithic field of research; it consists of a wide variety of different technologies and philosophies that are each sometimes referred to under the more general heading of “AI”. While much of the current progress is driven by machine learning algorithms (which is itself driven by the three characteristics of modern society highlighted earlier), there are many areas of development, each of which can potentially contribute to different areas of clinical practice. For the purposes of this presentation, we can define AI as any process that is able to independently achieve an objective within a narrowly constrained domain of interest (although the constraints are becoming looser by the day).


Machine learning is a sub-domain of AI research that works by exposing an algorithm to a massive dataset and asking it to look for patterns. By comparing what it finds to human-tagged patterns in the data, developers can fine-tune the algorithm (i.e. “teach” it) before exposing it to untagged data and seeing how well it performs relative to the training set. This broadly describes the “learning” process of machine learning. Deep learning is a sub-domain of machine learning that works by passing data through many layers, allocating different weights to the data at each layer, thereby coming up with a statistical “answer” that expresses an outcome in terms of probability. Deep learning neural networks underlie many of the advances in modern AI research.
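Here is a minimal sketch of that train-then-test loop, assuming scikit-learn and synthetic data standing in for a human-tagged dataset. It is illustrative only, not how any of the systems mentioned in this presentation were actually built.

```python
# Minimal sketch of the "learning" process described above, using a small
# neural network classifier and synthetic, pre-labelled data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for a human-"tagged" training set
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small multi-layer network: data passes through hidden layers whose
# weights are adjusted during training.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)  # the "learning" step

# On unseen data, the network expresses its answer as a probability
print(clf.predict_proba(X_test[:3]))
print("accuracy on held-out data:", clf.score(X_test, y_test))
```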


Because machine and deep learning algorithms are trained on (biased) human-generated datasets, it’s easy to see how the algorithms themselves will have an inherent bias embedded in the outputs. The Twitter screenshot shows one of the least offensive tweets from Tay, an AI-enabled chatbot created by Microsoft, which learned from human interactions on Twitter. In the space of a few hours, Tay became a racist, sexist, homophobic monster – because this is what it learned from how we behave on Twitter. This is more of an indictment of human beings than it is of the algorithm. The other concern with neural networks is that, because of the complexity of the algorithms and the number of variables being processed, human beings are unable to comprehend how the output was computed. This has important implications when algorithms are helping with clinical decision-making and is the reason that resources are being allocated to the development of what is known as “explainable AI”.


As a result of the changes emerging from AI-based technologies in clinical practice, we will soon need to stop thinking of our roles in terms of “professions” and rather in terms of “tasks”. This matters because, increasingly, many of the tasks we associate with our professional roles will be automated. This is not all bad news though, because it seems probable that increased automation of the repetitive tasks in our repertoire will free us up to take on more meaningful work, for example, having more time to interact with patients. We need to start asking which tasks computers are better at and allocating those tasks to them. Of course, we will need to define what we mean by “better”: more efficient, more cost-effective, faster, etc.


Another important change that will require the use of AI-based technologies in clinical practice will be the inability of clinicians to manage – let alone understand – the vast amount of information being generated by, and from, patients. Not only are all institutional tests and scans digital but increasingly, patients are creating their own data via wearables – and soon, ingestibles – all of which will require that clinicians are able to collect, filter, analyse and interpret these vast streams of information. There is evidence that, without the help of AI-based systems, clinicians simply will not have the cognitive capacity to understand their patients’ data.


The impact of more patient-generated health data is that we will see patients being in control of their data, which will exist on a variety of platforms (cloud storage, personal devices, etc.), none of which will be available to the clinician by default. This means that power will move to the patient as they make choices about who to allow access to their data in order to help them understand it. Clinicians will need to come to terms with the fact that they will no longer wield the power in the relationship and in fact, may need to work within newly constituted care teams that include data scientists, software engineers, UI designers and smart machines. And all of these interactions will be managed by the patient who will likely be making choices with inputs from algorithms.


The incentives for enthusiastic claims around developments in AI-based clinical systems are significant; this is an academic land grab the likes of which we have only rarely experienced. The scale of the funding involved puts pressure on researchers to exaggerate claims in order to be the first to every important milestone. This means that clinicians will need to become conversant with the research methods and philosophies of the data scientists who are publishing the most cutting-edge research in the medical field. The time will soon come when it will be difficult to understand the language of healthcare without first understanding the language of computer science.


The implications for health professions educators are profound, as we will need to start asking ourselves what we are preparing our graduates for. When clinical practice is enacted in an intelligent environment and clinicians are only one of many nodes in vast information networks, what knowledge and skills do they need to thrive? When machines outperform human beings in knowledge and reasoning tasks, what is the value of teaching students about disease progression, for example? We may find ourselves graduating clinicians who are well-trained, competent and irrelevant. It is not unreasonable to think that the profession called “doctor” will not exist in 25 years’ time, having been superseded by a collective of algorithms and third-party service providers who offer more fine-grained services at a lower cost.


There are three new literacies that health professions educators will need to begin integrating into our undergraduate curricula. Data literacy, so that healthcare graduates will understand how to manage, filter, analyse and interpret massive sets of information in real-time; Technological literacy, as more and more of healthcare is enacted in digital spaces and mediated by digital devices and systems; and Human literacy, so that we can become better at developing the skillsets necessary to interact more meaningfully with patients.


There is evidence to suggest that, while AI-based systems outperform human beings on many of the knowledge and reasoning tasks that make up clinical practice, the combination of AI and human originality results in the most improved outcomes of all. In other words, we may find that patient outcomes are best when we figure out how to combine human creativity and emotional response with machine-based computation.


And just when we’re thinking that “creativity” and “originality” are the sole province of human beings, we’re reminded that AI-based systems are making progress in those areas as well. It may be that the only way to remain relevant in a constantly changing world is to develop our ability to keep learning.

Twitter Weekly Updates for 2012-04-16

Posted to Diigo 04/05/2012

  • Students working in small groups interact in a variety of ways and the teacher has an important role to play
  • Barriers, more often perceived than real, may impede the adoption of small group teaching
  • Small group learning is the learning that takes place when students work together usually in groups of 10 or less
  • What matters is that the group shows three characteristics.
  • Active participation: A key feature of small group work is that interaction should take place among all present.
  • A specific task: There should be a clearly defined task and objectives, and they should be understood by all members of the group.
  • Reflection: In small group learning, it is important to learn from an experience and to modify behaviour accordingly. Deep learning is a key feature of small group work: reflection is a key feature of deep learning.
  • It provides students with experience of working in a group and helps them to acquire group skills. These include the ability to communicate effectively, the prioritising of tasks, the management of time and the exercise of interpersonal skills.
  • The introduction of small group work into a curriculum is frequently resisted. Various arguments are used
  • “Students do not like small group work”: Initial dissatisfaction with small group work may be expected and is natural. Barriers may be:
  • Perceptual
  • Cultural
  • Emotional
  • Intellectual
  • Environmental
  • “Staff do not know how to teach in small groups”: Teachers may lack the skills necessary for running small group sessions. This may be seen as more of a problem by course or curriculum organisers than by teachers themselves. A staff development programme, however, is important.
  • “We do not have enough teachers for small group work”: Staff shortage may be a real issue – or may be a misconception. The small group method adopted and the timetabling both have a significant impact. Identify all the staff who could be available for small group teaching.
  • “There are too few rooms”: Space is frequently a contentious issue. Creativity is usually the best solution. Do not be afraid to experiment – students are resilient.
  • “It is a waste of time – students do not learn anything”: It may take longer to cover a topic in small group work than in lectures. However, what really matters is if, and what, students learn and not just what is taught.
  • The following checklists have been generated for running small group work.
  • Consider the objectives of the session: Consider carefully the objectives of the session or course you are running and whether other teaching methods, eg lectures, independent learning, may be more appropriate
  • Determine your available physical resources: Accommodation availability may be a limiting factor. Small groups require suitable accommodation, which allows chairs to be set out in a circle to maximise interaction among the students, and space for appropriate audio-visual materials.
  • Determine the manpower availability: Small group teaching requires the participation of a larger number of teaching staff. Consider carefully how many teachers are available and their expertise as small group facilitators.
  • Does a facilitator have to be an expert? For small groups to function most effectively, the facilitator should have expertise in the content area and in small group facilitation.
  • Can different facilitators be used with the one group? Continuity of the facilitator is desirable. The facilitator can be considered as a member of the group and changing the composition of the group may disrupt the group dynamics.
  • Can a teacher facilitate several groups? A floating facilitator can be assigned responsibility for maintaining the task and function of more than one group. Success varies, frequently depending on the dynamics of the groups and the skill of the facilitator.
  • Can a student facilitate his or her own group? A student can successfully facilitate a peer group. This depends on the group’s dynamics, on the student’s ability and on the briefing given to the student.
  • Determine the group membership: Groups may be self selected, strategically determined, randomised, or selected alphabetically. It may be appropriate to change the membership of a group after a designated period of time, eg semester or year. Groups will work less productively if constantly changed, as they are less likely to reach a productive stage.
  • Ensure that the staff is prepared for the session: Staff must be given detailed briefings for the small group work. They should have prior opportunities to see small group work in action and to attend staff development sessions.
  • Select the most appropriate small group method: The educational objectives should determine the choice of small group method.
  • Develop stimulus material: Stimulus material might include problem-based scenarios, video clips, a set of questions, key articles for exploration, a real or simulated patient. The stimulus material required will be determined by the small group method adopted.
  • Inform students about the role of the small group work: Students should be informed why small group work is being adopted and how it relates to other activities and to the learning outcomes for the course.
  • During the Small Group Activity
  • Allow adequate introductions – use ice-breakers if necessary
  • Ensure that the students understand what to do, why they are doing it and how they should achieve it
  • Facilitate learning: The facilitator’s objective is to help the student become more self-reliant and independent by establishing a climate that is open, trustful and supportive. An educational facilitator’s role has two distinct areas: maintaining the functioning of the group and ensuring the task is completed. Facilitators must understand the changing dynamics of a group through four stages: forming, storming, norming, and performing
  • Debrief the group on the activity: Debriefing summarises or clarifies what has been learnt and may take as long as the activity itself. During debriefing, constructive feedback may be given
  • After the Small Group Activity
  • Evaluate the success of the session: There are two aspects – achievement and quality. Have the objectives been achieved? Was the educational experience of a high standard? Students may be asked to complete an evaluation questionnaire. Alternatively, the session can be peer reviewed. Teachers should consciously self-evaluate the session.
  • Reflect on the experience: Evaluation, formal or informal, is pointless if no change in practice results.
  • Small group methods have a valuable role to play in undergraduate medical curricula. The student-centred focus and active participation enhance the likelihood of deep rather than surface learning. Decisions to be made are how much time to schedule for small group work, and what kind of small group work to adopt. If small groups are used, they must be valued and must not be seen as isolated from other aspects of the curriculum – or from the culture of the medical school.
  • Staff development is important. Success of small group learning depends on good planning, effective facilitation and a movement away from teacher-centred learning. For a real and sustained shift in medical education, students must be encouraged to learn rather than merely catch the output of teachers.

Technology in the classroom: can we make it work?

I’ve been trying to think about how to use technology to enhance both my teaching and my students’ learning, and it’s proving more difficult than I’d initially thought. I’d like to think that laptops and internet access in every classroom give students real-time access to related content while they engage in meaningful discussion, but this will never happen. Their Facebook profiles and IM conversations are far more interesting than the “Pathology of stroke” or “Justice in access to healthcare”. And that makes sense in a bizarre kind of way. Even while they (or their parents) pay vast sums in tuition fees for the privilege of attending university, most students (in my very limited experience) see studying as inherently boring.

Some studies in American classrooms have all but proven that the distraction of the Internet in class is too strong for students to resist, and that most of the lesson is spent checking email, catching up with friends and even shopping. Now, after that initial foray into “embracing” technology, it seems as if there’s a move towards banning laptops altogether.

This is the kind of about-turn I’d like to avoid. E-learning, which I have no doubt will be a revolution in education, is not the idea that technology for its own sake is the way forward. Just because it’s possible to have Internet access in class, does that mean we should? Rather, teachers must take an approach whereby technology is used in a way that enhances its advantages while minimising its disadvantages. Just because I put the course reader online doesn’t make it “e-learning”, and neither does having a student blog. The technology in itself doesn’t enhance learning in any way, but how you use it can have powerful implications.

I’ve been toying with the idea of using a wiki to manage a course, whereby any change to the course content, test schedule or mark availability can be syndicated through RSS to all the students in the class. Students will have to, as a course requirement, both add to and edit course content (obviously moderated), which can also then be tracked. I think that this may be one way to encourage them to actively engage with the content, as well as introduce concepts like peer review, referencing and drafting, which may also improve their reading and writing skills (another huge problem). The point, though, will be to make the learning outcomes apparent from the beginning, so that students know what’s expected of them. Merely creating a wiki and telling students to “Go forth and create content” isn’t enough.
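As a very rough sketch of the syndication part of that idea (using only Python’s standard library; the feed fields, page names and URLs are invented), each moderated change to a course page could be turned into an RSS item that students subscribe to:

```python
# Hypothetical sketch: turn a list of course-page changes into an RSS feed.
# All names, URLs and fields are invented for illustration.
from xml.etree import ElementTree as ET
from email.utils import formatdate

def build_feed(changes):
    """changes: list of dicts with 'title', 'link' and 'summary' keys."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Course updates"
    ET.SubElement(channel, "link").text = "https://example.edu/course/wiki"
    ET.SubElement(channel, "description").text = "Content, tests and marks"

    for change in changes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = change["title"]
        ET.SubElement(item, "link").text = change["link"]
        ET.SubElement(item, "description").text = change["summary"]
        ET.SubElement(item, "pubDate").text = formatdate()

    return ET.tostring(rss, encoding="unicode")

# Example: a student edit (once moderated) triggers a new feed item
print(build_feed([{
    "title": "Pathology of stroke: page updated",
    "link": "https://example.edu/course/wiki/stroke",
    "summary": "Section on ischaemic stroke expanded by a student editor.",
}]))
```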

I think that technology will fundamentally change the way we teach and how students learn, but not just by throwing technology at the problem.  The trick is to figure out how to use technology to facilitate deep learning by getting students to actively engage with the content.  A bad teacher will continue to teach badly, no matter how much “technology” they use.

Link to the article that inspired this post:
http://www.britannica.com/blogs/2008/10/why-i-ban-laptops-in-my-classroom/