Comment: Training a single AI model can emit as much carbon as five cars in their lifetimes

The results underscore another growing problem in AI, too: the sheer intensity of resources now required to produce paper-worthy results has made it increasingly challenging for people working in academia to continue contributing to research. “This trend toward training huge models on tons of data is not feasible for academics…because we don’t have the computational resources. So there’s an issue of equitable access between researchers in academia versus researchers in industry.”

Hao, K. (2019). Training a single AI model can emit as much carbon as five cars in their lifetimes. MIT Technology Review.

The article focuses on the scale of the financial and environmental costs of training natural language processing (NLP) models, comparing the carbon emissions of various AI models to those of a car over its lifetime. To be honest, this isn’t something I’ve given much thought to, but seeing it presented visually really drives the point home.

As much as this is a cause for concern, I’m less worried about it in the long term, for the following reason. As the authors quoted in the article state, the code and models for AI and NLP are currently really inefficient; they don’t need to be, because compute is relatively easy to come by (if you’re Google or Facebook). I think that the models will get more efficient, as is evident from the fact that new computer vision algorithms can reach the same outcomes with datasets that are orders of magnitude smaller than was previously possible.

For me though, the quote that I’ve pulled from the article to start this post is more compelling. If the costs of modelling NLP are so high, it seems likely that companies like Google, Facebook and Amazon will be the only ones who can do the high-end research necessary to drive the field forward. Academics at universities have an incentive to create more efficient models, which they publish, and which companies – who already have access to much more computational resources – can then take advantage of.

From where I’m standing, this makes it seem that private companies will always be at the forefront of AI development, which makes me less optimistic than if it were driven by academics. Maybe I’m just being naive (and probably also biased), but this seems less than ideal.

You can find the full paper here on arXiv.

10 recommendations for the ethical use of AI

In February the New York Times hosted the New Work Summit, a conference that explored the opportunities and risks associated with the emergence of artificial intelligence across all aspects of society. Attendees worked in groups to compile a list of recommendations for building and deploying ethical artificial intelligence, the results of which are listed below.

  1. Transparency: Companies should be transparent about the design, intention and use of their A.I. technology.
  2. Disclosure: Companies should clearly disclose to users what data is being collected and how it is being used.
  3. Privacy: Users should be able to easily opt out of data collection.
  4. Diversity: A.I. technology should be developed by inherently diverse teams.
  5. Bias: Companies should strive to avoid bias in A.I. by drawing on diverse data sets.
  6. Trust: Organizations should have internal processes to self-regulate the misuse of A.I. Have a chief ethics officer, ethics board, etc.
  7. Accountability: There should be a common set of standards by which companies are held accountable for the use and impact of their A.I. technology.
  8. Collective governance: Companies should work together to self-regulate the industry.
  9. Regulation: Companies should work with regulators to develop appropriate laws to govern the use of A.I.
  10. “Complementarity”: Treat A.I. as a tool for humans to use, not a replacement for human work.

The list of recommendations seems reasonable enough on the surface, although I wonder how practical they are given the business models of the companies most active in developing AI-based systems. As long as Google, Microsoft, Facebook, etc. are generating the bulk of their revenue from advertising that’s powered by the data we give them, they have little incentive to be transparent, to disclose, to be regulated, etc. If we opt our data out of the AI training pool, the AI is more susceptible to bias and less useful/accurate, so having more data is usually better for algorithm development. And having internal processes to build trust? That seems odd.

However, even though it’s easy to find issues with all of these recommendations, that doesn’t mean they’re not useful. The more of these kinds of conversations we have, the more likely it is that we’ll figure out a way to have AI that positively influences society.

Comment: Why AI is a threat to democracy—and what we can do to stop it

The developmental track of AI is a problem, and every one of us has a stake. You, me, my dad, my next-door neighbor, the guy at the Starbucks that I’m walking past right now. So what should everyday people do? Be more aware of who’s using your data and how. Take a few minutes to read work written by smart people and spend a couple minutes to figure out what it is we’re really talking about. Before you sign your life away and start sharing photos of your children, do that in an informed manner. If you’re okay with what it implies and what it could mean later on, fine, but at least have that knowledge first.

Hao, K. (2019). Why AI is a threat to democracy—and what we can do to stop it. MIT Technology Review.

I agree that we all have a stake in the outcomes of the introduction of AI-based systems, which means that we all have a responsibility to help shape them. While most of us can’t be involved in writing code for these systems, we can all be more intentional about what data we provide to companies working on artificial intelligence and how they use that data (on a related note, have you ever wondered just how much data is being collected by Google, for example?). Here are some of the choices I’ve made about the software that I use most frequently:

  • Mobile operating system: I run LineageOS on my phone and tablet, which is based on Android but is modified so that the data on the phone stays on the phone, i.e. it is not reported back to Google.
  • Desktop/laptop operating system: I’ve used various Ubuntu Linux distributions since 2004, not only because Linux really is a better OS (faster, cheaper, more secure, etc.) but also because open-source software is more trustworthy.
  • Browser: I switched from Chrome to Firefox with the release of Quantum, which saw Firefox catch up in performance metrics. With privacy as the default design consideration, it was an easy move to make. You should just switch to Firefox.
  • Email: I’ve looked around – a lot – and can’t find an email provider to replace Gmail. I use various front-ends to manage my email on different devices but that doesn’t get me away from the fact that Google still processes all of my emails on the back-end. I could pay for my email service provider – and there do seem to be good options – but then I’d be paying for email.
  • Search engine: I moved from Google Search to DuckDuckGo about a year ago and can’t say that I miss Google Search all that much. Every now and again I do find that I have to go to Google, especially for images.
  • Photo storage: Again, I’ve looked around for alternatives, but the combination of the free service, convenience (automatic upload of photos taken on my phone), unlimited storage (for lower-res copies) and the image recognition features built into Google Photos makes this very difficult to move away from.
  • To do list: I’ve used Todoist and Any.do on and off for years but eventually moved to Todo.txt because I wanted to have more control over the things that I use on a daily basis. I like the fact that my work is stored in a plain text file and will be backwards compatible forever (see the short sketch after this list).
  • Note taking: I use a combination of Simplenote and QOwnNotes for my notes. Simplenote is the equivalent of sticky notes (short-term notes that I make on my phone and delete after acting on them), while QOwnNotes is for long-form note-taking and writing, and stores notes as text files. Again, I want to control my data, and these apps give me that control along with all of the features that I care about.
  • Maps: Google Maps is without equal and is so far ahead of anyone else that it’s very difficult to move away from. However, I’ve also used Here We Go on and off and it’s not bad for simple directions.
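
One of the reasons the plain-text approach feels so future-proof is that the format needs nothing more than a text editor to read and a few lines of any scripting language to process. Here’s a minimal sketch in Python – the sample tasks and the parsing logic below are my own illustration of the todo.txt conventions (priority in parentheses, “x” for completed items, +project and @context tags), not code from the Todo.txt project itself:

```python
import re

# A made-up todo.txt file: one task per line, everything human-readable.
sample = """\
(A) Mark student portfolios +ethics @work due:2019-03-15
x 2019-03-01 Submit conference slides +ReimagineEducation
Read the paper on the carbon cost of NLP models @reading
"""

def parse_task(line: str) -> dict:
    """Pull the basic fields out of a single todo.txt-style line."""
    done = line.startswith("x ")                      # completed tasks start with "x "
    priority_match = re.match(r"\(([A-Z])\) ", line)  # e.g. "(A) " at the start
    return {
        "done": done,
        "priority": priority_match.group(1) if priority_match else None,
        "projects": re.findall(r"\+(\w+)", line),     # +project tags
        "contexts": re.findall(r"@(\w+)", line),      # @context tags
        "raw": line,
    }

for raw_line in sample.splitlines():
    print(parse_task(raw_line))
```

Because the data is just text, it will open in anything, sync with anything, and still be readable in twenty years.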

From the list above you can see that I pay attention to how my data is stored, shared and used, and that privacy is important to me. I’m not unsophisticated in my use of technology, and I still can’t get away from Google for email, photos and maps, arguably the most important data-gathering services that the company provides. Maybe there’s something that I’m missing, but companies like Google, Facebook, Amazon and Microsoft are so entangled in everything that we care about that I really don’t see a way to avoid using their products. The suggestion that users should be more careful about what data they share, and who they share it with, is a useful thought experiment, but the practical reality is that it would be very difficult indeed to avoid these companies altogether.

Google isn’t the only problem. See what Facebook knows about you.

Comment: Facebook says it’s going to make it harder to access anti-vax misinformation

Facebook won’t go as far as banning pages that spread anti-vaccine messages…[but] would make them harder to find. It will do this by reducing their ranking and not including them as recommendations or predictions in search.

Firth, N. (2019). Facebook says it’s going to make it harder to access anti-vax misinformation. MIT Technology Review.

Of course this is a good thing, right? Facebook – already one of the most important ways that people get their information – is going to make it more difficult for readers to find information that opposes vaccination. With the recent outbreak of measles in the United States we need to do more to ensure that searches for “vaccination” don’t also surface results encouraging parents not to vaccinate their children.

But what happens when Facebook (or Google, or Microsoft, or Amazon) starts making broader decisions about what information is credible, accurate or fake? That would actually be great, if we could trust their algorithms. But trust requires that we’re allowed to see the algorithm (and also that we can understand it, which in most cases we can’t). In this case, it’s a public health issue and most reasonable people would see that the decision is the “right” one. But when companies tweak their algorithms to privilege certain types of information over other types of information, then I think we need to be concerned. Today we agree with Facebook’s decision, but how confident can we be that we’ll still agree tomorrow?

Also, vaccines are awesome.

First compute no harm

Is it acceptable for algorithms today, or an AGI in a decade’s time, to suggest withdrawal of aggressive care and so hasten death? Or alternatively, should it recommend persistence with futile care? The notion of “doing no harm” is stretched further when an AI must choose between patient and societal benefit. We thus need to develop broad principles to govern the design, creation, and use of AI in healthcare. These principles should encompass the three domains of technology, its users, and the way in which both interact in the (socio-technical) health system.

Source: Coiera, E., et al. (2017). First compute no harm. BMJ Opinion.

The article goes on to list some of the guiding principles for the development of AI in healthcare, including the following:

  • AI must be designed and built to meet safety standards that ensure it is fit for purpose and operates as intended.
  • AI must be designed for the needs of those who will work with it, and fit their workflows.
  • Humans must have the right to challenge an AI’s decision if they believe it to be in error.
  • Humans should not direct AIs to perform beyond the bounds of their design or delegated authority.
  • Humans should recognize that their own performance is altered when working with AI.
  • If humans are responsible for an outcome, they should be obliged to remain vigilant, even after they have delegated tasks to an AI.

The principles listed above are only a very short summary. If you’re interested in the topic of ethical decision making in clinical practice, you should read the whole thing.

My presentation for the Reimagine Education conference

Here is a summarised version of the presentation I’m giving later this morning at the Reimagine Education conference. You can download the slides here.

How to ensure safety for medical artificial intelligence

When we think of AI, we are naturally drawn to its power to transform diagnosis and treatment planning and weigh up its potential by comparing AI capabilities to those of humans. We have yet, however, to look at AI seriously through the lens of patient safety. What new risks do these technologies bring to patients, alongside their obvious potential for benefit? Further, how do we mitigate these risks once we identify them, so we can all have confidence the AI is helping and not hindering patient care?

Source: Coiera, E. (2018). How to ensure safety for medical artificial intelligence.

Enrico Coiera covers a lot of ground (albeit briefly) in this short post:

  • The prevalence of medical error as a cause of patient harm
  • The challenges and ethical concerns that are inherent in AI-based decision-making around end-of-life care
  • The importance of high-quality training data for machine learning algorithms
  • Related to this, the challenge of poor (human) practice being encoded into algorithms and so perpetuated
  • The risk of becoming overly reliant on AI-based decisions
  • Limited transferability when technological solutions are implemented in different contexts
  • The importance of starting with patient safety in algorithm design, rather than adding it later

If you use each of the points in the summary above as a starting point, there’s enough of a foundation in this article to really get to grips with some of the most interesting and challenging areas of machine learning in clinical practice. It might even be a useful guide to building an outline for a pretty comprehensive research project.

For more thoughts on developing a research agenda in related topics, see: AMA passes first policy guidelines on augmented intelligence.

Note: you should check out Enrico’s Twitter feed, which is a goldmine for cool (but appropriately restrained) ideas around machine learning in clinical practice.

Another Terrible Idea from Turnitin | Just Visiting

Allowing the proliferation of algorithmic surveillance as a substitution for human engagement and judgment helps pave the road to an ugly future where students spend more time interacting with algorithms than instructors or each other. This is not a sound way to help writers develop robust and flexible writing practices.

Source: Another Terrible Idea from Turnitin | Just Visiting

First of all, I don’t use Turnitin and I don’t see any good reason for doing so. Combating the “cheating economy” doesn’t depend on us catching the students; it depends on creating the conditions in which students believe that cheating offers little real value relative to the pedagogical goals they are striving for. In general, I agree with a lot of what the author is saying.

So, with that caveat out of the way, I wanted to comment on a few other claims in the article that I think rest on significant assumptions and limit the utility of the piece, especially with respect to how algorithms (and software agents in particular) may be useful in the context of education.

  • The use of the word “surveillance” in the quote above establishes the context for the rest of the paragraph. If the author had used “guidance” instead, the tone would be different. The same goes for “ugly”; remove that word and the meaning of the sentence is very different. This makes it clear that the author has an agenda, which clouds some of the other arguments about the use of algorithms in education.
  • For example, the claim that it’s a bad thing for students to interact with an algorithm instead of another person is empirical; it can be tested. But it’s presented here in a way that implies that human interaction is simply better. Case closed. But what if we learned that algorithmic guidance (via AI-based agents/tutors) actually led to better student outcomes than learning with/from other people? Would we insist on human interaction because it would make us feel better? Why not test our claims by doing the research before making judgements?
  • The author uses a moral argument (at least, this was my take based on the language used) to position AI-based systems (specifically, algorithms) as being inherently immoral with respect to student learning. There’s a confusion between the corporate responsibility of a private company – like Turnitin – to make a profit, and the (possibly pedagogically sound) use of software agents to enhance some aspects of student learning.

Again, there’s some good advice around developing assignments and classroom conditions that make it less likely that students will want to cheat. This is undoubtedly a Good Thing. However, some of the claims about the utility of software agents are based on assumptions that aren’t necessarily supported by the evidence.

Critical digital pedagogy in the classroom: Practical implementation

Update (12-02-18): You can now download the full chapter here (A critical pedagogy for online learning in physiotherapy education) and the edited collection here.

This post is inspired by the work I’ve recently done for a book chapter, as well as several articles on Hybrid Pedagogy – in particular, Adam Heidebrink-Bruno’s Syllabus as Manifesto. I’ve been wanting to make some changes to my Professional Ethics module for a while, and the past few weeks have really given me a lot to think about. Critical pedagogy is an approach to teaching and learning that not only puts the student at the centre of the classroom but then helps them to figure out what to do now that they’re there. It also pushes teachers to go beyond the default configurations of classroom spaces. Critical digital pedagogy is when we use technology to do things that are difficult or impossible in those spaces without it.

One of the first things we do in each module we teach is provide students with a course overview, or syllabus. We don’t even think about it, but this document might be the first bit of insight our students get into how we define the space we’re going to occupy with them. How much thought do we really give to the language and structure of the document? How much of it is informed by the students’ voice? I wondered what my own syllabus would look like if I took to heart Jesse Stommel’s suggestion that we “begin by trusting students”.

I wanted to find out more about where my students come from, so I created a shared Google Doc with a very basic outline of what information needed to be included in a syllabus. I asked them to begin by anonymously sharing something about themselves that they hadn’t shared with anyone else in the class before – something that influenced who they are and how they came to be in that class. I took what they shared, edited it and created the Preamble to our course outline, describing our group and our context. I also added my own background to the document, sharing my values and beliefs, as well as positioning myself and my biases up front. I wanted to let them know that, as I ask them to share something of themselves, I will do the same.

The next thing was the learning outcomes for the module. We say that we want our students to take responsibility for their learning, but we set up the entire programme without any input from them. We decide what they will learn, based on the outcomes we define, as well as how it will be assessed. So for this syllabus I included the outcomes that we have to have, and then asked each student to define what “success” looks like in this module for them. Each student described what they wanted to achieve by the end of the year, wrote it as a learning outcome, decided on the indicators of progress they needed, and then set timelines for completion. So each of them would have the learning outcomes that the institution and professional body require, plus one. I think that this goes some way toward acknowledging the unique context of each student, and also gives them skills in evaluating their own development towards personally meaningful goals that they set themselves.

I’ve also decided that the students will decide their own marks for these personal outcomes. At the end of the year they will evaluate their progress against the performance indicators that they have defined, and give themselves a grade that will count 10% towards their Continuous Assessment mark. This decision was inspired by this post on contract grading from HASTAC. What I’m doing isn’t exactly the same thing, but it’s a similar concept in that students not only define what is important to them, but decide on the grade they earn. I’m not 100% sure how this will work in practice, but I’m leaning towards a shared document in which students do peer review of each other’s outcomes and progress. I’m interested to see what a student-led, student-graded, student-taught learning outcome looks like.

Something that is usually pretty concrete in any course is the content. But many concepts can actually be taught in a wide variety of ways, and we just choose the ones that we’re most familiar with. For example, the concept of justice (fairness) could be discussed using a history of the profession, resource allocation for patients, Apartheid in South Africa, public and private health systems, and so on. In the same shared document I asked students to suggest topics they’d like to cover in the module. I asked them to suggest the things that interest them, and I’d figure out how to teach concepts from professional ethics in those contexts. This is what they added: Income inequality. Segregation. #FeesMustFall. Can ethics be taught? The death penalty. Institutional racism. Losing a patient. That’s a pretty good range of topics that will enable me to cover quite a bit of the work in the module. It’s also more likely that students will engage, considering that these are the topics they’ve identified as being interesting.

Another area that we as teachers control completely is assessment. We decide what will be assessed, when the assessment happens, how it is graded, what formats we’ll accept…we even go so far as to tell students where to put the full stops and commas in their reference lists. That’s a pretty deep level of control we’re exerting. I’ve been using a portfolio for assessment in this module for a few years, so I’m at a point where I’m comfortable with students submitting a variety of different pieces. What I’m doing differently this year is asking the students to submit each task when it’s ready, rather than by some arbitrary deadline. They get to choose when it suits them to do the work, but I have asked them to be reasonable about this, mainly because if I’m going to give them decent feedback I need time before their next piece arrives. If the pieces are all submitted at once, there’s no time to use the feedback to improve the next submission.

The students then decided what our “rules of engagement” would be in the classroom. Our module guides usually have some kind of prescription about what behaviour is expected, so I asked the students what they thought appropriate behaviour looks like, and then to commit as a class to those rules. Unsurprisingly, their suggestions looked a lot like what I would have written myself. Then I asked them to decide how to address situations when individuals contravene our rules. I don’t want to be the policeman who has to discipline students…what would it look like if students decided in advance what would work in their classroom, and then took action when necessary? I’m pretty excited to find out.

I decided that there would be no notes provided for this module, and no textbook either. I prepare the lecture outline in a shared Google document, including whatever writing assignments the students need to work on and links to open-access resources that are relevant to the topic. The students take notes collaboratively in the document, which I review afterwards. I add comments and structure to their notes, and point them to additional resources. Together, we come up with something unique that describes our time in class. Even if the topic is static, our conversations never are, so any set of notes that focuses only on the topic is necessarily going to leave out the sometimes wonderful discussion that happens in class. This way, the students get the main ideas that are covered, but we also capture the conversation, which I can supplement afterwards.

Finally, I’ve set up a module evaluation form that is open for comment immediately, and I’ve committed to keeping it open for the duration of the year. The problem with module evaluations is that we ask students to complete them at the end of the year, when they’re finished and have no opportunity to benefit from their own suggestions. I wouldn’t fill it in either. This way, students get to evaluate me and the module at any time, and I get feedback that I can act on immediately. I use a simple Google Form that they can access quickly and easily, with a couple of rating scales and an option to add an open-ended comment. I’m hoping that having this ongoing evaluation option, in a format that is convenient for students, means that they will make use of it to improve our time together.

As we worked through the document I could see students really struggling with the idea that they were being asked to contribute to the structure of the module. Even as they commented on each other’s suggestions for the module, there was an uncertainty there. It took a while for them to be comfortable saying what they wanted – not just contributing with their physical presence in the classroom, but really contributing to the design of the module: how it would be run, how they would be assessed, how they could “be” in the classroom. I’m not sure how this is going to work out, but I felt a level of enthusiasm and energy that I haven’t felt before. I felt a glimmer of something real as they started to take seriously my offer to take them seriously.

The choices above represent a few very powerful additions to the other ways that we integrate technology into this module (the students’ portfolios are all on the IEP blog, they do collaborative authoring and peer review in Google Drive, course resources are shared in Drive, they create digital stories for one of the portfolio submissions, and occasionally we use Twitter for sharing interesting stories). It makes it very clear to the students that this is their classroom and their learning experience. I’m a facilitator, but they get to make real choices that have a real impact in the world. They get a sense of what it feels like to have power and authority, as well as the responsibility that comes with that.

IEP course project update

This post is cross-posted from the International Ethics Project site.

My 4th-year students have recently completed the first writing task in the IEP course pilot project. I thought I’d post a quick update on the process, using screenshots to illustrate how the course is being run. We’re using the free version of WordPress, which has certain limitations – for example, it’s hard to manage different cohorts of students – but it also has many advantages, which I’ll write about in another post.

My students will keep writing for their portfolios on the course website, which I’ll keep updating and refining based on our experiences. The idea is that by the end of the year we’ll have figured out how to use the site most effectively for students working through the course.