My name is Michael Rowe. I try to figure out better ways to teach.
I'm the Associate Professor for Digital Innovation in the School of Health and Social Care at the University of Lincoln in the UK. I'm interested in technology, education, and healthcare, and look for the places where these things intersect.
The AI Pedagogy Project (by metaLAB at Harvard) provides educators with a collection of resources in 3 parts:
AI starter: A solid overview of generative AI.
LLM tutorial: Excellent tutorial exploring language models.
Resources: A curated collection of resources that looks like a great place to start.
The project aims to encourage creative and critical engagement with AI in education through assignments and materials inspired by the humanities.
There is also a collection of sample assignments (see below) from educators, designed to spark informed conversations about the risks, benefits, and impacts of AI tools, as well as a newsletter with updates on newly added assignments and other ways to get involved.
Our latest model, Claude 2.1, is now available over API in our Console and is powering our claude.ai chat experience. Claude 2.1 delivers advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use…
We’re doubling the amount of information you can relay to Claude with a limit of 200,000 tokens, translating to roughly 150,000 words, or over 500 pages of material.
Claude 2.1 has also made significant gains in honesty, with a 2x decrease in false statements compared to our previous Claude 2.0 model.
Claude 2.1 has also made meaningful improvements in comprehension and summarization, particularly for long, complex documents that demand a high degree of accuracy, such as legal documents, financial reports and technical specifications.
These all seem like fantastic improvements, especially the bit about increased accuracy and being more likely to demur rather than provide incorrect information (for example, responding with “I’m not sure…”).
However, note that independent experiments show that, while “…Claude 2.1 was able to extract facts at the beginning and end of a document with almost 100% accuracy for 35 queries…the performance of the model drops off sharply above about 90,000 tokens, especially for information in the middle and bottom of the document”.
What this means is that information retrieval becomes unreliable as the context window fills up, which matters most in exactly the use cases (legal, financial, technical) where reliability is essential.
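To make that finding concrete, here is a minimal sketch of how this kind of long-context retrieval test works: hide a known fact (the “needle”) at different depths of a long document and ask the model to retrieve it. This is an illustration only, assuming the anthropic Python SDK; filler.txt, the passphrase, and the ask() helper are placeholders, not the setup used in the experiment quoted above.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

NEEDLE = "The secret passphrase is 'blue giraffe'."
filler = open("filler.txt").read()  # long, unrelated text used to pad out the context window

def ask(context: str) -> str:
    """Ask the model to retrieve the needle from the supplied context."""
    response = client.messages.create(
        model="claude-2.1",
        max_tokens=100,
        messages=[{"role": "user", "content": f"{context}\n\nWhat is the secret passphrase?"}],
    )
    return response.content[0].text

# Hide the needle at different depths of the document and check whether it comes back.
for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    split = int(len(filler) * depth)
    context = filler[:split] + "\n" + NEEDLE + "\n" + filler[split:]
    answer = ask(context)
    print(f"depth={depth:.2f} retrieved={'blue giraffe' in answer.lower()}")
```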
The takeaways for me are:
Claude 2.1 is a significant improvement over 2.0.
There are still many situations when you can’t trust the outputs of any LLM.
Scientists are increasingly overwhelmed by the volume of articles being published. Total articles indexed in Scopus and Web of Science have grown exponentially in recent years; in 2022 the article total was 47% higher than in 2016, which has outpaced the limited growth, if any, in the number of practising scientists. Thus, publication workload per scientist (writing, reviewing, editing) has increased dramatically.
The first thought I had when I saw this was, “How many of those articles are actually worth reading?” The perverse incentive to publish more articles, more quickly, isn’t associated with an increase in our ability to produce high-quality work.
I’ve said it before: most books could be papers, most papers could be blog posts, and most blog posts could be tweets.
One of the things that makes generative AI so powerful is that it takes me exactly the same amount of time to create a 250-word summary as a 1000-word summary of a document. Previously, if I wanted to share an overview of a report I’d written, or a lecture or presentation I’d given, the time I needed to spend writing that summary depended on how long I wanted it to be. Generative AI means that I can scale up what I produce without it taking me any more time.
Another thought that occurs to me is that you can also create multiple types of output based on the same source (e.g. “write me an academic abstract”, “write me a series of tweets”, “write me a blog post”) with very little additional input. You could upload a single PDF, and with one well-structured prompt, create a wide range of content that you could use to share in multiple formats.
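As a rough illustration of that one-source, many-formats workflow, here is a minimal sketch assuming the anthropic Python SDK and that the document’s text has already been extracted to a plain-text file; the file name, word counts, and prompt wording are all placeholders.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

source_text = open("report.txt").read()  # text already extracted from the PDF

# One source document, several output formats: only the instruction changes.
formats = {
    "abstract": "Write a 250-word academic abstract of the following document.",
    "tweets": "Write a thread of five tweets summarising the following document.",
    "blog post": "Write a 600-word blog post based on the following document.",
}

for name, instruction in formats.items():
    response = client.messages.create(
        model="claude-2.1",
        max_tokens=1024,
        messages=[{"role": "user", "content": f"{instruction}\n\n{source_text}"}],
    )
    print(f"--- {name} ---\n{response.content[0].text}\n")
```

The source goes in unchanged each time; only the instruction varies, which is why the marginal cost of producing one more format, or a longer summary, is close to zero.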
Where else in your life does output scale so much with no extra input?
“Quantum Leap is building the world’s best system for rapidly acquiring expertise. Our first courses will be on large language models and AI safety, for which we’re aiming to compress a PhD and several years’ experience into 3-6 months using accelerated learning methods developed by the US military.”
Whether you agree that it’s possible or not, I don’t see universities pushing these kinds of boundaries.
I’m seeing plenty of calls for institutions to reform their assessments in the face of generative AI (see here, for example). Which is fine, I suppose. Nothing wrong with assessment reform.
But changing assessment practices without reforming the system in which it operates is just painting over the cracks. Or to put it more crudely, we’re putting lipstick on a pig. It’s still a pig.
I’ve touched on this before, when I argued that we focus our attention on the superficial aspects of learning and teaching rather than on changes to infrastructure. Higher education has a history of tinkering at the edges of the problem without committing to the more difficult, but more meaningful, changes.
Assessment reform seems to be what we want. But education reform is what we need.
This is a great example of why you should avoid focusing on what AI (or any technology) can’t do. The statement above was true when it was published, but it’s no longer true. ChatGPT can now read your institution’s hiring policy, point out inconsistencies and areas of misalignment with the stated values of the institution, and write an email to the Dean in any number of tones you choose.
Every time we position our abilities in the gaps of things that AI can’t do, we paint ourselves into ever smaller corners, because next month AI will be able to do the things we said were ours. When you define your value by the list of things AI can’t do, that value shrinks as AI becomes more capable.
Instead of defining your role in terms of what AI can’t do, it’s a far more effective strategy to define your role in terms of what you can do with AI. That way, as the capabilities of AI increase, so does your career capital.