Michael Rowe

Trying to get better at getting better

STORM writes Wikipedia-like articles from scratch

Note: This isn’t a critical analysis of the implications of systems like STORM; I haven’t yet thought about what it means when students and teachers are generating content like this. This post is simply a description of what STORM is and what it does.


https://storm.genie.stanford.edu

STORM is a LLM system that writes Wikipedia-like articles from scratch based on Internet search. Co-STORM further enhanced its feature by enabling human to collaborative LLM system to support more aligned and preferred information seeking and knowledge curation. While the system cannot produce publication-ready articles that often require a significant number of edits, experienced Wikipedia editors have found it helpful in their pre-writing stage. – GitHub project page

When you log in to STORM you get a Library containing any reports that you’ve generated, as well as access to a Discover section that includes content that others have generated. I can imagine a system where the Discover section could be filtered by category or, if you were running a local version of this, limited to a specific student cohort.

Co-STORMing

I really like the notion of Co-STORMing, where the report is generated based on a conversation between authors. Here’s an article I found in the Discover section, based on a conversation (I wasn’t sure if the conversation was only between bots / personas, or if it included humans).

Here you can see the persona contributions, and it may be that scrolling down further would show human authors.

And here’s the final article, based on the conversation above. It looks interesting, and I’ve saved it to read later.

How it works

Here is a question I asked STORM to use as the foundation for its article: What is the most likely impact of generative AI on the higher education system, taken to its logical end-point? And I asked it to create the report independently, i.e. with no input from me.

This is what it produced.

The screenshot below shows what you get when you click on See BrainSTORMing Process. You can see the list of AI personas running along the top of the report, with a description of what each persona contributed to the discussion that informed the report. It’s a really interesting approach to generating multiple perspectives on a concept, and then working to balance those perspectives. Side note: I do a version of this quite often, where I take the outputs from multiple generative AI systems, combine them into a single text file, and then ask Claude to make sense of it all. In that case, there isn’t much difference between the outputs because I haven’t asked any of the models to take different perspectives.
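The combining step in the side note above can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of any particular tool: the function names, the `=== source ===` separator, and the example source labels are all assumptions made for the sketch.

```python
from pathlib import Path


def combine_outputs(outputs: dict[str, str]) -> str:
    """Join several model outputs into one labelled block of text.

    `outputs` maps a source label (hypothetical, e.g. "model_a") to the text
    that source produced. Each section gets a header line so that whichever
    model reads the combined file can tell the sources apart.
    """
    sections = []
    for source, text in outputs.items():
        sections.append(f"=== {source} ===\n{text.strip()}\n")
    return "\n".join(sections)


def save_combined(outputs: dict[str, str], path: str) -> Path:
    """Write the combined text to a single UTF-8 text file and return its path."""
    target = Path(path)
    target.write_text(combine_outputs(outputs), encoding="utf-8")
    return target
```

The resulting file can then be pasted or uploaded into a single chat with a request to compare and reconcile the sections. Labelling each section is the main design choice here, since it lets the summarising model refer back to where a claim came from.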

Here’s the PDF of the article it generated.

First of all, the fact that it created something at all is fairly incredible. Remember that it received no guidance, other than the initial question. It created a document that’s comprehensive, coherent, well-structured, and absolutely something that you could use as a starting point to learn about a topic.

Neutral opinion

However, I found the actual content of the report quite disappointing. Although it provides a comprehensive overview of the current state and near-future implications of generative AI in higher education, it gave me a palatable, sanitised, and uninteresting response that mirrors the majority of chatter online about how we should respond to genAI in the higher education sector.

From the end of the summary:

This shift calls for an emphasis on AI literacy and the importance of maintaining human oversight in educational environments, ensuring that technology enhances rather than detracts from the learning experience. As the higher education landscape continues to evolve in response to generative AI, it is imperative for institutions to balance innovation with ethical considerations. Engaging with the benefits and risks associated with GenAI will not only shape the future of academic practice but also determine the roles that students and educators will play in an increasingly AI-driven world.

I wonder how far you could push STORM to write something that sits a little outside the Overton Window? I didn’t try, but it would be good to know if it’s always going to present the most balanced version of a topic, or if you can push it to be more opinionated.

Having said all that, even though I was disappointed in the report, I would absolutely recommend trying STORM whenever you want a neutral perspective on a topic.

Here’s the research paper presenting the development of STORM: Shao, Y., Jiang, Y., Kanell, T. A., Xu, P., Khattab, O., & Lam, M. S. (2024). Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models. ArXiv.org

Abstract

We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages. This underexplored problem poses new challenges at the pre-writing stage, including how to research the topic and prepare an outline prior to writing. We propose STORM, a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking. STORM models the pre-writing stage by (1) discovering diverse perspectives in researching the given topic, (2) simulating conversations where writers carrying different perspectives pose questions to a topic expert grounded on trusted Internet sources, (3) curating the collected information to create an outline.
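To make the three pre-writing steps in the abstract a bit more concrete, here is a toy sketch in Python. It is not STORM's actual code or API: every function name, the fixed list of roles, and the stub expert are hypothetical placeholders standing in for what would be LLM calls and Internet search in the real system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Turn:
    perspective: str
    question: str
    answer: str


def discover_perspectives(topic: str) -> list[str]:
    # Step 1 (placeholder): a real system would discover perspectives by
    # researching the topic; here we hard-code three illustrative roles.
    return [f"{role} view of {topic}" for role in ("student", "teacher", "administrator")]


def simulate_conversation(
    topic: str, perspective: str, expert: Callable[[str], str], rounds: int = 2
) -> list[Turn]:
    # Step 2 (placeholder): a writer holding one perspective poses questions
    # to a topic expert. The questions are canned strings, not LLM output.
    turns = []
    for i in range(rounds):
        question = f"[{perspective}] question {i + 1} about {topic}"
        turns.append(Turn(perspective, question, expert(question)))
    return turns


def curate_outline(topic: str, turns: list[Turn]) -> list[str]:
    # Step 3 (placeholder): group the collected questions under each
    # perspective to form a simple indented outline.
    outline = [topic]
    seen: list[str] = []
    for turn in turns:
        if turn.perspective not in seen:
            seen.append(turn.perspective)
            outline.append(f"  {turn.perspective}")
        outline.append(f"    {turn.question}")
    return outline


def stub_expert(question: str) -> str:
    # Hypothetical stand-in for an expert grounded on trusted sources.
    return f"(answer grounded in retrieved sources for: {question})"


def prewrite(topic: str, expert: Callable[[str], str] = stub_expert) -> list[str]:
    turns: list[Turn] = []
    for perspective in discover_perspectives(topic):
        turns.extend(simulate_conversation(topic, perspective, expert))
    return curate_outline(topic, turns)


if __name__ == "__main__":
    print("\n".join(prewrite("generative AI in higher education")))
```

The point of the sketch is only the shape of the pipeline: perspectives first, then one question-and-answer conversation per perspective, then an outline curated from everything collected.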

