Michael Rowe

Trying to get better at getting better

Language model hallucination can still be accurate

I wanted to test whether Claude could read and summarise an article when given only a URL. According to the model’s response, Claude can’t visit links. However, its summary of the article at the URL is spot on. Like, really good.
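(If you want to reproduce the experiment programmatically rather than through the chat interface, which is what I used, something like the sketch below should do it. The model alias and URL are illustrative placeholders, not the exact ones from my test.)

```python
# Minimal sketch of the experiment via the Anthropic Python SDK.
# Assumptions: the model alias and URL below are placeholders;
# my original test used the chat interface, not the API.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # any current Claude model alias
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Please summarise the article at this URL: "
                       "https://www.example.co.za/opinion/archaic-matric-exam-system",
        }
    ],
)
print(response.content[0].text)  # the model's (possibly hallucinated) summary
```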

So either Claude is lying and can visit links, or it’s incredibly good at inferring the content of an article based only on its URL. In other words, if the summary is accurate and Claude can’t visit URLs, then all the generated content in the response is the product of hallucination.

Claude returned 7 bullet points, plus an 8th summary point, which feels like a lot of information to glean from a URL alone. However, Claude makes sure to highlight that the response “may contain hallucination”, i.e., these points may be made up. In my other experiments with Claude, I’ve never seen this qualification show up, which makes me think that these results really are simply made up.

But if they’re made up, they’re still remarkably accurate, considering the content of the original article.

The first bullet point can easily be inferred from the information in the URL, and basically just restates the keywords in the slug. The .co.za domain adds the “South Africa” context.
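To make that concrete, here’s a small sketch of the kind of signal a model can extract from a URL alone. The URL is a hypothetical stand-in, since I’m not reproducing the real one here:

```python
# Sketch: how much signal a URL alone carries. The URL below is a
# hypothetical stand-in for the real article's address.
from urllib.parse import urlparse

url = "https://www.example.co.za/opinion/our-archaic-matric-exam-system-harms-learners"

parsed = urlparse(url)
tld_hint = parsed.netloc.rsplit(".", 2)[-2:]           # ['co', 'za'] -> South Africa
slug_keywords = parsed.path.rstrip("/").split("/")[-1].split("-")

print(tld_hint)       # ['co', 'za']
print(slug_keywords)  # ['our', 'archaic', 'matric', 'exam', 'system', 'harms', 'learners']
```

A handful of slug keywords plus a country-specific domain is genuinely enough to anchor a plausible-sounding summary.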

Points 2-5 can be inferred from the model’s understanding of the issues commonly raised in opinion pieces about “archaic matric exam systems”, and their association with emotional and psychological distress. So it seems that, from the URL alone, Claude can generate a set of plausible, generic summary points.

However, point 6 (“advocates for a more progressive exam system”) is different. There’s nothing in the URL that says anything about this. And even though it seems reasonable to assume an opinion piece would include this, it’s not a given. The original article does touch on other approaches to the exam system, so it’s still not clear whether this point is a hallucination.

It’s only at point 7 (the suggestion that policymakers consider international practice) that it seems more likely Claude is bullshitting us. There’s no mention of this, or anything like it, in the original piece. I think Claude is right: it can’t visit URLs, and the response definitely contains hallucination.

There are a few takeaways for me from this experiment:

  • LLM responses can be incredibly accurate, even when the generated response is entirely made up.
  • Even though Claude told me it couldn’t visit URLs, I initially assumed it must be wrong (or lying) because the summary it gave me was so reasonable. In retrospect, after going back and forth between the two pieces a few more times, I found that the summary was actually quite limited, which it obviously was, because it was based only on the information in the URL. Still, my first thought was that Claude could visit URLs and was wrong to say it couldn’t. That seems like a strange conclusion for me to reach, and I’ll need to think about it further.
  • The response from Claude should be clearer, stating that the summary it provided isn’t really a summary, but rather a series of best guesses about what such an article might include.
