Hallucination is a category error, not a tuning problem.

The AI industry treats hallucination as a tuning problem. Better models, better RLHF, better grounding, better evaluation — eventually we’ll dial the rate down to acceptable. Quarterly research updates report hallucination percentages alongside benchmark scores, as if they were measuring the same kind of thing.

We think this is a category error. A model that generates plausible text without a verifiable source is a different kind of thing from a model that retrieves and cites. Pretending they’re the same product with different quality scores is how the AI industry got into the trust mess it’s currently trying to dig out of. The fix is not a sharper knob. It’s a different room.

Two kinds of AI, treated as one

There is the AI that generates without grounding. It's been trained on a corpus, learned the statistical shape of text, and predicts tokens. It has no obligation to track where its output came from because there is no where: what it retains of the corpus is statistical shape, not retrievable text. When this kind of model is wrong, the wrongness is intrinsic to the product behavior. Asking it to stop hallucinating is asking it to stop being itself.

Then there is the AI that retrieves and cites. It pulls answers from a known source set and points the user back to the source. The output is grounded by construction. Applying a hallucination rate here is a category error of its own: there is no hallucinating, because either the source supports the claim or it doesn't. When the source doesn't support the claim, the model says so. When the source isn't there at all, the model declines.
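To make that behavioral contract concrete, here is a minimal sketch of retrieve-and-cite, assuming a retriever over a known source set that scores each passage for relevance. The names, the scoring, and the threshold are illustrative assumptions, not any particular product's implementation.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    doc_id: str
    section: str
    text: str
    score: float  # retrieval relevance, 0..1 (assumed scale)

def answer_with_citation(question, retriever, support_threshold=0.75):
    """Answer only from retrieved passages; decline when nothing qualifies."""
    passages = retriever(question)  # search the known source set; no free generation
    if not passages:
        # Source isn't there at all: decline rather than fill the gap.
        return {"status": "declined", "reason": "no source found for this question"}
    best = max(passages, key=lambda p: p.score)
    if best.score < support_threshold:
        # Source exists but doesn't support a claim: say so explicitly.
        return {
            "status": "unsupported",
            "reason": "closest source does not support an answer",
            "closest": (best.doc_id, best.section),
        }
    return {
        "status": "answered",
        "claim": best.text,  # text drawn from the source, not synthesized
        "citation": (best.doc_id, best.section),
    }
```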

These two products look identical to a buyer — same chat interface, same prompt-and-response rhythm, same screenshots in the marketing material. They are not the same. They are different categories of software, treated as one because the chat box that fronts them is the same chat box.

Why the confusion is fine for consumer chat and malpractice for everything else

For consumer use, the category confusion costs almost nothing. The user is one click from the answer. The cost of being wrong is small. The same person who asks the question evaluates the response. If a recipe calls for an ingredient that doesn’t exist, the cook notices when they read it.

For consequential work, the same confusion is malpractice. Consequential work has a property that consumer chat doesn’t: the output goes downstream. It gets pasted into a deal document. It ends up in front of a regulator. It lands on someone’s review pile. The person who asked the question is not the person evaluating the answer.

Three things follow:

The user can’t evaluate the output. They aren’t the domain expert; they are the conduit between the model and the next reader.

They are trusting the model to be right.

The model has no obligation to deserve that trust.

When you ship ungrounded generation into work where the consequences land on someone other than the prompter, you are using a research tool where a citation tool was needed. The interfaces are identical. The damage is not. The reader who finds the output wrong is not the customer who pays you; the seller is. And the buyer at the next deal, the one whose trust the work is supposed to earn, is the one who bears the cost of the category mistake.

The fix is not better hallucination scores

If you accept that there are two categories of AI, and the issue is which one you’ve shipped into a workflow that needs the other, the fix changes shape.

The wrong fix is more training data, sharper RLHF, tighter evaluation rubrics for hallucination rate. These optimize a metric that isn't measuring the right thing. A 92%-grounded answer on a security questionnaire still leaves an 8% chance that the buyer reads something the seller can't defend. The metric improves. The category mistake doesn't.

The right fix is structural. Cite. Refuse to generate when retrieval fails. Surface gaps explicitly. Make the citation the load-bearing part of the output, not a footnote.

In practice that means: every claim points to a source the user can read. Every gap is declared, not papered over. Confidence scores measure retrieval coverage, not output plausibility. The output format makes citation visible by default and inconvenient to remove.
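A sketch of what that structure can look like in code, with names and shapes invented here for illustration: the citation is a required field, gaps are first-class, and confidence is computed from retrieval coverage rather than from how plausible the generated text sounds.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    doc_id: str
    section: str

@dataclass(frozen=True)
class GroundedClaim:
    text: str
    citation: Citation  # required: a claim cannot be constructed without its source

@dataclass(frozen=True)
class Response:
    claims: tuple[GroundedClaim, ...]
    gaps: tuple[str, ...]  # items the source set could not answer, declared explicitly

    @property
    def confidence(self) -> float:
        # Retrieval coverage: fraction of asked-for items that are grounded,
        # not a measure of output plausibility.
        total = len(self.claims) + len(self.gaps)
        return len(self.claims) / total if total else 0.0
```

The design point is that a claim without a citation is unrepresentable, not merely discouraged; removing the citation changes the type, which mirrors the category argument above.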

The argument against this position usually runs: "but the user wants speed." The user does want speed. The user also wants the next deal. Speed without grounding gets the user faster wrong answers, and faster wrong answers are louder than slow right ones: they carry further, get forwarded more, and damage the next deal in proportion to how confidently they were generated.

Where this lands for what we build

This argument is the foundation of the third Stones AI principle, Show your work. It also constrains what we’ll build under Trust is the product — a buyer of our products is buying defensibility, and defensibility requires a paper trail that goes back to a source.

Concretely: every answer VTTD drafts carries a citation back to the source document and section. When VTTD can’t find the answer in the source set, it declines to generate one. There is no setting that allows it to fill a gap with statistical inference. The product is not optimized for hallucination rate. The product is structured so that hallucination, in the strict generative sense, is not a failure mode it can have.

This is not a quality position. It is a category position. We are not selling a research tool with low hallucination. We are selling a citation tool, in a category where the alternative isn’t better generation but generation, full stop. Removing the citation wouldn’t make our product worse. It would make it a different product, in a different category, and the buyers we serve wouldn’t buy it.

Stop optimizing the hallucination knob. Redesign the room.