AI Hallucinations

Kai Gray
OpenAI recently put out a really interesting paper and summary on why large language models hallucinate, which you can read here.

I think we've all experienced AI hallucinations. In fact, we probably encounter them every time we use AI.

Here's what's actually happening when AI hallucinates: it's pattern matching gone wild. AI models predict what word should come next based on everything they've seen before. Sometimes that means they fill in gaps with things that sound right but aren't actually true. It's like having a really smart friend who can't admit when they don't know something; they'll just make up an answer that sounds plausible.
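To make that concrete, here's a toy sketch of next-word prediction. This is a tiny bigram model, not how a real LLM works internally, but it shows the core problem: the model always has to emit *something*, even when its training data gives it no support.

```python
import random

# Toy bigram "language model": for each word, the words observed to follow
# it in a tiny corpus, with counts. A real LLM does the same kind of thing
# at vastly larger scale: predict the next token from patterns in text.
bigram_counts = {
    "the": {"cat": 3, "dog": 1},
    "cat": {"sat": 2, "ran": 1},
    "sat": {"on": 2},
    "on": {"the": 2},
}

def predict_next(word):
    """Return the best-supported next word, or a plausible-sounding guess."""
    options = bigram_counts.get(word)
    if options is None:
        # The model has never seen this word, but it still has to answer,
        # so it guesses. This is the seed of a hallucination.
        return random.choice(["cat", "dog", "sat"])
    return max(options, key=options.get)

print(predict_next("the"))    # well supported by the data
print(predict_next("zebra"))  # a guess that merely *sounds* plausible
```

The guess for "zebra" is indistinguishable in form from the well-supported answer for "the"; both come out with the same confidence.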

The OpenAI research highlights a crucial issue: this isn't really a bug we can just patch away. The same capabilities that let AI be creative and flexible also make it prone to fabrication. When you ask for a creative story, you want novel combinations of ideas. When you ask for facts about your industry, you want accuracy. But the AI is using the same prediction process for both tasks.

The training data itself is part of the problem. These models learn from enormous amounts of text, some accurate, some outdated, some just plain wrong. They can't tell the difference between a peer-reviewed study and someone's random blog post from 2015. It all gets mixed together in the training process.

There is also something researchers call "exposure bias." During training, the model learns to predict each word from real text, but at generation time it has to predict each next word from what it just created itself. Small errors compound. By the end of a paragraph, it might be completely off in fantasy land, while still sounding authoritative.
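The compounding effect is easy to quantify with a simple back-of-the-envelope model. Assume (purely for illustration) that each generated word independently has some small chance of being wrong; the chance that a whole passage stays fully grounded decays exponentially with length:

```python
# Toy model of exposure bias: if each generated word independently has
# error probability p, the chance an n-word continuation is still fully
# grounded is (1 - p)^n. The 2% figure below is illustrative, not measured.
def prob_all_correct(p_error, n_steps):
    return (1 - p_error) ** n_steps

for n in (10, 50, 100):
    print(n, round(prob_all_correct(0.02, n), 3))
```

Even at a 2% per-word error rate, a 100-word passage is more likely wrong somewhere than fully right. Real generation errors aren't independent, so this understates how quickly one early mistake can derail everything after it.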

AI systems are optimized to be helpful and coherent, not necessarily factual. They're designed to give you something that sounds good rather than admitting uncertainty. That's a fundamental design challenge still being worked through.

The AI companies understand this is a major issue that they need to solve, hence OpenAI’s recent paper. OpenAI, Anthropic, and others are pouring resources into solutions. Some of the most promising approaches include making AI check external databases before responding (called retrieval-augmented generation or RAG), training models on carefully verified datasets, and building in ways for AI to express uncertainty instead of false confidence.
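To show the shape of the RAG idea, here's a minimal sketch. The document store, the word-overlap scoring, and the prompt template are all hypothetical stand-ins; real systems use vector embeddings and an actual LLM API call, but the structure — retrieve first, then answer only from what was retrieved — is the same:

```python
# Minimal RAG sketch (illustrative only). Real systems replace the
# word-overlap scoring with vector embeddings and send build_prompt()'s
# output to an actual language model.
DOCUMENTS = [
    "RAG grounds model answers in retrieved reference text.",
    "Exposure bias compounds small errors during generation.",
]

def retrieve(query, docs, k=1):
    """Rank docs by naive word overlap with the query (stand-in for embeddings)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(query):
    context = "\n".join(retrieve(query, DOCUMENTS))
    # The model is instructed to answer from the context, not from memory.
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What does RAG do?"))
```

The key design choice is that the model's answer is anchored to text you can inspect, so a fabricated claim has nowhere to hide.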

I'm particularly interested in the work on "uncertainty tokens." These basically teach AI to say "I'm not sure" instead of making something up. Imagine how much more useful these tools would be if they could reliably tell you when they're guessing versus when they're certain.
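One simple version of this idea can be sketched as an "abstain when unsure" policy. This assumes we can read the model's probability for its top answer (many APIs do expose token log-probabilities); the threshold and candidate answers here are made up for illustration:

```python
# Sketch of an abstain-when-unsure policy. The 0.7 threshold and the
# candidate answers are illustrative assumptions, not a real system.
def answer_or_abstain(candidates, threshold=0.7):
    """candidates: dict mapping answer -> model probability for that answer."""
    best = max(candidates, key=candidates.get)
    if candidates[best] < threshold:
        # Below the confidence bar: admit uncertainty rather than guess.
        return "I'm not sure."
    return best

print(answer_or_abstain({"Paris": 0.95, "Lyon": 0.05}))               # confident
print(answer_or_abstain({"Paris": 0.40, "Lyon": 0.35, "Nice": 0.25})) # abstains
```

The hard part in practice is calibration: the model's raw confidence scores have to actually track how often it's right, which is exactly what this line of research is trying to achieve.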

Until we get there, though, we need to be smart about how we use these tools. Here's what's working for me and others in our LA-AI community: Treat AI like a brilliant but unreliable intern.

Watch for red flags. Overly specific details are often a giveaway, like when AI gives you exact percentages to two decimal places for something that probably wasn't measured that precisely. Same with citations that sound too perfect or technical terms that seem slightly off.

Test the AI on things you already know. Before I use a new model for research in an unfamiliar area, I'll ask it questions about topics I know well. Helps calibrate my BS detector.

Always ask for sources explicitly in your prompts. While the AI might still make up sources, I've found this reduces outright fabrication. Plus, it makes verification easier when you have specific claims to check.
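A lightweight way to make this habit automatic is to wrap your questions in a template. The wording below is just one hypothetical phrasing, not a proven formula:

```python
# Hypothetical prompt wrapper that asks the model to cite sources for each
# claim and to flag uncertainty, making fabrication easier to spot.
def with_sources(question):
    return (
        f"{question}\n\n"
        "For every factual claim, cite a specific source (title and year). "
        "If you are not certain a source exists, say so instead of inventing one."
    )

print(with_sources("What did OpenAI's recent paper say about hallucinations?"))
```

The second sentence matters as much as the first: without it, a model under pressure to cite something will often invent a citation rather than admit it has none.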
