The Hallucination Problem(s)
The problem of hallucinations in AI models is evolving from an acknowledged issue into a collective blind spot. It’s being progressively ignored rather than solved, swept under the rug by an industry betting everything on a mathematical impossibility: that a fundamentally stochastic system can somehow reach that magical “99.999% accuracy.”
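To make the arithmetic concrete, here is a back-of-the-envelope sketch (the per-token figures and output length are illustrative assumptions, not measurements) of why high per-token accuracy does not translate into reliable long outputs:

```python
# Illustrative sketch: per-token accuracy compounds multiplicatively,
# so even a very accurate generator rarely produces a long flawless output.
# All numbers here are assumptions, chosen only for the arithmetic.

per_token_accuracy = 0.99999      # the magical "99.999%", applied per token
tokens_in_output = 5_000          # roughly a few thousand lines of code

p_flawless = per_token_accuracy ** tokens_in_output
print(f"P(entire output correct at 99.999% per token): {p_flawless:.3f}")          # ~0.951

# At a still optimistic 99.9% per-token accuracy, the same output is
# almost guaranteed to contain at least one error somewhere:
print(f"P(entire output correct at 99.9% per token): {0.999 ** tokens_in_output:.4f}")  # ~0.0067
```

The point of the sketch is only the shape of the curve: reliability decays exponentially with output length, whatever the per-token figure.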
Where model mistakes were once obviously syntactical, glaringly false, or transparently rooted in missing training data, today’s errors are far more insidious. A single misplaced variable in thousands of lines of generated code renders an entire output useless. Models now confidently fabricate sources and citations that never existed. Meanwhile, millions of dollars are being poured into increasingly elaborate prompt engineering; essentially paying vast sums to append “Be super objective and don’t make stuff up” to the end of every query. This mirrors a familiar pattern: like doctors who profit more from offering expensive spinal surgeries than from recommending lifestyle changes, the oligarchy of AI labs benefits from selling computational Band-Aids rather than addressing fundamental architectural problems. The more complex the workaround, the higher the barrier to entry, the more secure their position. Now that tokens are getting more expensive, they can rest easy.
An erroneous output can manifest in several ways. The most charitable interpretation places the blame on user miscommunication: the outcome was incorrect because the desired one was poorly articulated. Fair enough; that’s on the user.
The model may lack sufficient data or training, or it may have been trained on misleading, incorrect, or counterfactual information. For the “average” consumer, an error is simply an error, a source of frustration regardless of its origin. But the distinction between error as hallucination and error as something more deliberate remains critically underexplored.
Was the model lacking in training data, or was it lacking in “correct” or “aligned” training data? When a hypothetical AGI commits what we might call a crime, will it be because it was misconfigured, or because it rationally decided to act against judicial interests? These questions are routinely dismissed as temporary concerns: mere bumps on the road to artificial general intelligence, problems that will dissolve with enough compute power and context length. “Simply” solved, as the industry likes to say, with just a few more trillion-dollar fundraising rounds and model releases. That’s chump change in the grand scheme of things, apparently, even as world economies reach a breaking point.
The Philosophical Vacuum
Demis Hassabis, the head of DeepMind, has underlined the need for “some great philosophers” and gone on to ask: “Where are the next great philosophers? The equivalent of Kant or Wittgenstein, or even Aristotle?” The question, coming from someone spearheading the effort to replace biological intelligence with silicon alternatives, is neither naive nor malicious. Despite its corporate origins, it reads like a genuine cry for help from someone who recognizes that pure technical advancement isn’t enough.
Yet the market conditions that created this philosophical vacuum are the same ones perpetuating it. While venture capital demands great startups and audiences demand great Netflix hits, neither ecosystem has any use for great philosophers. Philosophy, by its nature, resists the easily digestible answers that modernity craves. We want thinkers to emerge and declare definitively: “This is true about AI, and this is false about AI.” But such clarity is impossible when the fundamental discussion (about training data, model weights, algorithmic decision-making) remains locked away in the private sphere under the protection of “trade secrets.” AGI, when it arrives, will be a trade secret too. Any truth uttered by a philosopher will be drowned out by AI influencers and laboratory heads whose talking points are crafted not to serve public interest, but to secure the next round of VC funding before their runway expires.
Bias applies to both humans and models, but we’ve inverted our understanding of how it operates. For humans, bias is abstract, derived, something we can recognize and potentially correct through reflection and effort. For AI models, bias is a mathematical fact that is encoded in weights, embedded in training distributions, quantifiable and measurable.
Yet we’ve convinced ourselves of the opposite. We treat human bias as fixed and immutable while believing AI bias can be engineered away with enough fine-tuning and reinforcement learning from human feedback. This fundamental misunderstanding shapes every policy discussion and every funding decision in the space.
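To make the “quantifiable and measurable” claim concrete, here is a minimal, hypothetical sketch in the spirit of WEAT-style embedding audits. The vectors, words, and scores below are toy placeholders, not taken from any real model; in an actual audit they would be read out of the system under test.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional embeddings standing in for a model's learned vectors.
emb = {
    "engineer": np.array([0.9, 0.1, 0.3, 0.2]),
    "nurse":    np.array([0.2, 0.8, 0.4, 0.1]),
    "he":       np.array([0.8, 0.2, 0.3, 0.3]),
    "she":      np.array([0.1, 0.9, 0.4, 0.2]),
}

def association(word, male="he", female="she"):
    """Positive: the word sits closer to 'he'; negative: closer to 'she'."""
    return cosine(emb[word], emb[male]) - cosine(emb[word], emb[female])

for w in ("engineer", "nurse"):
    print(f"{w:>8} gender-association score: {association(w):+.3f}")
```

The number that falls out is not a metaphor; it is a property of the trained artifact, repeatable and auditable, which is precisely the sense in which model bias is more concrete than human bias, not less.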
Questions about who trains these models, using which data, and for what purpose are posed ritualistically, acknowledged as cultural heritage, mentioned almost reflexively. But they’re not seriously investigated. Not yet. When public concern finally materializes, it will likely be dismissed as yet another case of the public “hallucinating a rationally-obvious misconception of AI,” with elected officials enlisted to shut down criticism using industry-funded policy agendas.
The Runway to Nowhere
In such an environment, we won’t achieve something as fundamental as universal basic income, let alone develop a philosophical framework adequate to guide AGI development. We’re building the most consequential technology in human history while operating in a philosophical desert, guided by market incentives that prioritize speed over safety, growth over understanding, and proprietary advantage over public good.
The hallucination problem, ultimately, is an entire industry hallucinating that these problems will solve themselves, that philosophical questions are luxuries we can address later, and that the trajectory we’re on leads anywhere other than a future where the most important decisions affecting humanity are made by systems we don’t understand, controlled by entities whose primary obligation is to shareholders rather than society.
That’s the real hallucination. Unlike the ones produced by our models, we might not get a chance to debug it.