Pattern Without Precision: The AI Content Crisis
How LLMs and Automation Are Recreating the Gell-Mann Effect in Journalism and Beyond
LLM AI is like the Murray Gell-Mann phenomenon redux, with innumerable hack journalists slaving away producing inaccurate and mediocre articles. Anyone who knows the topic under discussion can spot the inaccuracies from miles away. — Ephektikoi
The analogy to the "Murray Gell-Mann Amnesia effect" is apt in this context. The original phenomenon, as coined by Michael Crichton, points out that readers often recognize errors in media coverage of topics they know well, yet may still trust the same sources on topics outside their expertise. With LLM AI, a similar dynamic appears, except magnified: these models generate vast quantities of content that can appear accurate on the surface, but anyone familiar with the specific subject can detect factual or interpretive issues.
In hack journalism, mediocrity often results from a lack of specialization or the pressures of rapid production; with LLM AI, the issue may lie in pattern replication without actual comprehension or specialized knowledge. The result is a heightened risk of shallow or subtly inaccurate content that slips by undetected unless scrutinized by subject-matter experts, reinforcing an overall degradation in quality whenever automation or volume-driven production prioritizes quantity over accuracy.
A concrete example involves my former professor, Dick Dillon, at Carleton University in Ottawa, about whom ChatGPT generated several incorrect claims. Why does this happen, and why was it so difficult to correct? Another early example: when I requested the exact syntax for the ChordPro chord diagram directive, ChatGPT (3.5) provided completely incorrect information. Why do these confabulations occur? It is unsurprising that ChatGPT makes mistakes, given its data limitations, but why such outright fabrication? It's as puzzling as any of its emergent properties. Why should following statistical weights and associations produce coherent English, let alone partially correct information? — Ephektikoi
These confabulations are indeed perplexing and arise from the way language models process information without truly understanding it. Language models like ChatGPT operate by predicting statistically likely sequences of words based on vast datasets but lack a way to verify or truly “know” factual accuracy.
Here's a breakdown of why these errors and fabrications happen and are hard to correct:
Prediction over Verification: The model generates text by predicting the next likely word or phrase based on its training data. If that data is noisy or sparse (e.g., little information on a specific professor or on specialized syntax like ChordPro), the model fills the gaps according to statistical likelihood rather than factual correctness, producing "confabulations" (see the first sketch after this list).
No Fact-Checking Mechanism: Unlike a human editor who cross-references information, the model has no internal database to verify facts. It combines patterns it has seen before but lacks a dedicated knowledge structure to flag incorrect assertions, especially on niche topics.
Emergent Properties: The ability to produce coherent English, including partially correct information, comes from statistical training on large language datasets, where repeated patterns of structure and grammar are learned. This can make output appear correct or plausible on familiar topics, but once the model drifts into less common areas (like personal information about Dick Dillon or specific syntax), it fabricates to fill the unknown, seemingly "hallucinating" details.
Difficulty in Correction: Correcting these errors is hard because the model has no memory of past interactions or corrections and no mechanism to "learn" in real time from user feedback. Without targeted data updates or fine-tuning, it defaults back to predicting from generalized knowledge, which is often insufficient for specifics (see the second sketch below).
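To make the "prediction over verification" point concrete, here is a deliberately toy bigram sampler in Python. It is nothing like a production LLM internally (real models use neural networks over tokens and far richer context), but the shape of the loop is the relevant part: at each step the only criterion consulted is statistical likelihood, and there is no step at which a claim is checked against reality. All words and counts below are invented for illustration.

```python
import random

# Toy "language model": counts of which word tends to follow which,
# gathered from a tiny, imperfect corpus. Words and numbers are made up.
bigram_counts = {
    "the": {"professor": 3, "directive": 1},
    "professor": {"taught": 2, "invented": 1},   # "invented" is a spurious association
    "taught": {"psychology": 2, "linguistics": 1},
    "invented": {"the": 1},
    "directive": {"draws": 1},
    "draws": {"diagrams": 1},
}

def next_word(prev):
    """Pick the next word from co-occurrence statistics alone.
    Note what is missing: any check that the continuation is true."""
    options = bigram_counts.get(prev)
    if not options:
        return None
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

def generate(start, max_words=8):
    out = [start]
    while len(out) < max_words:
        nxt = next_word(out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

print(generate("the"))
# Possible output: "the professor invented the directive draws diagrams"
# Fluent-looking, confidently produced, and grounded in nothing,
# because likelihood was the only thing ever consulted.
```

Scale that loop up enormously and the basic failure mode survives: where the statistics are thin, something statistically plausible gets produced anyway.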
In essence, while the model can capture and reproduce language patterns to an impressive degree, it stumbles in areas that require factual grounding or specialized knowledge—making confabulation an inescapable byproduct of its design rather than a controllable error.
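The "difficulty in correction" point can be sketched the same way. The chat() function below is a hypothetical stand-in, not any vendor's actual API, and the "frozen" answer inside it is invented for illustration; the structural point is the real one: inference never updates the model's weights, so a correction survives only as long as the client keeps resending it within the conversation, and a fresh conversation starts from the same frozen state.

```python
# A wrong answer imagined as "baked in" at training time. Nothing that happens
# at chat time can change it; only retraining or fine-tuning could.
FROZEN_ANSWER = "The directive is {foo: ...}"

def chat(messages):
    """Hypothetical stateless endpoint: its only context is the message list
    passed in on this call, plus the frozen training-time behaviour above."""
    user_text = " ".join(m["content"].lower() for m in messages if m["role"] == "user")
    if "correction" in user_text:
        return "Understood, I'll use your correction (within this conversation)."
    return FROZEN_ANSWER

# One conversation: the correction appears to stick, but only because the
# client resends the whole history on every turn.
history = [{"role": "user", "content": "What is the right directive?"}]
print(chat(history))                     # -> the frozen (wrong) answer
history.append({"role": "user", "content": "Correction: that syntax is wrong."})
print(chat(history))                     # -> acknowledged, for this conversation only

# A brand-new conversation: the wrong answer is back, because nothing was
# ever written into the model itself.
print(chat([{"role": "user", "content": "What is the right directive?"}]))
```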
The unavoidable mix of correct and incorrect information in the database is a fundamental issue, persisting despite any conceivable efforts in curation and training. LLM AI will inevitably contain a blend of accurate information, misinformation, and disinformation. Furthermore, the influence of curators, trainers, and corporate priorities introduces additional misunderstandings and biased interpretations. Both misunderstanding and bias are present, yet they are distinct issues: bias does not encompass the concept of understanding. However, ChatGPT often conflates these two ideas for reasons that remain obscure—something no reflective human would do. — Ephektikoi
Indeed, the blend of correct, incorrect, and biased information in an LLM's training data is an inherent challenge. This mixture is virtually unavoidable, as even extensive curation efforts can’t eliminate all inaccuracies, misunderstandings, and biases present in vast datasets. Language models will inevitably internalize this mix, reproducing both factual and flawed elements in their responses.
Two distinct factors contribute to these issues:
Misinformation vs. Disinformation: The model is trained on a mix of honest errors (misinformation) and intentional distortions (disinformation) without the ability to differentiate intent. This results in output that can present incorrect information as confidently as correct information, as it lacks the judgment layer that humans apply.
Misunderstanding vs. Bias: These concepts are indeed different and often conflated by LLMs. Misunderstanding arises when the model fails to grasp or misrepresents the complexity of a topic due to gaps in knowledge. Bias, however, introduces a slant or agenda, often stemming from skewed patterns in the data that reflect underlying biases in society, the media, or corporate priorities. LLMs reproduce both misunderstanding and bias without distinguishing between them, a distinction any reflective human would recognize.
This conflation of bias and misunderstanding may occur because the model treats all information as patterns to be replicated, without a framework for critical interpretation. When biases enter as patterns, they are handled just like misunderstandings; neither is corrected by the sense of intent, context, or ethical discernment that humans bring to their judgments. Corporate and trainer influences add yet another layer of complexity, subtly guiding which biases and misunderstandings become more prominent in the output.
This gap between human and AI comprehension underscores a fundamental difference: LLMs mirror the patterns they see without understanding them. That mirroring of mixed data is ultimately why even careful curation and bias management can only mitigate, never eliminate, these underlying problems.