Understanding the World: Journalists, Large Language Model AI, and the Murray Gell-Mann Effect
Accuracy in Reporting: Journalists versus Large Language Model AI. Is there any difference?
Note: This essay was prepared with the research assistance and ghostwriting of ChatGPT (GPT-4). No LLM AIs were harmed in the process, although I felt inclined to threaten them from time to time.
Author's Preface
Use of ChatGPT for research and ghostwriting
I've been using ChatGPT, a large language model, for research assistance and for ghostwriting various articles and essays since early in the year, when it was first made publicly available.
ChatGPT’s tendency to hallucinate or confabulate information
I've been aware that it often hallucinates or confabulates, serving up incorrect information with a veneer of respectability, since the language is generally impeccable.
ChatGPT providing incorrect data about Oracle Forms
This morning, I was looking into information on some software products I had used extensively in the past, specifically Oracle Forms, and found that ChatGPT was confidently incorrect about details I knew well from personal experience.
The role of statistical patterns in ChatGPT’s output
How does ChatGPT come up with its information? It looks for statistical patterns in imperfect datasets, generating coherent output that may or may not be accurate.
Errors in data and algorithms of large language models
Errors can arise from two primary sources: incorrect data in the training set and imperfections in the algorithmic process itself.
Comparison of human understanding with AI-generated content
It is a great mystery how this AI, grounded only in statistics, can produce anything that approximates human knowledge; no one truly understands, at a deep level, how it does so.
The mystery of how AI generates coherent output
Like the "hard problem" of human consciousness, the way AI generates coherent responses is perplexing, though clearly different from how human thought works.
The Murray Gell-Mann effect
The Murray Gell-Mann effect describes how people can recognize media inaccuracies in areas they are familiar with but trust other parts of the same media without question.
Trust in media reporting versus personal knowledge
In areas where we don’t have direct knowledge, we tend to trust the media more, even though we know from personal experience that journalists can get things wrong.
Application of the Murray Gell-Mann effect to large language models like ChatGPT
The same applies to AI models. We know AI can be wrong in areas where we have expertise, yet we may trust it in areas where we lack knowledge. I call this the "Murray Gell-Mann effect redux."
Introduction
As AI continues to advance, large language models like ChatGPT have become invaluable tools in various fields. However, their use raises critical questions about accuracy, trust, and the epistemological challenges they share with human-generated content. This essay explores the limitations of AI-generated output and the parallels between the credibility problems of human journalism and those of AI, as seen through the Murray Gell-Mann effect.
Erroneous ChatGPT Output, but Polished Prose
ChatGPT is notorious for producing content that, while eloquent, is often factually incorrect. Its well-constructed prose can mislead readers into accepting inaccuracies. For instance, when I queried ChatGPT about Oracle Forms, it provided incorrect details despite presenting the information in a highly polished format. This tendency to confabulate while maintaining a respectable veneer is a significant limitation of AI.
How Large Language Models Work
Large language models rely on vast datasets and statistical analysis rather than understanding. They generate output by predicting the next likely token (a word or word fragment) based on patterns identified in their training data. This process, sometimes described as "stochastic parroting," is vastly different from human cognition and remains opaque even to experts. The model does not comprehend the information it produces, and this creates significant limitations.
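To make the point concrete, here is a deliberately crude sketch in Python. It is not how ChatGPT works internally (real models use neural networks with billions of parameters, not lookup tables), but it shows how fluent-seeming text can emerge from nothing more than next-word statistics, with no notion of truth anywhere in the process. The tiny corpus and word choices are invented for illustration.

```python
import random
from collections import defaultdict

# A toy "training set" -- a real model is trained on billions of documents.
corpus = (
    "oracle forms is a rapid application development tool . "
    "oracle forms was widely used to build database applications . "
    "large language models generate text from statistical patterns ."
).split()

# Record which words follow which: a crude stand-in for the learned
# next-token distribution of a real language model.
successors = defaultdict(list)
for word, following in zip(corpus, corpus[1:]):
    successors[word].append(following)

def generate(start, length=12):
    """Repeatedly pick a statistically plausible next word.

    The result can read smoothly while being wrong or meaningless,
    because nothing here checks facts -- only word-adjacency counts.
    """
    word, output = start, [start]
    for _ in range(length):
        choices = successors.get(word)
        if not choices:
            break
        word = random.choice(choices)
        output.append(word)
    return " ".join(output)

print(generate("oracle"))
```

Run it a few times and you will get different, grammatical-looking fragments; whether they happen to be true is entirely accidental, which is the limitation described above, just on a vastly smaller scale.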
Inadequacies in Large-Language Model AI Output
Imperfections in the Data
The datasets used to train large language models mix high-quality material with unreliable and outdated sources, leading to inherent inaccuracies in the output. These models are only as good as the data they are trained on, and imperfections in that data are a major source of error.
Imperfections in the Curation Process
Human biases, misunderstandings, and corporate pressures all play a role in how data is curated for AI training. The curation process often lacks transparency and objectivity, further introducing errors into the system.
Imperfections in the Training Process
Training large language models is an imperfect science, shaped by the limitations of human understanding and the biases of those who design the training algorithms. These models are tuned to produce fluent, engaging responses rather than verified truth, which can lead to shallow, overly generalized answers.
Inadequacies in the LLM Paradigm
The current paradigm of AI models simulating intelligence by generating text based on statistical patterns may be inherently flawed. While these models can imitate understanding, they do not possess true intelligence or reasoning capabilities, making them inadequate for tasks requiring deep comprehension.
Inadequacies in Prompting
Because LLM output is generated by random sampling, it ranges from difficult to impossible to get consistent results from such an AI. Developing prompts that work is as much art as science, with a great deal of "try this; oops, that doesn't work; try that." It is never clear how a prompt will be interpreted, and the same prompt will be handled differently from one generation to the next. This is by design, but it means prompt creation cannot be called an engineering discipline.
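A minimal sketch of where that run-to-run variation comes from: production models turn their raw scores for each candidate token into probabilities (typically via a softmax scaled by a "temperature" setting) and then sample from that distribution. The token scores below are invented for illustration; a real model computes them from the prompt.

```python
import math
import random

# Hypothetical raw scores for the next token, given some prompt
# (invented for illustration -- a real model derives these from the prompt).
scores = {"yes": 2.0, "no": 1.5, "it depends": 1.0}

def sample_next_token(scores, temperature=0.8):
    """Softmax with temperature, then a random draw.

    Higher temperature flattens the distribution, so identical prompts
    diverge more from one run to the next.
    """
    weights = {tok: math.exp(s / temperature) for tok, s in scores.items()}
    total = sum(weights.values())
    pick = random.uniform(0.0, total)
    running = 0.0
    for tok, w in weights.items():
        running += w
        if pick <= running:
            return tok
    return tok  # floating-point edge case: fall back to the last token

# The "same prompt" sampled five times rarely gives the same sequence.
print([sample_next_token(scores) for _ in range(5)])
```

Unless the temperature is effectively zero (or a fixed random seed is used, which most chat interfaces do not expose), this sampling step alone guarantees that identical prompts can produce different answers, which is why prompt development remains trial and error.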
The Murray Gell-Mann Effect in Human Articles
The Murray Gell-Mann effect, which Michael Crichton coined as the "Gell-Mann Amnesia effect" after the physicist, highlights how we often distrust media reports on topics we know well but continue to trust the same media on unfamiliar topics. This effect demonstrates how authority bias and the polished presentation of information can lead us to ignore obvious inaccuracies in areas outside our expertise.
The Murray Gell-Mann Effect in Large Language Models
This effect can also apply to AI-generated content. Despite recognizing errors in areas where we are knowledgeable, we may still trust AI-generated responses in unfamiliar fields. The polished language and seemingly coherent output from models like ChatGPT often mask underlying inaccuracies. Much like human journalism, AI-generated content benefits from the Murray Gell-Mann effect in that its errors are often overlooked due to our lack of familiarity with the subject.
AI Biases and Limitations
Large language models reflect the biases present in their training data and the goals of their developers. Corporate pressures to maintain user-friendly interfaces and the influence of societal biases can skew the information presented by these models. This raises concerns about the ethical implications of AI, including issues of free speech, privacy, and censorship.
Summary
Both human-generated and AI-generated content suffer from issues related to accuracy and trust. The Murray Gell-Mann effect plays a key role in how we perceive and trust information, whether it comes from a journalist or an AI model. As we move forward, understanding the limitations of these systems and maintaining a critical eye on the content they produce will be essential.
Bibliography
Carter, J. (2011). Media credibility and the Murray Gell-Mann amnesia effect. First Things. https://www.firstthings.com/blogs/firstthoughts/2011/08/media-credibility-and-the-murray-gell-mann-amnesia-effect
This article explores how the Gell-Mann effect influences our trust in media sources, even when we recognize inaccuracies in their reporting.
Rebellion Research. (2024). What is Gell-Mann Amnesia? Rebellion Research. https://www.rebellionresearch.com/what-is-gell-mann-amnesia
This article discusses the origins of the Gell-Mann Amnesia effect and its implications for media consumption and credibility.
Tamkin, A., Brundage, M., Clark, J., & Ganguli, D. (2021). Understanding the capabilities, limitations, and societal impact of large language models. OpenAI. https://openai.com/index/understanding-the-capabilities-limitations-and-societal-impact-of-large-language-models
This paper explores the technical limitations of large language models and their societal impact, particularly focusing on biases and ethical concerns.