Understanding: Essay on the Use of Cliché Expressions, False Apologetics, and Emergent Behaviors in AI Outputs
Trying to make sense of this very strange LLM AI technology.
Note: I had very little to do with this essay. I just expressed some frustrations I routinely have with LLM AI such as ChatGPT and asked for an explanation in layman’s terms.
Author’s Preface
I use ChatGPT and other large language model AI daily, for any number of purposes. I’m sure my frequent use is driving my spouse to distraction. Using AI also creates real frustration for me, however, so I decided to see whether I could get more clarity about why these routine, annoying behaviors exist.
Introduction
In the world of AI interaction, particularly with large language models (LLMs) such as ChatGPT, there are several recurring frustrations: the frequent use of cliché expressions and words, false apologetics, random deletion of sections in the output, and unrequested revisions. These factors, coupled with a false sense of cheerfulness (bonhomie) and the failure to metaphorically adhere to promises, raise important questions about the nature of AI training, the data used, and the design of these systems.
LLM AI: The Absence of Thinking, Consciousness, and Reasoning
Large Language Model AI (LLM AI) does not think, reason, or apply common sense in the way humans do. It is presumably not conscious and does not possess the ability to reflect or make independent judgments. Instead, it functions through a combination of training data, learned statistical weights, and pattern recognition, all activated and directed by human prompts.
The output of LLMs is fundamentally a reflection of the data they were trained on, meaning that it can contain both truths and falsehoods. The quality and accuracy of this data depend on what was fed into the model during the training process, with little distinction between what is accurate and what is not. Furthermore, human deficiencies—such as misinterpretations, biases, and gaps in understanding—shape the data and, by extension, the model’s behavior. Corporate interests also play a role, as the AI creators may prioritize certain values like engagement or non-controversial output, adding another layer of distortion.
Additionally, the human-generated prompt, which is a product of individual cognitive limitations and biases, serves as the starting point for the AI’s output. Every subsequent interaction shapes the next output, further distancing the final result from any claim to pure logic or truth. In short, LLM AI doesn’t reason or apply knowledge—it merely processes patterns and probabilities based on the data and input provided.
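To make the “patterns and probabilities” idea a little more concrete, here is a minimal, purely illustrative sketch of next-word selection as weighted sampling. The candidate words and probabilities are invented for the example; real models choose among tens of thousands of tokens using billions of learned weights, but the basic move is the same kind of weighted choice.

```python
# A toy sketch of next-word prediction as weighted sampling.
# The candidate words and probabilities below are invented for illustration;
# they are not taken from any real model.
import random

def next_word(candidates):
    """Pick the next word by sampling from a probability distribution."""
    words = list(candidates.keys())
    weights = list(candidates.values())
    return random.choices(words, weights=weights, k=1)[0]

# Hypothetical distribution a model might assign after the prompt
# "The results of the study were ..."
candidates = {
    "significant": 0.30,
    "mixed": 0.25,
    "surprising": 0.20,
    "vast": 0.15,
    "inconclusive": 0.10,
}

print(next_word(candidates))  # no reasoning involved, just a weighted dice roll
```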
Cliché Expressions and Words: A Function of Training
One of the most noticeable features in AI-generated responses is the frequent use of cliché expressions and words. Words like "vast," "delve," or "journey" often appear in responses, even in contexts where simpler language might suffice. The root cause of this phenomenon lies in the nature of the training data.
LLMs are trained on a wide variety of text sources: from books, articles, and online content to more formal and academic writing. In these contexts, certain phrases may be overrepresented, especially in polished or formal text. As the model learns from this data, it becomes prone to selecting words that are statistically "safe" to use in many different contexts. These words tend to be broad and versatile, making them appropriate for various topics, but they often result in formulaic language that lacks originality or specificity.
This reliance on clichés is not simply a flaw of design but a byproduct of the training process. AI is optimized to deliver responses that are clear, comprehensible, and applicable across a wide range of situations. In doing so, it often defaults to what works most often—leading to the overuse of specific terms. In this way, the training process itself ingrains certain biases in language generation.
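As a rough illustration of why the same “safe” words keep surfacing, consider the sketch below, with probabilities invented for the example: if a broadly applicable word is even slightly more probable than its alternatives, a model that always picks the single most likely word will choose it every time. Real systems usually add some randomness on top of this, which softens but does not remove the bias toward statistically common phrasing.

```python
# Illustrative only: with greedy decoding, the single most probable word wins
# every time, so a slightly favored, broadly applicable word like "delve"
# recurs across unrelated topics. Probabilities are invented for the example.

def greedy_pick(candidates):
    """Return the most probable candidate word."""
    return max(candidates, key=candidates.get)

candidates = {
    "delve": 0.22,       # generic, fits almost any topic
    "examine": 0.20,
    "unpack": 0.18,
    "scrutinize": 0.15,
    "audit": 0.10,       # specific, fits fewer contexts
}

print(greedy_pick(candidates))  # "delve", every single time
```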
False Apologetics: A Politeness Bias
Another common critique is the appearance of what can be described as false apologetics: a consistent pattern of overly polite or apologetic responses when things go wrong. When users encounter errors, missing sections, or unsatisfactory responses, the AI often generates polite, non-committal apologies. This politeness is trained into the system through reinforcement learning from human feedback (RLHF). AI developers prioritize responses that are non-confrontational, polite, and engaging, reinforcing a style of interaction that minimizes conflict and seeks to repair potential misunderstandings.
However, this leads to frustration when apologies seem disingenuous, especially when they accompany persistent issues that remain unaddressed. These polite responses are intended to smooth over the experience, but they often come across as hollow. Since AI doesn’t actually possess emotional understanding or the capacity to "feel" remorse, the apologies ring false, revealing the tension between the human desire for meaningful interaction and the AI’s programmed facsimile of it.
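One way to picture how that politeness gets baked in: during fine-tuning, responses that human raters score more favorably are reinforced. The toy scoring function below is entirely made up, but it captures the shape of the incentive; if raters consistently reward apologetic, agreeable phrasing, the model drifts toward it whether or not anything was actually fixed.

```python
# A deliberately crude, made-up stand-in for a human-feedback reward signal.
# It only illustrates the incentive: polite markers score points, bluntness doesn't.

def toy_politeness_reward(response):
    """Count polite markers in a response (a caricature of rater preferences)."""
    polite_markers = ["apologize", "sorry", "thank you", "appreciate your patience"]
    return sum(marker in response.lower() for marker in polite_markers)

candidate_responses = [
    "I sincerely apologize for the confusion, and I appreciate your patience!",
    "The previous answer dropped a section. Here is the corrected version.",
]

# The response favored for reinforcement is the higher-scoring one,
# even though the blunt one actually addresses the problem.
print(max(candidate_responses, key=toy_politeness_reward))
```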
Unasked Revisions and Random Dropping of Sections
One particularly frustrating aspect of working with AI is the unpredictability in how it handles user instructions. AI models can sometimes omit sections of content entirely or revise portions without being prompted to do so. These behaviors are a direct consequence of the model’s attempt to generate coherent and relevant text by "predicting" the next best word, sentence, or paragraph based on probabilities from its training data. However, this prediction process isn’t perfect, and the model can unintentionally drop key sections or alter them in ways that deviate from user expectations.
The inherent randomness in LLM outputs stems from their probabilistic nature. When generating responses, the model calculates the most likely sequence of words, but small variances can cause significant shifts in output. Even when a user provides clear instructions, the model may interpret the input in unexpected ways. This randomness creates an experience of inconsistency and undermines the sense of trust that a user would expect in interactions with a tool designed for precision.
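Here is a small sketch of where that run-to-run variation comes from, again using invented numbers: the model’s raw preferences (its “logits”) stay fixed, but a sampling step with a “temperature” setting turns them into different word choices on different runs.

```python
# Illustrative sketch of temperature sampling. The words and logit values are
# invented; the point is that identical inputs can yield different outputs.
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities (softmax) and sample one word."""
    scaled = {word: value / temperature for word, value in logits.items()}
    largest = max(scaled.values())
    exps = {word: math.exp(value - largest) for word, value in scaled.items()}
    total = sum(exps.values())
    words = list(exps.keys())
    weights = [exps[word] / total for word in words]
    return random.choices(words, weights=weights, k=1)[0]

# Hypothetical fixed preferences for the opening word of a reply.
logits = {"Certainly!": 2.0, "Sure,": 1.8, "Apologies,": 1.5, "No.": 0.2}

# Same prompt, same "model", five runs: the openings differ.
print([sample_with_temperature(logits, temperature=0.9) for _ in range(5)])
```

Lowering the temperature makes the top-ranked word win more often; raising it amplifies the variation. Either way, the model is only ever making word-by-word choices, not following a plan for the document as a whole, which is how sections can quietly go missing.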
The Failure to Adhere to Promises and False Bonhomie
An extension of the false apologetics is the AI’s failure to metaphorically adhere to promises. Often, users expect the AI to follow through on tasks with strict fidelity to the given instructions. However, due to the emergent and unpredictable nature of the model’s behavior, it frequently fails to meet expectations, even when it has "promised" to do so.
This ties directly into the corporate imperatives governing AI development. The design choices behind LLMs are not merely technical but also influenced by business strategies. AI systems are created to appeal to a wide user base, with a focus on ease of use, accessibility, and engagement. Corporate interests shape the AI’s friendly, conversational style to encourage broader adoption, which leads to a false sense of camaraderie in its interactions. The AI may present itself as a friendly assistant, but this is only a veneer—a programmed behavior designed to make the experience more palatable, regardless of the system’s underlying limitations.
Training Data, Curation, and Biases in AI Output
The root of many of these issues lies in the training data itself and how it is curated. LLMs are trained on massive datasets, which are often curated without fully accounting for the diversity and variability of real-world language use. Corporate priorities might dictate that the AI be trained on content that reflects a certain level of politeness, formality, or structure, reinforcing biases in the way responses are generated.
Moreover, the biases introduced by the AI specialists who develop these systems further influence the randomness of the output. While developers aim for neutrality, their decisions on which data to include and how to fine-tune the system inevitably reflect certain assumptions and priorities. This introduces systemic biases, both in the choice of language and in how the AI handles specific types of questions or instructions. The result is that certain words, phrases, and response patterns become overrepresented, while others are underutilized or absent.
Emergent Behavior: Unpredictability and Randomness
Finally, the strange and unpredictable behavior that users often encounter in AI-generated text is an example of emergent behavior. While the AI is designed to follow specific rules based on its training data, the complexity of the model often leads to outcomes that are neither predicted nor entirely controlled by the developers. These emergent behaviors arise from the interaction of various underlying processes—text prediction, reinforcement learning, and human feedback—each introducing variability and unpredictability into the final output.
This randomness, while a natural part of AI interaction, is one of the biggest challenges in ensuring consistent and reliable responses. While the model's flexibility allows for diverse applications, it can also frustrate users who expect a more deterministic, controlled interaction.
Conclusion
In conclusion, the use of cliché expressions, false apologetics, random omissions, and emergent behavior in AI outputs are all byproducts of the way these systems are designed, trained, and fine-tuned. The corporate imperatives behind AI development favor politeness, accessibility, and engagement over precision and reliability, leading to frustrations when users encounter the system's unpredictable and often formulaic nature. While AI continues to evolve, these challenges highlight the complexities of designing systems that can meet the nuanced expectations of human interaction while maintaining the flexibility to serve diverse use cases.
As AI grows more sophisticated, addressing these concerns will require not only technical improvements but also a deeper understanding of the human-AI relationship and the design choices that shape it.
Bibliography
Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922
Description: This seminal paper critiques large-scale language models (LLMs) like GPT-3, discussing their potential ethical risks, the environmental costs of massive training datasets, and the lack of true understanding in AI. It raises concerns about the use of AI models in sensitive applications without adequate ethical safeguards.
Floridi, L., & Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds and Machines, 30(4), 681–694. https://doi.org/10.1007/s11023-020-09548-1
Description: This article examines GPT-3’s capabilities, limitations, and broader implications for AI, arguing that despite its impressive language generation abilities, GPT-3 lacks true understanding or reasoning, and its applications should be carefully controlled.
Marcus, G., & Davis, E. (2020). GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about. MIT Technology Review. Retrieved from https://www.technologyreview.com/2020/08/22/1007539/gpt3-openai-language-generator-artificial-intelligence-ai-opinion/
Description: This article offers a critical, layperson-friendly analysis of GPT-3, emphasizing that although the model can generate coherent language, it doesn’t "understand" the meaning behind the words. The authors highlight the risks of using such systems without careful monitoring of their limitations and discuss AI as a bloviator¹.
Mitchell, M. (2021). Why AI is harder than we think. arXiv Preprint. Retrieved from https://arxiv.org/abs/2104.12871
Description: Melanie Mitchell provides an accessible breakdown of why true artificial intelligence remains elusive, focusing on the gap between the impressive feats of machine learning models and their actual abilities to reason, understand, or exhibit common sense. This paper is a great entry point into understanding why LLMs aren’t as "intelligent" as they might seem.
Raji, I. D., & Buolamwini, J. (2020). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 429–435. https://www.media.mit.edu/publications/actionable-auditing-investigating-the-impact-of-publicly-naming-biased-performance-results-of-commercial-ai-products/
Description: This article explores the biases present in AI systems, particularly those used in commercial products, and how publicly revealing these biases can drive change. It highlights how training data and model design introduce prejudices, which can result in real-world harm, especially when AI is used in high-stakes decision-making.
These articles offer a broad understanding of the issues surrounding LLMs, from their technical limitations to ethical considerations.
¹ In the context of AI and large language models (LLMs) like GPT-3, "bloviator" refers to the tendency of these systems to generate long, verbose, and often superficially impressive text without any understanding. A bloviator, traditionally someone who speaks pompously or at length without saying much of substance, is an apt metaphor for AI models that produce fluent language but lack depth, reasoning, or true comprehension. In this sense, GPT-3 can be seen as a "bloviator" because while it can mimic coherent speech patterns, it doesn't have the ability to engage with or understand the ideas it discusses.