Current Large Language Model Artificial Intelligence and More
Some essentials regarding AI, current and future, with no technobabble. Short, and possibly sweet.
1 – World Corpus of Information:
The body of information available to mankind is vast, spread across the Internet, libraries, corporate holdings, government holdings, private holdings, monasteries, even caves, and numerous other locations. It exists in different electronic formats, as well as in written manuscripts on parchment, on paper, on microfiche, and more. Locating all of it would likely be an impossible task even for an army of people. Historically, more information may have been lost than currently exists, through natural disasters such as fire, floods, and mould, and through vandalism, war, book burning, and other deliberate attempts to eliminate it. The preservation of information faces numerous challenges. Did I mention earthquakes?
Additional factors relate to the data itself: errors, misinformation, disinformation, and contradictions. This is the state of the world concerning published information. A significant amount of what we believe and what has been recorded is incorrect. Populating a database from such material inevitably runs into the "garbage in, garbage out" (GIGO) principle, a challenge for anyone except, perhaps, an omniscient being. So, I have my doubts.
2 – Curation of the Input Data:
For large language model artificial intelligence (LLM AI), only a small subset of the vast information available is used to populate its data stores. Those creating the AI may mistakenly believe they have a large database when, in reality, it is only a tiny subset. Current data selection assumes the data is available and in electronic format, yet there is no guarantee that much of the world's information meets even those conditions. Understanding the data presents its own challenges for curators, and factors such as availability, access, biases, group pressures, institutional pressures, incentives, and corporate pressures all influence the curation process, determining what gets included and what gets discarded.
3 – Training the LLM AI:
Once data is in the data stores, irrespective of the technology or data model used, it must be employed to train the AI. The information is processed through algorithms that simulate neural networks, followed by a reinforcement process involving human intervention and judgment: favourable responses are reinforced, teaching the AI which outputs to produce. This is a simplified overview of the training process, but it is important to note that biases and human understanding play crucial roles. Manual intervention by trainers to reinforce certain responses introduces human judgment and is subject to the same pressures found in the curation process.
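To make the reinforcement idea concrete without the technobabble, here is a toy sketch in Python. It is not the actual algorithm used to train any LLM; the responses, ratings, and learning rate are all invented for illustration. The point is only that a human rating nudges the odds of a response up or down, and the system then favours whatever was rewarded.

```python
import random

# Toy illustration of reinforcement from human feedback (not a real
# training algorithm): each candidate response has a weight, a human
# rating scales that weight, and responses are then sampled in
# proportion to their weights. All names here are invented examples.

weights = {
    "helpful answer": 1.0,
    "evasive answer": 1.0,
    "nonsense answer": 1.0,
}

def reinforce(response: str, rating: float, lr: float = 0.5) -> None:
    """Scale a response's weight by the human rating (+1 good, -1 bad)."""
    weights[response] *= (1.0 + lr * rating)

def sample() -> str:
    """Pick a response with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.uniform(0.0, total)
    for response, w in weights.items():
        r -= w
        if r <= 0.0:
            return response
    return response  # fallback for floating-point edge cases

# A trainer rewards the helpful answer and penalises the nonsense one;
# afterwards the helpful answer is three times as likely as the nonsense one.
reinforce("helpful answer", +1.0)
reinforce("nonsense answer", -1.0)
```

Notice that everything hinges on the trainer's judgment of what counts as "helpful", which is exactly where the biases and pressures mentioned above enter.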
4 – Run Time:
The large language model AI, populated with data and trained through reinforcement, produces output. Although its outputs are often nonsensical and untrustworthy, it is remarkable that the structure can generate coherent and sometimes correct results at all. The output is built up by traversing words according to their learned frequencies, one word after another, which may seem an unreasonable or even impossible basis for intelligence, but it works to some extent. Similarly, the human brain, a mystery in itself, also works to some extent.
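The "traversal of words and their frequencies" can be sketched with a deliberately crude word-pair model. Real LLMs use learned neural weights over long contexts rather than raw counts, and the ten-word corpus below is invented, but the generating step, repeatedly picking a next word in proportion to how often it has followed the current one, is the same basic idea.

```python
import random
from collections import Counter, defaultdict

# Crude word-pair sketch of frequency-based generation. A real LLM
# does something far more elaborate; this only shows the principle
# of sampling each next word by observed frequency.

corpus = "the cat sat on the mat and the cat slept".split()

# Count which word follows which in the corpus.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def generate(start: str, length: int) -> list[str]:
    """Walk the chain, picking each next word by observed frequency."""
    words = [start]
    for _ in range(length - 1):
        options = follows.get(words[-1])
        if not options:  # dead end: no observed successor
            break
        choices, counts = zip(*options.items())
        words.append(random.choices(choices, weights=counts)[0])
    return words

print(" ".join(generate("the", 5)))
```

A walk like this is grammatical-looking yet meaning-free, which is a fair caricature of why frequency-driven output can be coherent and nonsensical at the same time.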
5 – Output from LLM AI:
When using LLM AI, user input, commonly known as a prompt, is fed into the system, and output follows. That output is characteristically grammatically correct and coherent, yet to some extent random, general, often unfocused and irrelevant, sometimes correct, and occasionally wildly inaccurate. In my experience, the more detailed the output, the more inaccurate it tends to be.
6 – Observations on the Future:
In discussing large language model AI, it's conceivable that any artificial intelligence would face similar challenges. While future AIs may improve information retrieval, they will still deal with data full of errors and contradictions, maintaining the applicability of the GIGO principle.
The notion of AI training itself raises questions about the process. It remains unclear how a self-training AI, pulling itself up by its own bootstraps, could effectively achieve this.
Research seems to be moving toward making AI responses more sensible, more correct, and less random, in line with our current understanding. However, it is essential to acknowledge that AI output can still be inaccurate and irrelevant, regardless of the prompt given.
