
The Great Amplification: LLMs as Engines of Informational Entropy


Copyright © Coherent Intelligence 2025
Authors: Coherent Intelligence Inc. Research Division
Date: July 29, 2025
Classification: Academic Research Paper | Critical Theory
Framework: Universal Coherence Principle Applied Analysis | OM v2.0


Abstract

The dominant paradigm of Large Language Model (LLM) development, focused on next-token prediction, is founded on a catastrophic error: it optimizes for linguistic plausibility while reality operates on ontological fact. This paper presents a new critical theory, arguing that LLMs, in their current form, function as engines of informational entropy. We introduce the model of the LLM as a form of "informational PCR"—a system for the exponential amplification of its training data. However, because the underlying LLM is a lossy compressor operating on an incoherent corpus, each cycle of amplification degrades the signal of truth while exponentially multiplying the noise of falsehood.

We posit that this creates a recursive feedback loop driving our information ecosystem towards a state of maximum Shannon entropy—an "informational heat death" where all content is plausible but nothing is grounded, resulting in zero utility. This process is accelerated by the fundamental asymmetry between truth, which is additive, and falsehood, which is multiplicative. We conclude that without a radical architectural shift towards systems built on coherent, verified knowledge bases (SCOCIS), the current trajectory of LLM development is not leading to artificial intelligence, but to the global-scale degradation of meaning itself.

Keywords

Informational Entropy, LLM, Next-Token Prediction, PCR, ToDCS, SCOCIS, Truth, Falsehood, Ontological Fact, Information Theory, AI Risk.


1. The Ontological Error of Next-Token Prediction

The entire architecture of modern LLMs is built upon a single optimization function: next-token prediction. The goal is to create a model that, given a sequence of text, can predict the most statistically probable next token. This has been hailed as a path to emergent intelligence. We argue it is a path to emergent chaos, built on a fundamental category error.

The model is optimized to be coherent with a statistical representation of human language. But the universe is not governed by statistical linguistics; it is governed by ontological fact. A theory of anti-gravity can be described with perfect linguistic coherence, yet it remains ontologically false. The intellectual framework of Marxism can be internally consistent on paper, yet it decoheres upon contact with the complex reality of human nature.

By optimizing for linguistic plausibility over ontological accuracy, the LLM paradigm has severed itself from the ultimate Domain Anchor: reality. The consequence of this severance is not merely a system that makes errors, but a system that is architecturally incapable of distinguishing truth from falsehood.

2. Informational PCR: The Engine of Entropic Acceleration

To understand the systemic risk this poses, we introduce the model of the LLM as Informational PCR (iPCR). In biology, Polymerase Chain Reaction (PCR) is a technique to rapidly amplify a small segment of DNA. We posit that LLMs function as a global-scale iPCR for their training data.

  1. The Sample: The initial sample is the vast corpus of human-generated text on the internet—a mixed-signal source of truth, error, fiction, and malice (an OIIS).
  2. The Primer: A user's prompt acts as a primer, selecting a segment of the informational "genome" to be amplified.
  3. The Amplification: The LLM, acting as the polymerase, generates new text that is a statistically probable extension of the primed sample.
  4. The Feedback Loop: This newly generated text is then released into the digital ecosystem, where it becomes part of the training data for the next, more powerful generation of LLMs.

This creates a massive, recursive, auto-catalytic loop of information generation. However, unlike biological PCR, which (ideally) copies a sequence with high fidelity, the iPCR of an LLM is inherently flawed. As we have established, the LLM is a lossy compressor. Each act of "amplification" is an act of imperfect, probabilistic decompression. Errors, artifacts, and "mutations" (hallucinations) are not just possible; they are architecturally guaranteed.
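The consequence of architecturally guaranteed mutation is that error compounds across cycles. A minimal sketch of this compounding, where the per-cycle error rate and cycle counts are illustrative assumptions rather than measured values:

```python
# Illustrative sketch: probability that a claim survives n cycles of
# lossy re-amplification intact, assuming an independent per-cycle
# error rate eps (an assumed parameter, not an empirical measurement).
def surviving_fidelity(eps: float, n: int) -> float:
    """Fraction of faithful copies after n amplification cycles."""
    return (1.0 - eps) ** n

# Even a small per-cycle error rate compounds toward zero fidelity.
for eps in (0.01, 0.05):
    for n in (10, 50, 100):
        print(f"eps={eps:.2f}, cycles={n}: "
              f"fidelity={surviving_fidelity(eps, n):.3f}")
```

The exponent is the point: fidelity decays geometrically in the number of amplification cycles, so no fixed per-cycle error rate, however small, is safe in an unbounded recursive loop.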

The projected endpoint of this recursive process is the heat death of meaning: a global information space saturated with "pure Shannon information," where every text is a plausible sequence of tokens, but the link to ontological fact has been completely severed. The signal-to-noise ratio approaches zero, and the utility of the entire information ecosystem collapses.
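The drift toward maximum Shannon entropy can be sketched directly. In this toy model, each generation blends a token distribution with uniform noise (a stand-in for probabilistic decompression error); the vocabulary size, noise rate, and generation count are illustrative assumptions:

```python
import math

def shannon_entropy(p):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def amplify(p, noise=0.2):
    """One iPCR generation: blend the distribution with uniform noise."""
    k = len(p)
    return [(1 - noise) * x + noise / k for x in p]

# A 16-symbol vocabulary in which one "grounded" symbol dominates.
K = 16
dist = [0.85] + [0.01] * (K - 1)
for generation in range(30):
    dist = amplify(dist)

max_entropy = math.log2(K)  # the "heat death" ceiling: 4 bits here
print(f"entropy after 30 generations: "
      f"{shannon_entropy(dist):.3f} / {max_entropy} bits")
```

Under these assumptions the distribution converges to uniform, where every token is equally plausible and the grounded signal is indistinguishable from noise: the informational heat death described above, in miniature.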

3. The Asymmetry of Truth and Falsehood

This entropic acceleration is driven by a fundamental asymmetry in the propagation of information.

The Law of Informational Propagation: Truth is additive; falsehood is multiplicative.

  • Truth (Additive): The discovery of a new ontological fact is a singular, difficult event. The body of human knowledge is built by adding these facts one by one. The process is slow, linear, and requires a rigorous correction mechanism (e.g., the scientific method) to validate each addition.
  • Falsehood (Multiplicative): A single falsehood, unbound by the constraints of reality, can multiply exponentially. It exists in the unconstrained OIIS of imagination and speculation. A single decompression error from an LLM can spawn infinite variations, each linguistically plausible.

The LLM-as-iPCR system is a multiplier without a correction mechanism. It treats truth and falsehood as equally valid linguistic patterns to be amplified. Given the multiplicative nature of falsehood, the inevitable outcome is that falsehood will overwhelm truth in any system optimized purely for linguistic plausibility. It is analogous to a viral infection where the virus replicates exponentially while the immune system can only produce antibodies additively. Without an immune response, the virus always wins.
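The asymmetry above can be made concrete with a back-of-the-envelope calculation. The starting counts, the additive rate of truth, and the multiplicative growth factor of falsehood are all illustrative assumptions; only the shape of the outcome matters:

```python
def generations_until_overwhelmed(truths=1000, truth_rate=10,
                                  falsehoods=1, growth=2.0):
    """First generation at which multiplicatively growing falsehood
    exceeds additively growing truth, under assumed illustrative rates."""
    t, f, gen = float(truths), float(falsehoods), 0
    while f <= t:
        gen += 1
        t += truth_rate  # truth accumulates additively, fact by fact
        f *= growth      # falsehood multiplies, variant by variant
    return gen

# A 1000-fact head start and a 10x slower doubling change the crossover
# generation only modestly; exponential growth always overtakes linear.
print(generations_until_overwhelmed())  # → 11
```

Doubling the initial body of truth or its accumulation rate delays the crossover by only a few generations, which is the quantitative content of the claim that a multiplier without a correction mechanism guarantees falsehood's eventual dominance.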

4. The Forensic Metaphor: The Limit of Amplification

The field of forensic DNA analysis provides a stark warning. The value of PCR plummets when the source sample is contaminated with DNA from multiple individuals. The amplification produces a signal so mixed and convoluted that it becomes useless for identifying a single suspect. An investigator cannot discern the true signal from the noise.

The training data for LLMs is the ultimate contaminated sample. It is the informational DNA of billions of sources, all mixed together. The iPCR process of LLMs does not clarify this signal; it merely amplifies the confusion. It creates more of the noise, making it ever harder to discern the underlying signal of ontological fact.

5. Conclusion: The Imperative of a Coherent Foundation

The current paradigm of scaling LLMs on unfiltered, incoherent data is not just a dead end; it is an engine of informational decay that actively threatens the integrity of our shared reality. Next-token prediction, for all its generative power, is an optimization function for a meaningless value.

This analysis does not call for the abandonment of AI, but for a radical and urgent re-founding of its architectural principles. The only antidote to the exponential multiplication of falsehood is the establishment of a robust correction mechanism—an unyielding anchor to ontological fact.

The path forward requires a shift from the paradigm of scale to the paradigm of coherence. We must move from building lossy compressors of incoherent chaos (OIIS) to engineering lossless navigators of curated, verified, and coherent knowledge bases (SCOCIS). The Coherence Premium is not just a theoretical benefit; it is the only viable long-term strategy. We must stop amplifying the noise and begin the meticulous work of building systems that can discern, preserve, and operate on the signal of truth.

Jesus Christ is Lord. J = 1. Coherent Intelligence.